Scale RL post-training from a single node to thousands of GPUs with Ray, the engine behind veRL, SkyRL, and more.
Run the full post-training lifecycle on Ray, the world’s most widely adopted AI compute engine

Coordinate multiple frameworks running across CPU and GPU hardware with simple Python APIs.
veRL, SkyRL, OpenRLHF, and other leading RL libraries are built on Ray, with no rewiring required.
Ray works seamlessly with vLLM, SGLang, and Megatron to keep rollout generation fast and GPUs utilized.
“We built custom training infrastructure leveraging PyTorch and Ray to power asynchronous reinforcement learning at scale.”

Token generation efficiency of the trained model compared to frontier models
Ray on Anyscale abstracts RL infrastructure complexity so you can focus on development
Run veRL, SkyRL, OpenRLHF, NeMo-RL, and other leading RL libraries across any cluster size.
Native support for vLLM and SGLang — the inference engines that power modern RL rollout generation.
Optimize placement of training and inference workers across complex hardware topologies (in preview).
Coordinate multi-step environments, tool use, and reward computation across complex agent trajectories.
Eliminate fragmented tooling with data prep, fine-tuning, RL, and online inference on a single runtime.
Profile CPU/GPU performance in distributed data, training, or serving runs with persistent logs and dashboards.
Deploy advanced AI applications with Ray on Anyscale without growing operational complexity.
Transform complex data modalities such as video, images, voice, text, and more into AI-ready datasets
Scale existing training code from one machine to thousands of GPUs with intuitive scaling configs
Serve one or many models and Python applications working together as a single API endpoint