Scalable reinforcement learning in production

Tackle reinforcement learning


Finding success with reinforcement learning (RL) is not easy. RL tooling hasn’t historically kept pace with the demands and constraints of those wanting to use it. Even with ready-made frameworks, failure is common when crossing over into production due to their rigidity, lack of speed, limited ecosystems, and operational overhead.


Anyscale helps you move past the limitations of existing reinforcement learning tools with Ray and RLlib, an open source, easy-to-use, distributed computing library for Python that:

  • Handles complex, heterogeneous applications
  • Includes over 25 state-of-the-art algorithms that can be used with TensorFlow and PyTorch
  • Covers subcategories including model-based, model-free, and offline RL
  • Supports multi-agent mode for almost all of its algorithms

RLlib is the best way to do reinforcement learning


An expansive ecosystem

Existing RL solutions force developers to switch frameworks or disjointly glue RL systems with other tools for tuning, serving, and monitoring. Avoid that with the Ray ecosystem. Find the perfect set of hyperparameters using Ray Tune or serve your trained model in a massively parallel way with Ray Serve.


Production readiness

Iterate quickly, then go to production or scale to a large cluster without rewriting your code.


Environments to meet your needs

RLlib works with several types of environments, including OpenAI Gym, user-defined, multi-agent, and batched environments.
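As a rough illustration of what a user-defined environment looks like, here is a minimal sketch that follows the classic Gym-style reset/step convention in plain Python. The `CorridorEnv` class, its reward values, and the `run_episode` helper are hypothetical and exist only for this example. A real environment would typically subclass `gym.Env` and declare its observation and action spaces.

```python
class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to reach the goal."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = left, 1 = right); return (obs, reward, done, info)."""
        if action not in (0, 1):
            raise ValueError("action must be 0 or 1")
        move = 1 if action == 1 else -1
        self.position = max(0, self.position + move)  # cannot walk past the left wall
        self.steps += 1
        reached_goal = self.position >= self.length
        done = reached_goal or self.steps >= self.max_steps
        reward = 1.0 if reached_goal else -0.1  # small penalty for every extra step
        return self.position, reward, done, {}


def run_episode(env, policy):
    """Roll out one episode with the given policy and return the total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total
```

Calling `run_episode(CorridorEnv(), lambda obs: 1)` rolls out a policy that always moves right. An RL algorithm's job is to discover a policy like that from the rewards alone.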


Offline RL and imitation learning/behavior cloning

RLlib comes with several offline RL algorithms (e.g., CQL, MARWIL, and DQfD), allowing you to either purely behavior-clone your existing system or learn how to improve on it.
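To make the idea of behavior cloning concrete, here is a deliberately simplified, tabular sketch in plain Python. It is not how RLlib implements its offline algorithms. The `behavior_clone` function and the logged data are hypothetical, and the "policy" is just a lookup table that copies the action the existing system chose most often in each state.

```python
from collections import Counter, defaultdict


def behavior_clone(logged_steps):
    """Fit a tabular policy: for each state, pick the action logged most often."""
    counts = defaultdict(Counter)
    for state, action in logged_steps:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical logs from an existing controller: (state, action) pairs.
logs = [
    ("low_stock", "reorder"),
    ("low_stock", "reorder"),
    ("low_stock", "wait"),
    ("high_stock", "wait"),
    ("high_stock", "wait"),
]

policy = behavior_clone(logs)
```

Pure cloning like this can only reproduce the logged behavior. Offline RL methods go further by also using the recorded rewards to look for improvements over it.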


Speed and efficiency

Experience fast training and policy evaluation with lower overhead than most other RL libraries.


Distributed RL, simplified

RLlib algorithm implementations (such as our “APPO” or “APEX”) allow you to run workloads on hundreds of CPUs, GPUs, or nodes in parallel to speed up learning.


External simulators

RLlib supports an external environment API and comes with a pluggable, off-the-shelf client/server setup to run hundreds of independent simulators on the “outside,” connecting to a central RLlib Policy-Server that learns and serves actions.
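The control flow of such a setup can be sketched in a few lines of plain Python. This is an in-process stand-in under stated assumptions, not RLlib's actual client or server. `ToyPolicyServer`, its method names, and the simulator dynamics are all hypothetical, and the "policy" simply samples random actions. The point is that the simulator owns the loop: it asks the server for actions and reports rewards back.

```python
import random
import uuid


class ToyPolicyServer:
    """Hypothetical in-process stand-in for a central policy server."""

    def __init__(self, actions, seed=0):
        self.actions = list(actions)
        self.rng = random.Random(seed)
        self.episodes = {}  # episode_id -> list of logged rewards

    def start_episode(self):
        """Register a new episode and return its id."""
        episode_id = uuid.uuid4().hex
        self.episodes[episode_id] = []
        return episode_id

    def get_action(self, episode_id, observation):
        """Serve an action; a real server would query the learned policy here."""
        if episode_id not in self.episodes:
            raise KeyError("unknown episode")
        return self.rng.choice(self.actions)

    def log_reward(self, episode_id, reward):
        """Record a reward reported by the simulator."""
        self.episodes[episode_id].append(reward)

    def end_episode(self, episode_id):
        """Close the episode and return its total reward."""
        return sum(self.episodes.pop(episode_id))


def run_external_simulator(server, num_steps=10):
    """The simulator drives the loop: request an action, step, report the reward."""
    episode_id = server.start_episode()
    state = 0
    for _ in range(num_steps):
        action = server.get_action(episode_id, state)
        state += action  # hypothetical simulator dynamics
        server.log_reward(episode_id, float(action))
    return server.end_episode(episode_id), state
```

In a real deployment the calls would travel over the network, and many simulators would talk to the same server at once, each with its own episode id.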


Unmatched algorithm selection

With more than double the number of algorithms of any other library, RLlib lets your team quickly iterate on and test SOTA algorithms, so you can get to the best options faster without having to build and maintain your own.

Iterate and move to production fast with RLlib and Anyscale

Leading organizations are already using reinforcement learning to build next-gen recommendation systems, create better gaming experiences, optimize industrial environments, and more, thanks to RLlib and Anyscale.


Supply chains are critical. Kinks or breaks in the chain could spell disaster for manufacturers and buyers alike. This is why Dow has decided to double down on its digitization efforts, which include the increased use of machine learning, advanced modeling techniques, robotics, and more. One such project in the ...


RL can help learn agent behaviors and policies and then calibrate the agent composition with real-world data, all at the speed and scale of the business.

Already using open source Ray?

Move your existing workloads to Anyscale with no code changes. Experience the magic of infinite scale at your fingertips.