Scalable reinforcement learning in production

Tackle reinforcement learning

Challenges

Finding success with reinforcement learning (RL) is not easy. RL tooling hasn’t historically kept pace with the demands and constraints of those wanting to use it. Even with ready-made frameworks, failure is common when crossing over into production due to their rigidity, lack of speed, limited ecosystems, and operational overhead.

Solutions

Anyscale helps you go beyond existing reinforcement learning limitations with Ray and RLlib, an open source, easy-to-use, distributed RL library for Python that:

  • Handles complex, heterogeneous applications
  • Includes over 25 state-of-the-art algorithms that run on both TensorFlow and PyTorch (see the sketch below)
  • Covers subcategories including model-based, model-free, and offline RL
  • Supports multi-agent learning in almost all of its algorithms
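
To make this concrete, here is a minimal sketch (assuming the Ray 2.x AlgorithmConfig API; the environment and iteration count are illustrative) of picking one of RLlib's built-in algorithms, selecting a deep learning framework, and training:

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Minimal sketch (Ray 2.x API): pick an algorithm, a framework, and train.
config = (
    PPOConfig()
    .environment("CartPole-v1")  # any registered Gym environment
    .framework("torch")          # or "tf2" -- the same code runs on either framework
)

algo = config.build()
for _ in range(5):
    result = algo.train()        # one training iteration
    print(result["episode_reward_mean"])
```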

RLlib is the best way to do reinforcement learning

An expansive ecosystem

Existing RL solutions force developers to switch frameworks or awkwardly glue RL systems to separate tools for tuning, serving, and monitoring. Avoid that with the Ray ecosystem: find the perfect set of hyperparameters using Ray Tune, or serve your trained model in a massively parallel way with Ray Serve.
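
As one illustration, a sketch of sweeping an RLlib hyperparameter with Ray Tune via the classic `tune.run` entry point (the environment, learning-rate values, and stopping criterion are placeholders):

```python
from ray import tune

# Sketch: let Ray Tune sweep an RLlib hyperparameter.
# "PPO" is RLlib's registered trainable name; values below are illustrative.
tune.run(
    "PPO",
    config={
        "env": "CartPole-v1",
        "lr": tune.grid_search([1e-4, 5e-4, 1e-3]),  # try three learning rates
    },
    stop={"training_iteration": 20},  # stop each trial after 20 iterations
)
```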

Production readiness

Iterate quickly without needing to rewrite your code to move to production or to scale out to a large cluster.

Environments to meet your needs

RLlib works with several types of environments, including OpenAI Gym, user-defined, multi-agent, and batched environments.
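
For example, a user-defined environment is just a Gym-style class passed to the algorithm config. A minimal sketch (the corridor task and its parameters are illustrative, written against the pre-gymnasium `gym` step API):

```python
import gym
import numpy as np
from gym import spaces
from ray.rllib.algorithms.ppo import PPOConfig

class SimpleCorridor(gym.Env):
    """Walk right along a corridor to reach the goal (illustrative task)."""

    def __init__(self, config=None):
        self.end_pos = (config or {}).get("corridor_length", 5)
        self.cur_pos = 0
        self.action_space = spaces.Discrete(2)  # 0 = left, 1 = right
        self.observation_space = spaces.Box(
            0.0, self.end_pos, shape=(1,), dtype=np.float32
        )

    def reset(self):
        self.cur_pos = 0
        return np.array([self.cur_pos], dtype=np.float32)

    def step(self, action):
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1
        elif action == 1:
            self.cur_pos += 1
        done = self.cur_pos >= self.end_pos
        reward = 1.0 if done else -0.1
        return np.array([self.cur_pos], dtype=np.float32), reward, done, {}

# Pass the class (not an instance) to RLlib; env_config is forwarded to __init__.
config = PPOConfig().environment(SimpleCorridor, env_config={"corridor_length": 5})
algo = config.build()
```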

Offline RL and imitation learning / behavior cloning

RLlib comes with several offline RL algorithms (e.g., CQL, MARWIL, and DQfD), allowing you to either purely behavior-clone your existing system or learn how to further improve it.
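
A hedged sketch of what this looks like with MARWIL on Ray 2.x (the data path is a placeholder for JSON experiences you have previously logged, e.g. with RLlib's output writers; `beta=0.0` reduces MARWIL to pure behavior cloning):

```python
from ray.rllib.algorithms.marwil import MARWILConfig

config = (
    MARWILConfig()
    .environment("CartPole-v1")                # env spaces must match the logged data
    .offline_data(input_="/tmp/cartpole-out")  # placeholder: dir of logged JSON episodes
    .training(beta=0.0)  # 0.0 = pure behavior cloning; > 0.0 weights by advantages
)

algo = config.build()
for _ in range(10):
    algo.train()
```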

Speed and efficiency

Experience fast training and policy evaluation with lower overhead than most other RL libraries.

Distributed RL, simplified

RLlib algorithm implementations (such as our “APPO” or “APEX”) allow you to run workloads on hundreds of CPUs, GPUs, or nodes in parallel to speed up learning.
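
A minimal sketch of what that scaling looks like with APPO on Ray 2.x (the worker and GPU counts are illustrative; the same config runs on a laptop or attaches to a running cluster):

```python
import ray
from ray.rllib.algorithms.appo import APPOConfig

ray.init()  # or ray.init(address="auto") to attach to an existing cluster

config = (
    APPOConfig()
    .environment("CartPole-v1")
    .rollouts(num_rollout_workers=128)  # illustrative: sampling fans out across the cluster
    .resources(num_gpus=1)              # GPU(s) for the central learner
)

algo = config.build()
result = algo.train()
```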

External simulators

RLlib supports an external environment API and comes with a pluggable, off-the-shelf client/server setup to run hundreds of independent simulators on the “outside,” connecting to a central RLlib policy server that learns and serves actions.
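
On the simulator side, this is roughly what a client looks like (a sketch: `my_simulator` is a hypothetical external simulator, and the address assumes a PolicyServerInput-based RLlib server listening on port 9900):

```python
from ray.rllib.env.policy_client import PolicyClient

# Connect to a central RLlib policy server (address/port are assumptions).
client = PolicyClient("http://localhost:9900", inference_mode="remote")

obs = my_simulator.reset()  # my_simulator: your external, hypothetical simulator
episode_id = client.start_episode(training_enabled=True)

done = False
while not done:
    action = client.get_action(episode_id, obs)        # ask the server for an action
    obs, reward, done, info = my_simulator.step(action)
    client.log_returns(episode_id, reward)             # stream rewards back for learning
client.end_episode(episode_id, obs)
```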

Unmatched algorithm selection

With more than double the number of algorithms of any other library, RLlib lets teams quickly iterate on and test SOTA algorithms, so you can get to the best option faster without having to build and maintain your own.
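
As a sketch of that iteration speed, the same environment can be run through several built-in algorithms just by changing a string (the names below are standard RLlib registry keys; metric access assumes the Ray 2.x result dict):

```python
from ray.tune.registry import get_trainable_cls

# Compare several built-in algorithms on one environment by name.
for name in ["PPO", "IMPALA", "DQN"]:
    config = get_trainable_cls(name).get_default_config().environment("CartPole-v1")
    algo = config.build()
    result = algo.train()
    print(name, result["episode_reward_mean"])
    algo.stop()
```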

Iterate and move to production fast with RLlib and Anyscale

Leading organizations are already using reinforcement learning to build next-gen recommendation systems, create better gaming experiences, optimize industrial environments, and more, thanks to RLlib and Anyscale.

Dow

Supply chains are critical. Kinks or breaks in the chain could spell disaster for manufacturers and buyers alike. This is why Dow has decided to double down on its digitization efforts, which include the increased use of machine learning, advanced modeling techniques, robotics, and more. One such project in the ...

JPMorgan

RL can help learn agent behaviors and policies and then calibrate the agent composition with real-world data, all at the speed and scale of the business.

Already using open source Ray?

Get started by moving your existing workloads to Anyscale with no code changes. Experience the magic of infinite scale at your fingertips.