The LLM Infrastructure Trusted by Cohere, OpenAI, and Uber

Ray is the most popular open source framework for scaling and productionizing AI workloads. From Generative AI and LLMs to computer vision, Ray powers the world’s most ambitious AI workloads.

The Leader in Performance


Higher throughput*


Lower cost*


Time to scale to 1000 nodes


World record for shuffling 100TB*

Trusted by the world’s leading AI teams

From ChatGPT to Spotify recommendations to Uber ETA predictions, see how innovators are succeeding with Ray and Anyscale.


"At OpenAI, we are tackling some of the world’s most complex and demanding computational problems. Ray powers our solutions to the thorniest of these problems and allows us to iterate at scale much faster than we could before. As an example, we use Ray to train our largest models, including ChatGPT."

Greg Brockman

Co-founder and President


"We chose Ray as the unified compute backend for our machine learning and deep learning platform because it has allowed us to significantly improve performance and fault tolerance, while also reducing the complexity of our technology stack. Ray has brought significant value to our business, and has enabled us to rapidly pretrain, fine-tune and evaluate our LLMs."

Min Cai

Distinguished Engineer

AWS

"One of the biggest problems that Ray helped us resolve is improving scalability, latency, and cost-efficiency of very large workloads. We were able to improve the scalability by an order of magnitude, reduce the latency by over 90%, and improve the cost efficiency by over 90%. It was financially infeasible for us to approach that problem with any other distributed compute framework that we tried."

Patrick Ames

Principal Engineer

Cohere

"Ray has profoundly simplified the way we write scalable distributed programs for Cohere’s LLM pipelines. Its intuitive design allows us to manage complex workloads and train our models across thousands of TPUs with little to no overhead."

Siddhartha Kamalakara

Machine Learning Engineer

Ant Group

"Ant Group has deployed Ray Serve on 240,000 cores for model serving, which has increased by about 3.5 times compared to last year. The peak throughput during Double 11, the largest online shopping day in the world, was 1.37 million transactions per second. Ray allowed us to scale elastically to handle this load and to deploy ensembles of models in a fault tolerant manner."

Tengwei Cai

Staff Engineer

Samsara

"We use Ray to run a number of AI workloads at Samsara. Since implementing the platform, we’ve been able to scale the training of our deep learning models to hundreds of millions of inputs, and accelerate deployment while cutting inference costs by 50% - we even use Ray to drive model evaluation on our IoT devices! Ray's performance, resource efficiency, and flexibility made it a great choice for supporting our evolving AI requirements."

Evan Welbourne

Head of AI and Data

Anyscale is the AI Application Platform for developing, running, and scaling AI.


Anyscale Endpoints

Want to add an open source LLM to your app? Get started with Llama-2 in minutes.

Need to train your own Large Language Model? Run inference at scale at a lower price point.

Want to personalize an LLM to your business, securely? Fine-tune open source models.

Need to run with more privacy, control, and customizability? Deploy Endpoints in your cloud.


Anyscale Platform

Slash costs
Train and deploy your models at a fraction of the cost. Leverage the most recent advances from the open source community instantaneously. Transition to cheaper spot instances and run across multiple regions and multiple clouds. Autoscale rapidly to handle bursty workloads while minimizing steady-state costs. Scale CPU compute and GPU compute elastically and independently.

Extend your platform capabilities
Leverage the same LLM and generative AI capabilities previously only available to leaders like OpenAI and Uber, all in your cloud account. Develop interactively at scale. Bring models to production in hours instead of weeks. Enable multi-GPU and multi-node models with a single click.

Ready to try Anyscale?

Access Anyscale today to see how companies using Anyscale and Ray benefit from rapid time-to-market and faster iterations across the entire AI lifecycle.