Julian Forero

Marwan Sarieddine

Ray has become the compute framework that unifies the AI ecosystem, and it’s being used by industry leaders to build and run AI at scale. That’s why we’re excited to launch the Ray Foundations Certification. This credential is designed to validate your ability to work with Ray’s core architecture, primitives, and libraries.

Ray distributed compute runtime certification and skills badge

Announcing the Ray Foundations Certification

Richard Liaw

Alexey Kudinkin

Justin Hsu

Matthew Owen

Balaji Veeramani

Praveen Gorthy

Multimodal AI workloads are pushing the boundaries of today’s infrastructure, demanding systems that can handle mixed data types (text, images, audio, video), high-throughput pipelines, and CPU and GPU scheduling at scale. Ray Data is the fastest and most efficient way to process these workloads.


Benchmarking Multimodal AI Workloads on Ray Data

Robert Nishihara

Philipp Moritz

Tinker + Ray Text to SQL model

Fine-tuning a Text-to-SQL Model with Tinker and Ray

Seiji Eicher

Justin Ji

Gene Su

Kourosh Hakhamaneshi

Ray Serve is a scalable model serving library that optimizes traffic routing, scaling, and caching. For LLMs and MoE models, it leverages prefix caching and cache-aware routing to reduce latency, avoid redundant computation, and improve efficiency in multi-turn conversations and distributed inference.
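The idea behind cache-aware routing can be illustrated with a small, self-contained sketch. This is not Ray Serve's API: the `PrefixAwareRouter` class and its methods are hypothetical names chosen for illustration, and the matching rule (longest shared prompt prefix, ties broken by load) is an assumed simplification of what a production router might do.

```python
from os.path import commonprefix


class PrefixAwareRouter:
    """Hypothetical router: send each prompt to the replica most likely to have its prefix cached."""

    def __init__(self, replica_ids):
        # Prompts previously routed to each replica (a stand-in for its prefix cache).
        self.history = {rid: [] for rid in replica_ids}

    def _best_match(self, rid, prompt):
        # Length of the longest prefix this prompt shares with anything the replica has seen.
        return max(
            (len(commonprefix([prompt, seen])) for seen in self.history[rid]),
            default=0,
        )

    def route(self, prompt):
        # Prefer the longest shared prefix; break ties by the fewest requests handled so far.
        rid = max(
            self.history,
            key=lambda r: (self._best_match(r, prompt), -len(self.history[r])),
        )
        self.history[rid].append(prompt)
        return rid


if __name__ == "__main__":
    router = PrefixAwareRouter(["replica-a", "replica-b"])
    print(router.route("System: you are helpful. User: hi"))
    print(router.route("Totally different prompt"))
    print(router.route("System: you are helpful. User: hi again"))
```

In a multi-turn conversation, each follow-up shares a long prefix with earlier turns, so a rule like this keeps the conversation on the replica that already holds the relevant cache entries.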

Ray Serve: Reduce LLM Inference Latency by 60% with Custom Request Routing

Nikita Vemuri

Mengjin Yan

Alan Guo

A more scalable, cost-efficient way to capture and analyze Ray task events via managed dashboards. 

Ray Task Monitoring & Persistent Dashboards in Anyscale

Ray Task Monitoring at Scale: Announcing Persistence for +10k Tasks on Anyscale

Sumanth Hegde

Tyler Griggs

Eric Tang

Massively Parallel Agentic Simulations with Ray

The Anyscale Team

Build and run even the most complex data and AI workloads with Anyscale, from the creators of Ray, now available via Azure Marketplace.

Anyscale Microsoft Azure Marketplace

Anyscale on Microsoft Azure Marketplace

Anyscale now available on Microsoft Azure Marketplace

The Google GKE Team

Deploy DeepSeek‑R1 with vLLM and Ray Serve on Kubernetes. Self-deploy or get a managed experience with Ray on Anyscale. 


DeepSeek on Kubernetes with vLLM and Ray Serve on Anyscale

Deploy DeepSeek‑R1 with vLLM and Ray Serve on Kubernetes

The KubeRay team

Introducing KubeRay v1.4, featuring KubeRay API Server V2, Ray Autoscaler V2, Service Level Indicator (SLI) metrics, and more.


Introducing KubeRay v1.4

Explore a technical comparison of leading Reinforcement Learning (RL) libraries for LLMs from Ray. This guide analyzes frameworks like TRL, Verl, and RAGEN to help developers choose the best tools for RLHF, reasoning, and agentic AI.

The architecture of a Reinforcement Learning (RL) library is split into two primary components: Generation and Training. During the generation phase, an LLM Engine performs multi-turn rollouts within an environment to produce data and reward signals. This output is then fed into the training phase to update the model's parameters. This process forms a feedback loop, where the progressively improved model generates the next iteration of data for continuous refinement.
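The generation and training feedback loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any of the libraries named here are implemented: the "model" is a single logit, the "environment" rewards action 1, and the function names (`generate`, `train`, `run_loop`) are hypothetical. The update is a basic REINFORCE step with a mean-reward baseline.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def generate(theta, num_rollouts, rng):
    """Generation phase: sample actions from the current policy and collect rewards."""
    p = sigmoid(theta)
    rollouts = []
    for _ in range(num_rollouts):
        action = 1 if rng.random() < p else 0
        reward = 1.0 if action == 1 else 0.0  # toy environment: action 1 is "correct"
        rollouts.append((action, reward))
    return rollouts


def train(theta, rollouts, lr=0.5):
    """Training phase: REINFORCE update on the logit, using mean reward as a baseline."""
    p = sigmoid(theta)
    baseline = sum(reward for _, reward in rollouts) / len(rollouts)
    # For a Bernoulli policy with logit theta, d/dtheta log pi(action) = action - p.
    grad = sum((reward - baseline) * (action - p) for action, reward in rollouts)
    return theta + lr * grad / len(rollouts)


def run_loop(iterations, seed=0, num_rollouts=64):
    """Alternate generation and training; return the policy's P(action=1) after each step."""
    rng = random.Random(seed)
    theta = 0.0
    history = [sigmoid(theta)]
    for _ in range(iterations):
        rollouts = generate(theta, num_rollouts, rng)  # the current model produces data
        theta = train(theta, rollouts)                 # the data updates the model
        history.append(sigmoid(theta))
    return history


if __name__ == "__main__":
    for step, p in enumerate(run_loop(5)):
        print(f"iteration {step}: P(action=1) = {p:.3f}")
```

Each pass through the loop uses the freshly updated policy to generate the next batch of rollouts, which is the feedback structure the paragraph describes. In a real RL library the two phases typically run on separate engines and hardware.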

Open Source RL Libraries for LLMs

Weixin Astra Team

See how the Tencent Weixin team used Ray and Kubernetes to build ultra-large-scale distributed systems.

Figure 24: TFCC inference runtime

How Tencent’s Weixin AI Team Deployed Ray

Large-Scale Deployment of Ray in Tencent’s Weixin AI Infrastructure

Unstructured data growth and GPU-centric data processing expose the limits of big data compute frameworks such as Spark. Catch up on the history of large-scale data processing and see the trends driving the evolution of Ray.

AI is Evolving – Can Your Compute Framework Keep Up?

Your Data and AI Frameworks Evolved – What About Your Distributed Compute Framework?

Components of an Open Source AI Compute Tech Stack

An Open Source Stack for AI Compute: Kubernetes + Ray + PyTorch + vLLM

Google + Anyscale

Google Cloud Integrates Anyscale's RayTurbo with GKE

Simplifying AI Development at Scale: Google Cloud Integrates Anyscale's RayTurbo with GKE

At Ray Summit 2024, we brought together AI experts and practitioners to dive into the future of distributed AI and scalable machine learning.

Ray Summit 2024: Breaking Through the AI Complexity Wall

Join us at Ray Summit 2025 in San Francisco, Nov 3-5.

Powered by Ray, Anyscale empowers AI builders to run and scale all ML and AI workloads on any cloud and on-prem.

Anyscale

Blogs