Anyscale Announces Partnership with Google Cloud to Accelerate AI Production at Scale

Collaboration gives AI/ML teams a distributed operating system for building and running AI workloads at scale.

Las Vegas, NV – April 9, 2025 — Anyscale today announced a partnership with Google Cloud to deliver a deeply integrated and optimized Ray experience on Google Kubernetes Engine (GKE). The collaboration is designed to meet the growing demand from developers to build and run AI/ML applications at scale. 

As part of this relationship, Google Cloud customers will have access to Anyscale RayTurbo, the optimized Ray runtime that delivers faster task execution, higher throughput, and improved GPU/TPU utilization. RayTurbo will be supported natively on GKE, making it easier for AI/ML engineers to build, run, and scale AI workloads using Ray with Anyscale—underpinned by the reliability and cluster management of GKE.

“Our mission is to make building and running AI applications as easy as writing Python,” said Keerti Melkote, CEO at Anyscale. “By partnering with Google Cloud, we’re unlocking a faster, simpler path for teams to scale from local development to large-scale production—with the tools and infrastructure they already use.”

“We’re seeing strong momentum around Ray as the computing engine for AI,” said Gabe Monroy, VP/GM Cloud Runtimes at Google Cloud. “This deep integration with Anyscale RayTurbo, combined with GKE’s advanced cluster orchestration and autoscaling, enables our customers to build and run AI applications with the scale, performance, and flexibility they need.”

Ray: The AI Compute Engine 

AI/ML engineers today want to build and run AI workloads at scale faster—but scale often comes at the cost of developer velocity and flexibility, with engineering teams spending time managing distributed infrastructure. AI workloads have highly dynamic and variable compute patterns, leading to over-provisioned GPUs, brittle scaling logic, and challenging DevOps overhead. 

The AI community has adopted Ray as the industry’s compute engine for AI—with thousands of organizations, from Coinbase and Attentive to Uber, using it to build, train, and run models in production. Ray provides a flexible and highly efficient framework for distributing Python workloads across heterogeneous compute—including CPUs, GPUs, and TPUs—scaling to thousands of nodes with throughput exceeding millions of tasks per second.

While Ray’s fine-grained parallelism accelerates AI workloads, its simple, Pythonic APIs let developers express distributed workloads for training, inference, and data processing naturally in Python.

Kubernetes + Ray = The Distributed OS for AI

Kubernetes is a natural companion to Ray, providing orchestration, autoscaling, and resource isolation of Ray workloads across heterogeneous infrastructure. Google Cloud has long provided a strong developer experience for running Ray with Anyscale on GKE, with first-class GPU support, low-latency networking, and a robust cluster autoscaler.

Today’s partnership is set to deliver a first-class Ray experience on GKE that together serves as the distributed operating system for AI. As part of this collaboration, GKE users will gain access to Anyscale RayTurbo, Anyscale’s optimized Ray runtime, which provides faster task execution, higher throughput, and improved GPU utilization.

AI teams will be able to start up Anyscale RayTurbo clusters on GKE in seconds, run distributed training jobs, and power sophisticated AI applications efficiently at massive scale. 

Key Components of the Partnership

  • Optimized Ray on GKE: Anyscale RayTurbo delivers multi-modal data processing up to 4.5X faster, up to 54% higher serving QPS, and up to 50% fewer nodes for online model serving—reducing costs by as much as 60%.

  • GKE Enhancements: Google Cloud will add differentiated Kubernetes capabilities that improve Ray’s performance at scale, drive higher efficiency with dynamic resource allocation, and add topology-aware scheduling.

Together, Anyscale RayTurbo on GKE gives AI/ML teams a distributed operating system for AI—with the performance of Ray, the orchestration of Kubernetes, and the scalability of Google Cloud. To learn more and try Anyscale RayTurbo on GKE, sign up here.

About Anyscale

Anyscale, founded by the creators of Ray, is pioneering a new category of AI infrastructure. Our platform empowers AI teams to easily build and scale AI workloads—from multi-modal data processing to training and inference—optimized for modern accelerators. With Anyscale, teams get the best Ray experience and can rapidly scale sophisticated AI applications that transform customer experiences and power the future of autonomy.