
How Torc hit 90% GPU utilization and other stories on scaling AI with Ray from Discord, Cubist, and Coinbase

By Katarina Stanley | June 10, 2026

Ray Day: NYC marked the latest stop on Anyscale's 2026 Ray on the Road, a series built to bring the Ray community together in the cities where the work is happening.

For one day in New York, practitioners, Ray creators, and the Anyscale team gathered at Convene, One Liberty Plaza to share stories on building production AI, work through hands-on technical content, and dig into what running real AI workloads actually demands.

From the community: Ray user talks

Christian Stano, Field CTO of Anyscale, opened the day with a look at where AI infrastructure is headed, and why the workloads pushing teams to rethink their stacks (multimodal data at scale, RL training loops, multi-node LLM inference) don't fit cleanly into the tools most teams started with. From there, the morning shifted into four back-to-back talks from practitioners running AI in production on Ray.

The throughline across all four: each team had outgrown a stack that worked yesterday, and Ray was the bet that unlocked what was next.

Multimodal AI is easier with Ray: Torc Robotics on autonomous trucking

Presented by Neil Wadhvana, Staff ML Engineer and ML Ops Tech Lead, Torc Robotics

Neil laid out the stakes for physical AI in autonomous trucking: a ~$200B addressable U.S. long-haul market by 2030, a 160K driver shortage, and ~95% of fatal large-truck crashes traceable to driver-related factors. He then walked through what changed inside Torc's data-driven engineering loop, which spans the full data lifecycle: from production, whether on devices or synthetic, to consumption in analytics, model training, and testing.

Their old multimodal data and AI processing stack was fragmented across five different systems, with duplicated data formats and training loops, and GPUs sat starved while CPUs on the same boxes maxed out on data prep. After consolidating into a single modular Python compute engine with Ray, the team could run independently scaled CPU and GPU node pools to support its full data and AI lifecycle.

This unified engine for heterogeneous compute enabled the team to increase average GPU utilization from 30–40% to around 90%. On a multi-task perception model, moving to Anyscale cut one epoch from 20 minutes to 5, a 4x improvement on the same resources. The same pipelines now scale from 4 TB to 38 TB of training data in roughly the same wall-clock time.
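To make the bottleneck concrete, here is a back-of-the-envelope sketch in plain Python. It treats GPU utilization as the ratio of CPU data-prep throughput to GPU consumption rate, capped at 100%. The `gpu_utilization` function and the single-bottleneck model are illustrative assumptions, not Torc's actual analysis; only the 30–40%, 90%, 20-minute, 5-minute, 4 TB, and 38 TB figures come from the talk.

```python
def gpu_utilization(prep_rate: float, gpu_rate: float) -> float:
    """Fraction of time the GPU is busy when data prep is the only bottleneck.

    prep_rate: batches/second the CPU workers can produce (hypothetical unit).
    gpu_rate:  batches/second the GPU could consume if it were never starved.
    """
    if prep_rate < 0 or gpu_rate <= 0:
        raise ValueError("prep_rate must be non-negative and gpu_rate positive")
    return min(1.0, prep_rate / gpu_rate)


# Figures reported in the talk.
util_before = 0.35  # midpoint of the reported 30-40% range
util_after = 0.90
epoch_before_min, epoch_after_min = 20, 5
data_before_tb, data_after_tb = 4, 38

# Under this simple model, utilization scales linearly with prep throughput,
# so the ratio of utilizations is the extra prep capacity needed per GPU.
prep_scale_needed = util_after / util_before

# Shorter epoch on the same resources.
epoch_speedup = epoch_before_min / epoch_after_min

# Same wall-clock time on more data implies proportionally higher throughput.
throughput_scale = data_after_tb / data_before_tb

print(f"prep throughput needed: {prep_scale_needed:.2f}x")
print(f"epoch speedup: {epoch_speedup:.1f}x")
print(f"implied pipeline throughput: {throughput_scale:.1f}x")
```

Read this way, the reported jump in utilization corresponds to roughly 2.6x more data-prep throughput per GPU, which is the kind of gap a separately scaled CPU pool can close. The 4 TB to 38 TB result implies about 9.5x aggregate throughput, presumably reached by scaling out the node pools rather than on fixed hardware.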

Want to go deeper on Torc's multimodal AI work? Register for the webinar: How Torc Robotics Scales Multimodal AI for Autonomous Driving with Ray.

From open source Ray to Anyscale: Discord's ML platform evolution

Presented by Serrana Aguirregaray, Senior Software Engineer, Machine Learning Platform, Discord

Serrana walked through Discord's two-phase journey: building the first production ML platform on open-source Ray, then migrating to Anyscale once usage outgrew their ops capacity. Discord's ML team first turned to Ray when recommendation models needed more data than single-node XGBoost training could handle.

To build a scalable platform that could support multiple ML use cases – Safety, Ads, Recommendations, Shop, and Content Understanding – across 90+ million daily active users, the team decided to deploy Ray on Kubernetes. The Phase 1 platform combined a custom CLI, Dagster, and KubeRay on GKE, successfully shipping the company’s first large-scale deep learning model into production along with a +200% improvement in Ads ranking. Soon after, adoption rapidly grew across other ML teams.

However, as Ray adoption surged, managing multiple Kubernetes-based Ray clusters created new challenges around configuration management, troubleshooting, reliability, and platform team overhead. The platform team was spending more time keeping the lights on than building models. Moving to Anyscale Platform brought a single control plane across all clusters, declarative configs, and built-in Workspaces for interactive development, while keeping the abstraction stable with the same CLI, the same Dagster, and the same Ray code. Migrating existing models took just a few lines of change and broke nothing engineers already depended on.

Serrana's three takeaways for those building an AI platform with Ray: developer experience is the multiplier, evolve from success rather than failure, and keep the abstractions stable because engineers shouldn’t know or care what manages their cluster.

Learn more about Discord’s first deep learning use case in their engineering blog.

Building a model fitting framework for quant finance with Ray: Cubist/Point72

Presented by Todd Gaugler, Quant Developer, Cubist Systematic Strategies (Point72)

Todd took the audience inside Cubist's Core Research Technology team, which runs on-premise multi-tenant Ray clusters plus Anyscale in the cloud, and explained what makes model fitting in quant finance distinctive – a bias toward linear modeling, lots of time-series data of varying frequency, and "windowed" fitting problems. He walked through three failure modes and their workarounds: over-scheduling data loading tasks (use object reference generators, carefully), wasting cores on fitting tasks (make the scheduler aware of upstream data dependencies and keep Ray's object reference counting protocol in mind), and inefficient memory usage that triggers OOMs (force a contiguous numpy layout when the dataset fits in memory, and return a lightweight metadata reference alongside the data).
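The third workaround can be sketched with plain NumPy. The talk summary does not spell out the mechanism, so this is one plausible reading rather than Cubist's code: a sliced window is a strided view that shares memory with its full parent array, while `np.ascontiguousarray` produces a compact copy that holds only the window. The `make_window` helper and the metadata fields are hypothetical.

```python
import numpy as np


def make_window(data: np.ndarray, start: int, stop: int, cols: slice):
    """Return (metadata, window) for rows [start, stop) and the given columns.

    Hypothetical helper: the metadata dict is small, so it can be inspected
    or passed around cheaply without touching the array itself.
    """
    # Slicing yields a view that shares memory with `data`.
    view = data[start:stop, cols]
    # Compact C-ordered array; NumPy copies only if the view is not
    # already contiguous.
    window = np.ascontiguousarray(view)
    metadata = {
        "rows": (start, stop),
        "shape": window.shape,
        "dtype": str(window.dtype),
        "nbytes": window.nbytes,
    }
    return metadata, window


full = np.arange(1_000 * 8, dtype=np.float64).reshape(1_000, 8)
view = full[100:200, :2]
meta, window = make_window(full, 100, 200, slice(0, 2))

# The raw view is strided and tied to the parent buffer.
print(view.flags["C_CONTIGUOUS"], np.shares_memory(view, full))
# The compact copy is contiguous and independent of the parent.
print(window.flags["C_CONTIGUOUS"], np.shares_memory(window, full))
print(meta)
```

In a Ray task, the two values could be returned as separate objects (for example with `num_returns=2`), so a consumer can read the metadata without fetching the array. Whether that matches Cubist's framework is an assumption.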

For teams exploring similar work, Point72 has also open-sourced two projects built on top of Ray: raydar and csp. Check them out at github.com/point72.

Scaling batch ML with Anyscale Job Queues: Coinbase

Presented by Aman Choudhary, ML Platform Engineer, Coinbase

Aman closed the user talks with a case study on Coinbase's Financial Risk workload: hundreds of batch jobs per day producing risk predictions that inform risk management decisions across the company. After migrating from SageMaker to self-managed Ray in 2023 (cutting iteration time from hours to seconds and training costs by ~20%) and then to Anyscale in 2024, the team partnered with Anyscale over multiple quarters to harden Anyscale Job Queues for production-scale batch ML.

Fixes the teams shipped together included automatic cluster recycling, runtime environment provisioning improvements, a persistent tasks dashboard, and log ingestion latency fixes via Ray 2.50. The result: scaling from ~3K degraded jobs to 10K+ stable jobs, few to no manual interventions, and upstreamed fixes that benefit other Anyscale Job Queues customers.

For more on how Anyscale Job Queues power recurring batch ML workloads, check out Introducing Anyscale Job Queues.

Getting hands-on and looking ahead

In the afternoon, the Anyscale team led parallel workshop tracks so attendees could go deeper on whichever path matched their work.

The Ray track moved through Ray Core fundamentals (tasks, actors, object store), then to building scalable multimodal data pipelines with Ray Data, and finally to distributed training at scale with Ray Train and PyTorch. The VLA track focused on physical AI, starting with Ray as the foundation for distributed physical AI workloads, then walking through large-scale vision-language-action (VLA) model fine-tuning with Ray Data and Ray Train, and closing with robotics simulation using Ray Core to parallelize compute-intensive environments like MuJoCo and Isaac Sim.

A separate invite-only roundtable gave a select group of customers an early look at the Anyscale product roadmap for the coming quarters.

Thank you to our sponsors

A big thank you to AWS, CoreWeave, and Microsoft for sponsoring Ray Day: NYC and supporting the Ray community.

Next stops

NYC was one of several stops on Ray on the Road this year, but the biggest moment for the Ray community is still ahead.

Ray Summit lands in San Francisco on August 24 for three days of keynotes, breakout sessions, and hands-on training with the teams building Ray and the practitioners scaling it in production. It's where the conversations from every Ray Day come together.
