Ray Deep Dives

Highly available architectures for online serving in Ray

Tuesday, August 23
3:15 PM - 3:45 PM

In this talk, we present highly available (HA) serving, one of the features of Ray 2.0. Until now, Ray users have achieved high availability for their online serving workloads (with Ray Serve) by load-balancing across multiple Ray clusters. There is, however, a material benefit to eliminating all single points of failure within a single Ray cluster as well: HA serving within a single cluster reduces disruption during head-node failures and can improve the efficiency of the cluster. This talk explains how HA serving works in Ray 2.0 at the architecture level, what functionality it supports, and how you can deploy it in practice.

About Simon

Simon Mo is a software engineer working on Ray Serve at Anyscale. Before Anyscale, he was a student at UC Berkeley, where he participated in research at the RISELab. He focuses on studying and building systems for machine learning, in particular on making ML model serving systems more efficient, ergonomic, and scalable.

Simon Mo

Software Engineer, Anyscale
