Ray Deep Dives

Highly available architectures for online serving in Ray

Ray Summit 2022

In this talk, we present highly available (HA) serving, one of the features of Ray 2.0. Until now, Ray users have been able to achieve high availability for their online serving workloads (with Ray Serve) by load balancing across multiple Ray clusters, but there is a material benefit to eliminating every single point of failure within a single Ray cluster as well. HA serving within a single Ray cluster reduces disruption during head-node failures and can improve the efficiency of the cluster. This talk explains how HA serving works in Ray 2.0 at the architecture level, what functionality it supports, and how you can deploy it in practice.

About Simon

Simon Mo is a software engineer working on Ray Serve at Anyscale. Before joining Anyscale, he was a student at UC Berkeley, where he participated in research at the RISELab. He focuses on studying and building systems for machine learning, in particular on making ML model serving systems more efficient, ergonomic, and scalable.

About Yi

Yi Cheng is a software engineer at Anyscale and a committer for the Ray project. He is interested in building efficient and reliable computation systems, and has recently focused on Ray's reliability and scalability. Before joining Anyscale, Yi was a software engineer at TigerGraph, Facebook, and Baidu.

Simon Mo

Software Engineer, Anyscale

Yi Cheng

Software Engineer, Anyscale