Tuesday, August 23
3:15 PM - 3:45 PM
In this talk, we present highly available (HA) serving, one of the features of Ray 2.0. Until now, Ray users have achieved high availability for their online serving workloads (with Ray Serve) by load-balancing across multiple Ray clusters, but there is a material benefit to eliminating all single points of failure even within a single Ray cluster. HA serving within a single Ray cluster lets users experience less disruption during head-node failures and can improve the efficiency of the cluster. This talk explains how HA serving works in Ray 2.0 at the architecture level, what functionality it supports, and how you can deploy it in practice.
Simon Mo is a software engineer working on Ray Serve at Anyscale. Before Anyscale, he was a student at UC Berkeley, where he participated in research at the RISELab. He focuses on studying and building systems for machine learning, in particular on making ML model serving systems more efficient, ergonomic, and scalable.