We are delighted to host our September Ray Meetup with talks from Ray community users and committers.
Welcome remarks, announcements, agenda, Ray Summit 2022 Highlights - Jules Damji, Anyscale
5:10 p.m. Talk 1: Data transfer speed comparison in a distributed ML application: Ray Plasma Store vs. S3 - Ankur Mohan, Capital One
Q & A
Talk 2: State of Ray Serve in Ray 2.0 - Simon Mo, Anyscale, Ray Team
6:50 p.m. Q & A
Talk 1: Data transfer speed comparison in a distributed ML application: Ray Plasma store vs. S3
Scattering and gathering data within and across nodes in a compute cluster are frequent operations in distributed applications. An efficient, distributed data store that provides concurrent, high-bandwidth reads and writes for large data is therefore essential for scaling such applications. Several choices exist for such data stores. Some options, such as network file systems and cloud storage solutions like S3, are independent of the distributed computing framework (e.g., Dask, Spark, Ray), whereas others are an integral part of the framework (e.g., the Plasma store in Ray).
This talk will present a detailed analysis of the pros and cons of some of these options for distributed data stores, using Ray as the distributed computing framework. We’ll use a speech2text application that needs to load a deep neural network model and scatter this model to several workers that perform the speech2text transcription task on audio files concurrently.
We will analyze two scenarios. In the first scenario, the model is downloaded to the plasma store and scattered to workers performing speech2text conversion via object store replication. In the second scenario, each worker downloads the model independently from cloud storage (S3). The analysis of each scenario will consider various possibilities for scheduling the speech2text workers, ranging from all workers being scheduled on the same node to all workers executing on separate nodes.
The experimental setup will be a RayCluster installed using the Ray-Kubernetes operator.
Attendees will come away with a thorough understanding of RayCluster setup on Kubernetes using the operator pattern, and of the trade-offs and nuances involved in data transfer in a distributed system.
Bio: Ankur Mohan leads the Machine Learning as a Service (MLaaS) team at Capital One, which enables the execution of ML models and workflows on a Kubernetes cluster with assured least-privilege data access, compute and network isolation across tenants, and robust support for scaling and observability. Prior to Capital One, Ankur led the AI/ML practice at In-Q-Tel, a VC firm. Outside of work, Ankur enjoys biking, kitesurfing, traveling, and blogging. The material presented in this talk is Ankur's personal work and unrelated to his job at Capital One.
Talk 2: State of Ray Serve in Ray 2.0
Abstract: Ray 2.0 was released in late August, and it includes many features that help you scale machine learning model serving in production! In this talk, we'll discuss the motivation behind Ray Serve, who's using Ray Serve and why, and recent features and updates in Ray 2.0.
In particular, we will cover Ray Serve autoscaling, the brand-new deployment feature for multi-model composition, and updates to Serve’s fault tolerance on Kubernetes. Bring your questions about ML models in production as well.
Bio: Simon Mo is a software engineer working on Ray Serve at Anyscale. Before Anyscale, he was a student at UC Berkeley participating in research at the RISELab. He focuses on studying and building systems for machine learning, in particular, how to make ML model serving systems more efficient, ergonomic, and scalable.