Materials & Resources
Along with a demo, the talks will cover three functional areas of model serving with Ray Serve:
An overview of Ray Serve features, functionality, and roadmap
Building multi-model inference pipelines with Ray Serve and scaling them with Ray
Operationalizing Ray Serve
Join us if you are interested in serving and operationalizing ML models at scale using Ray Serve!
6:00 PM Welcome remarks, announcements & agenda by Jules Damji, Anyscale
6:05 PM “Ray Serve: Overview and roadmap,” Edward Oakes, Anyscale
6:15 PM Q&A
6:20 PM “Developing and deploying scalable multi-model inference pipelines,” Jiao Dong, Anyscale
6:45 PM Q&A
7:00 PM “Operationalizing Ray Serve,” Shreyas Krishnaswamy, Anyscale
7:25 PM Q&A
7:30 PM Demo
7:45 PM Q&A
Talk 1: Ray Serve: Overview and roadmap
In this introductory session, we’ll discuss the motivation behind Ray Serve, who’s using it and why, and recent features and updates, including a look at the roadmap as we approach Ray 2.0.
Talk 2: Developing and deploying scalable multi-model inference pipelines
In this talk, we show how to leverage Ray’s programmable, general-purpose distributed computing capabilities to author, orchestrate, scale, and deploy complex serving pipelines as a DAG under one set of APIs, with each stage managed like a microservice. Learn how you can compose multiple models dynamically on your laptop as if you were writing a local Python script, deploy to production at scale, and upgrade each model individually.
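To make that concrete, here is a minimal sketch of such a pipeline, assuming the Ray 2.0 deployment graph API (InputNode, DAGDriver, and .bind()); the deployments and their logic are hypothetical placeholders, not the examples from the talk:

    import ray
    from ray import serve
    from ray.serve.dag import InputNode
    from ray.serve.drivers import DAGDriver

    @serve.deployment
    def preprocess(x: int) -> int:
        # Hypothetical preprocessing stage.
        return x * 2

    @serve.deployment
    class Adder:
        def __init__(self, increment: int):
            self.increment = increment

        def add(self, x: int) -> int:
            # Hypothetical model stage.
            return x + self.increment

    # Author the pipeline as a DAG: the output of one node feeds the next.
    with InputNode() as user_input:
        dag = Adder.bind(1).add.bind(preprocess.bind(user_input))

    # Run locally; each deployment can later be scaled and upgraded individually.
    handle = serve.run(DAGDriver.bind(dag))
    print(ray.get(handle.predict.remote(3)))  # preprocess(3) -> 6, add 1 -> 7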
Talk 3: Operationalizing Ray Serve
In this session, we will introduce a new declarative REST API for Ray Serve that lets you configure and update your Ray Serve applications without modifying application code. You can incorporate this API into your existing CI/CD process to manage Ray Serve applications as part of your MLOps lifecycle.
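As a rough sketch of what driving that API from a CI/CD job could look like, the snippet below PUTs a declarative config to the Serve REST endpoint exposed by the Ray dashboard. The endpoint paths, port, and config fields are assumptions based on early Ray 2.0, and the import_path value is a hypothetical module:variable pointing at a Serve graph:

    import requests

    # Declarative config: which graph to run and how to scale each deployment.
    # "pipeline:dag" is a hypothetical import path, not a real module.
    config = {
        "import_path": "pipeline:dag",
        "runtime_env": {},
        "deployments": [
            {"name": "Adder", "num_replicas": 2},
            {"name": "preprocess", "num_replicas": 1},
        ],
    }

    # Assumes the Ray dashboard serves the REST API on its default port (8265).
    resp = requests.put("http://localhost:8265/api/serve/deployments/", json=config)
    resp.raise_for_status()

    # Check deployment statuses to confirm the update rolled out.
    status = requests.get("http://localhost:8265/api/serve/deployments/status").json()
    print(status)

Because the config is a plain file of settings rather than application code, a CI/CD pipeline can version it, diff it, and apply updates (for example, changing num_replicas) without redeploying the models themselves.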