You trained a ML model, now what? The model needs to be deployed for online serving and offline processing. This talk walks through the journey of deploying your ML models in production. I will cover common deployment patterns backed by concrete use cases which are drawn from 100+ user interviews for Ray and Ray Serve. Lastly, I will cover how we built Ray Serve, a scalable model serving framework, from these learnings.
Simon Mo is a software engineer working on Ray Serve at Anyscale. Before Anyscale, he was a student at UC Berkeley participating in research at the RISELab. He focuses on studying and building systems for machine learning, in particular, how to make ML model serving systems more efficient, ergonomic, and scalable.