TRAINING: Machine learning model deployment and serving with Ray Serve

Ray Serve is a framework-agnostic, Python-first model serving library built on Ray. This training covers how to deploy, operate, and scale machine learning models using the Ray Serve APIs.

Key takeaways:
- Understand Ray Serve architecture, components, and flow of requests across replicas
- Learn how to use Ray Serve APIs to create and deploy your models, and how to access model deployments via Python APIs and HTTP endpoints
- Implement common patterns for serving ML models by composing them into a directed acyclic graph (DAG) with the inference graph API
- Scale individual nodes of an inference graph up or down, assigning appropriate hardware resources (GPUs/CPUs) and replica counts
- Use operations-friendly APIs to integrate with your custom CI/CD pipelines
- Inspect load and deployments in the Ray dashboard

About Edward

Edward Oakes is a software engineer and project lead on the Ray Serve team. He works across the stack at Anyscale, from Ray Core to Ray Serve to the Anyscale platform.

About Simon

Simon Mo is a software engineer working on Ray Serve at Anyscale. Before Anyscale, he was a student at UC Berkeley participating in research at the RISELab. He focuses on studying and building systems for machine learning, in particular, how to make ML model serving systems more efficient, ergonomic, and scalable.

Ray Summit 2022
