Ray Serve is a framework-agnostic, Python-first machine learning model serving library built on Ray. This training covers how Ray Serve makes it easy to deploy, operate, and scale machine learning models using its APIs.
Key takeaways:
- Understand Ray Serve architecture, components, and flow of requests across replicas
- Learn how to use Ray Serve APIs to create and deploy your models, and how to access model deployments via the Python API and HTTP endpoints (see the sketch after this list)
- Implement common patterns for serving ML models by composing deployments into an inference graph expressed as a directed acyclic graph (DAG)
- Scale individual nodes of an inference graph up or down, assigning appropriate hardware resources (GPUs/CPUs) and replica counts
- Use operations-friendly APIs to integrate with your custom CI/CD pipeline
- Inspect load and deployments in the Ray Dashboard
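
To give a taste of the APIs covered in the training, here is a minimal sketch of a Ray Serve deployment, assuming Ray Serve 2.x. The `Translator` class and its trivial logic are illustrative placeholders, not material from the course.

```python
# Minimal Ray Serve deployment sketch (assumes Ray Serve 2.x APIs).
from ray import serve
from starlette.requests import Request


@serve.deployment(num_replicas=2)  # scale out to two replicas
# To pin hardware, deployments also accept e.g. ray_actor_options={"num_gpus": 1}.
class Translator:
    def __init__(self):
        # A real deployment would load a model here; this stand-in is for illustration.
        self.prefix = "translated: "

    async def __call__(self, request: Request) -> str:
        # HTTP requests arrive as Starlette Request objects.
        payload = await request.json()
        return self.prefix + payload["text"]


# Bind the deployment and run it; Serve exposes it over HTTP on port 8000 by default.
serve.run(Translator.bind())
```

Once running, the deployment can be queried over HTTP, for example with `requests.post("http://localhost:8000/", json={"text": "hello"})` from Python.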
Edward Oakes is a software engineer and project lead on the Ray Serve team. He works across the stack at Anyscale, from Ray Core to Ray Serve to the Anyscale platform.
Simon Mo is a software engineer working on Ray Serve at Anyscale. Before Anyscale, he was a student at UC Berkeley participating in research at the RISELab. He focuses on studying and building systems for machine learning, in particular, how to make ML model serving systems more efficient, ergonomic, and scalable.
Come connect with the global community of thinkers and disruptors who are building and deploying the next generation of AI and ML applications.
Save your spot