Ready to move beyond single-GPU limits and master distributed systems? Join us for a webinar where ML and platform engineers will explore how to scale model training from a single node to a massive cluster using PyTorch and Ray.
In this virtual session you will learn:
What is distributed training, and why do we need it?
Introduction to Distributed Data Parallel (DDP)
Advanced distributed training techniques: ZeRO-1, ZeRO-2, ZeRO-3, and FSDP
Introduction to Ray and how you can use Ray Train to train models at scale
Training a model at scale using Ray Train and PyTorch (a minimal sketch of this pattern follows this list)
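To give a flavor of the hands-on portion, here is a minimal sketch of the Ray Train + PyTorch pattern: a per-worker training function that Ray distributes across a cluster, with DDP handled by prepare_model. The toy linear model, random dataset, and two-worker scaling config are placeholder assumptions for illustration, not the project built in the session.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

import ray.train.torch
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker(config):
    # Toy model and data for illustration; replace with your own.
    model = nn.Linear(10, 1)
    model = ray.train.torch.prepare_model(model)  # wraps the model in DDP and moves it to the right device

    dataset = TensorDataset(torch.randn(1024, 10), torch.randn(1024, 1))
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    loader = ray.train.torch.prepare_data_loader(loader)  # adds a DistributedSampler for multi-worker training

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for epoch in range(config["epochs"]):
        for x, y in loader:
            loss = loss_fn(model(x), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # Report metrics back to Ray Train after each epoch.
        ray.train.report({"epoch": epoch, "loss": loss.item()})


# Placeholder scaling config: 2 CPU workers; set use_gpu=True and raise num_workers on a real cluster.
trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"epochs": 2},
    scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
)
result = trainer.fit()
```

The key idea the session expands on is that the same training function scales from a laptop to a multi-node cluster by changing only the ScalingConfig.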
This session is more than a demo. You’ll leave with a working understanding of Ray, a reusable project you can build on, and a clear view of how Ray and Anyscale work together to accelerate AI development.
Seats are limited to keep the experience interactive. Reserve your spot today, and come ready to code!