
Webinar

Getting Started with Distributed Training at Scale

Ready to move beyond single-GPU limits and master distributed systems? Join us for a webinar where ML and platform engineers will explore how to scale model training from a single node to a massive cluster using PyTorch and Ray.

In this virtual session you will learn:

  • What is distributed training, and why do we need it?

  • Introduction to Distributed Data Parallel (DDP)

  • Advanced techniques beyond plain DDP: ZeRO-1, ZeRO-2, ZeRO-3, and FSDP

  • Introduction to Ray and how you can use Ray Train to train models at scale

  • Training a model at scale with Ray Train and PyTorch (see the sketch after this list)
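
To give a flavor of the workflow covered in the session, here is a minimal sketch of a plain PyTorch training loop wrapped in Ray Train's TorchTrainer. The toy model, random data, hyperparameters, and worker count are illustrative assumptions, not the webinar's actual example; prepare_model handles the DDP wrapping and device placement across workers.

```python
import torch
import torch.nn as nn

import ray.train
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker(config):
    # Illustrative toy model; prepare_model wraps it in DDP and moves it to the right device.
    model = nn.Linear(10, 1)
    model = ray.train.torch.prepare_model(model)

    optimizer = torch.optim.SGD(model.parameters(), lr=config["lr"])
    loss_fn = nn.MSELoss()

    for epoch in range(config["epochs"]):
        # Random data stands in for a real dataset here.
        inputs = torch.randn(32, 10)
        targets = torch.randn(32, 1)

        loss = loss_fn(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Report metrics back to Ray Train from each worker.
        ray.train.report({"epoch": epoch, "loss": loss.item()})


# Scale the same loop to multiple workers by changing num_workers / use_gpu.
trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"lr": 1e-3, "epochs": 2},
    scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
)
result = trainer.fit()
```

The key point the session expands on is that the per-worker loop stays ordinary PyTorch; scaling out is controlled by the ScalingConfig rather than by rewriting the training code.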

This session is more than a demo. You’ll leave with a working understanding of Ray, a reusable project you can build on, and a clear view of how Ray and Anyscale work together to accelerate AI development.

Seats are limited to keep the experience interactive. Reserve your spot today, and come ready to code!