Ray Summit 2022
Modern large language models require distributed training strategies due to their size. Rapid developments in both software and hardware are helping to meet the challenges of training them efficiently and robustly. In this talk, we explore the challenges and design decisions involved in developing a scalable training framework, and present a quantitative analysis of the efficiency improvements gained by adopting new software and hardware solutions such as Ray, JAX pjit, and TPUv4.
Joanna Yoo is a machine learning engineer at Cohere, where she is building a scale-first training framework that powers language models. She uses JAX, TPUv4, and Ray to scale language models to hundreds of billions of parameters.
Kuba Perlin is a machine learning engineer at Cohere, working with JAX, TPUv4, and Ray to scale language models to hundreds of billions of parameters.
Siddhartha Rao Kamalakara is a machine learning engineer and one of the lead developers of FAX. His interests lie at the intersection of systems and ML. He has previously worked on ML for proteins, sparsity, and efficient matrix approximations. Outside of work, he is into filmmaking and photography.