Ray Meetup

Ray Community Talks: Grid.ai, DeepChem.io, and ByteDance

Thursday, May 26, 1:00 AM UTC

Each month we get together to discuss Ray and Ray’s native libraries for scaling machine learning workloads. For May’s meetup, we have invited Ray community members Stanley Bishop (DeepChem.io), Jiaxin Shan (ByteDance), and Aniket Maurya (Grid.ai) to share how they use Ray to solve challenging ML problems.

Agenda

(The times are not strict; they will vary slightly.)

  • 6:00 p.m.: Welcome remarks, announcements, and agenda - Jules Damji, Anyscale

  • 6:05 p.m.: Talk 1: Deep learning for protein engineering with Ray - Stanley Bishop, DeepChem.io

  • 6:35 p.m.: Q&A

  • 6:40 p.m.: Talk 2: Introduction to KubeRay - Dmitri Gekhtman, Anyscale & Jiaxin Shan, ByteDance

  • 7:20 p.m.: Q&A

  • 7:25 p.m.: Talk 3: AutoML with PyTorch and Ray - Aniket Maurya, Grid.ai

  • 7:50 p.m.: Q&A

Talk 1: Deep learning for protein engineering with Ray

We will discuss Ray as an active-learning orchestrator for protein engineering in the drug discovery process. In particular, we will look at deploying systems that involve active-learning feedback between sequence-to-sequence transformers and AlphaFold-driven sequence-to-structure prediction and, more broadly, at how these two deep learning methods are revolutionizing the field.

Stanley Bishop is an ML-nerd contributor to the open-source project DeepChem.io, which works to democratize deep learning for science.

Talk 2: Introduction to KubeRay

In this introductory session, we will introduce KubeRay, a Ray cluster management tool built on top of Kubernetes. We will talk about the motivation behind KubeRay, the difference between KubeRay and ray-operator in the Ray core, recent v0.2.0 features, and future updates.

Dmitri Gekhtman is a software engineer on the Infrastructure team at Anyscale. His areas of focus include autoscaling and the integration of Ray into Kubernetes environments.

Jiaxin Shan is a software engineer focusing on serverless infrastructure and cloud-native adoption at ByteDance.

Talk 3: AutoML with PyTorch and Ray

Gradsflow is an open source AutoML library based on PyTorch. It provides automatic model building and training for various deep learning tasks like image classification and text classification. Furthermore, it leverages Ray for hyperparameter tuning and scaling training on a laptop or a cluster of machines. In this talk, we will look at the internals of Gradsflow with Ray and how to tune hyperparameters of your model quickly.

Aniket is a machine learning software engineer and is currently a developer advocate at Grid.ai.


Speakers

Dmitri Gekhtman

Software Engineer, Anyscale

Dmitri Gekhtman is a software engineer on the Infrastructure team at Anyscale. His areas of focus include autoscaling and the integration of Ray into Kubernetes environments.

Stanley Bishop

Head of Data Science, Drift Biotechnologies

At Drift Biotechnologies, Stanley is head of data science and focuses on using deep learning to understand evolutionary patterns in viruses and bacteria. He has worked in machine learning for over a decade and recently transitioned to applying ML to bioinformatics after being an early contributor to the COVID-19 tracking project.

Jiaxin Shan

Software Engineer, ByteDance

Jiaxin Shan is a software engineer focusing on serverless infrastructure and cloud-native adoption at ByteDance.

Aniket Maurya

ML Software Engineer & Developer Advocate, Grid.ai

Aniket is a machine learning software engineer and is currently a developer advocate at Grid.ai.