Ray Meetup Community Talks - January 2023

Thursday, January 26, 1:00AM UTC

We are delighted to kick off the new year with our January Ray Meetup, featuring talks from Ray community users and committers. Join us to hear from the Ray team at Anyscale and from Shopify about Ray and how it is used.

Agenda:
5:00 p.m. Welcome remarks, Year 2022: Ray in Review & upcoming announcements - Jules Damji, Anyscale
Talk 1 (35-40 mins): Monitor & prevent out-of-memory problems with Ray OOM monitor - Clarence Ng, Anyscale
Q & A (10 mins)
Talk 2 (35-40 mins): How Shopify used Ray<>Tensorflow to build a Product Hierarchical Categorization model to auto classify billions of products using NLP and Computer Vision - Kshetrajna Raghavan, Shopify
Q & A (10 mins)

Talk 1: Monitor & prevent out-of-memory problems with Ray OOM monitor
Abstract: It is often difficult to predict or estimate the memory usage of a Ray program. Memory usage is often key for data ingestion or processing, as it impacts the performance and stability of the training task. In the mild case the program slows down; in the worst case it fails, incurring cost and wasted time.

Since the release of Ray 2.0 we have significantly improved the stability of the system. One such improvement is the handling of out-of-memory (OOM) issues triggered by the program. In this talk, we discuss the architecture of the OOM monitor, how to debug OOM issues using the latest observability tools, and a sneak peek at the upcoming change to the algorithm that will make Ray more resilient to OOM issues.

This talk is for you if you're interested in:

How to debug a Ray program that is suffering from OOM issues
How to configure the OOM monitor
The internals of the OOM monitor and how it interacts with some of the common workloads

Bio: Clarence Ng is a software engineer at Anyscale. He has experience building large-scale distributed systems at places such as Amazon and Google, where he built ML ingestion pipelines, a database service, a cluster manager, and an object storage service. He has contributed to the memory management subsystem for Ray.

Talk 2: How Shopify used Ray<>Tensorflow to build a Product Hierarchical Categorization model to auto classify billions of products using NLP and Computer Vision
Abstract: Organizing products using structured metadata is crucial in online retail. This metadata is usually needed by many downstream applications, including search and discovery, trust and safety, and analytics and reporting, among others. At Shopify we like to make the commerce journey as easy as possible for our merchants, and one part of this is using machine learning to predict the product category for the billions of products that our merchants sell.

We will look at how we solved this problem using transfer learning through Natural Language Processing and Computer Vision to create a hierarchical classification deep neural network that categorizes products into a hierarchical tree taxonomy. We will dig deeper into the modeling challenges and how we arrived at specific architecture decisions. We will then dive into how Ray and other tool choices made this work at Shopify scale. The talk will cover how we continuously monitor the performance of the model using both ML and business metrics, and how this feeds into a feedback mechanism that results in better models.

Finally, we will talk about how all of this was built with merchant success front and center in every product and technical decision we made, and we will highlight features built on top of this model that have benefited our merchants.

Bio: Kshetrajna is a Staff Data Scientist at Shopify working on the capital algorithms team. He has built and productionized many models in various domains, including retail, ad-tech, and healthcare. His interests are mainly applied ML and ML systems, and he enjoys solving complex problems to help use machine learning at scale. Outside of work, Kshetrajna loves to spend time with his dogs and play music on his guitar, and he is an avid gamer.