In this talk, we will present our effort to run Ray on YARN and to integrate Ray with LinkedIn’s open-source offline infrastructure: Azkaban (a workflow scheduling service) and TonY (TensorFlow on YARN). We will demo a Ray job running end-to-end, discuss the architectural decisions, and talk about our cooperation with the Ray team on this effort.
In the latter half of the talk, we will share how we used Ray Tune on Kubernetes in a real-world use case. Tune helped us identify promising model configurations with minimal supervision, using state-of-the-art search methods such as TPE (Tree-structured Parzen Estimator, a Bayesian optimization algorithm) and PBT (Population Based Training). We also used Tune to optimize the data pipeline for maximum throughput, which let us train 2x faster by reducing GPU idle time.
Jonathan Hung is a senior software engineer on the Hadoop development team at LinkedIn.
Nitin Pasumarthy works in Applied Machine Learning at LinkedIn, where he experiments with new ways to improve member experience by making the site faster. He is currently working on models that predict network quality for every user in real time. Prior to this, he was a full-stack developer with a focus on databases and UI. He is a big movie buff and regularly plays cricket, volleyball, and badminton.