Join us for our next Ray Meetup, where we’ll explore batch inference at scale with Ray and vLLM! Learn how Pinterest scales batch inference using Ray, and get a first look at Anyscale’s latest tools (Ray Serve and Data LLM) for orchestrating large-scale LLM inference. We’ll cover topics like batch inference, prefill-decode disaggregation, DP/EP parallelism, and custom request routing.
📆 Tuesday, June 10th, 2025
🕔 5:00pm
📌 55 Hawthorne St, San Francisco
Speakers:
Chia-Wei Chen, Software Engineer, ML Training Infra, Pinterest
Kourosh Hakhamaneshi, AI Lead, Anyscale
Learn more and register here.