
Seiji Eicher
Software Engineer, Distributed LLM Inference
In-person · Las Vegas · April 22-24, 2026
April 22, 2026 | 4:00 PM - 4:25 PM
Learn how to deploy the Qwen model on GKE with Ray Serve and vLLM for fast, scalable inference. Discover how to integrate an ADK agent for advanced chat and tool use, leverage TPU nodes, and use Ray’s autoscaling and fault tolerance to build enterprise-ready agentic AI systems.
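The deployment this session describes can be sketched as a KubeRay RayService manifest on GKE; the resource names, container images, module path, and replica counts below are illustrative assumptions, not the session's actual configuration:

```yaml
# Illustrative sketch only: a KubeRay RayService fronting a vLLM-backed
# Ray Serve application on GKE. All names and images are assumptions.
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: qwen-vllm-serve          # hypothetical name
spec:
  serveConfigV2: |
    applications:
      - name: llm-app
        import_path: serve_qwen:app    # hypothetical module exposing a Serve app
        runtime_env:
          pip: ["vllm"]
  rayClusterConfig:
    headGroupSpec:
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-head
              image: rayproject/ray:latest   # illustrative image tag
    workerGroupSpecs:
      - groupName: tpu-workers     # TPU node pool, per the abstract
        replicas: 1
        minReplicas: 1
        maxReplicas: 4             # bounds for Ray autoscaling
        rayStartParams: {}
        template:
          spec:
            containers:
              - name: ray-worker
                image: rayproject/ray:latest
```

With a manifest like this, KubeRay manages the Ray cluster lifecycle and restarts failed workers, which is the fault-tolerance behavior the abstract refers to.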


Senior AI Engineer
April 24, 2026 | 11:00 AM - 11:45 AM
Join this session to explore Mistral AI's reinforcement learning (RL) strategies and Anyscale's high-performance Ray on Google Kubernetes Engine (GKE). We'll analyze GKE primitives for faster RL loop times, focusing on sampling, weight transfer, and sandboxing for isolation.

Visit the CoreWeave booth (#1815) at 4:30 PM on April 22 to catch a live demo.