ML Infra + Apps

Running a question-answering system on Ray Serve

Tuesday, August 23
11:30 AM - 12:00 PM

In this talk, we discuss how to run a question-answering system on Ray Serve. To that end, we'll cover the key architectural components of question-answering systems: a data store, an indexing pipeline, and a querying pipeline. Haystack is an open source framework that lets you combine multiple state-of-the-art transformer-based NLP models into a single pipeline. We'll discuss how to deploy it all with Ray for GPU-powered inference. Key learnings include how to assemble NLP models into pipelines for inference; how to run Hugging Face models on Ray Serve; how to deploy an NLP model pipeline using Ray Serve; and how to access persistent storage from code deployed on Ray Serve.
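To make the three architectural components concrete, here is a minimal sketch in plain Python. It is not Haystack's or Ray Serve's API: the class names (`InMemoryDocumentStore`, `IndexingPipeline`, `QueryPipeline`) are hypothetical, the store is an in-memory dictionary standing in for persistent storage, and a simple word-overlap score stands in for the transformer models a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str


@dataclass
class InMemoryDocumentStore:
    """Hypothetical stand-in for a persistent data store."""
    docs: dict = field(default_factory=dict)

    def write(self, documents):
        for doc in documents:
            self.docs[doc.doc_id] = doc

    def all(self):
        return list(self.docs.values())


def tokenize(text):
    """Lowercase, whitespace-split tokens with basic punctuation stripped."""
    tokens = (t.strip(".,?!").lower() for t in text.split())
    return [t for t in tokens if t]


class IndexingPipeline:
    """Turns raw texts into documents and writes them to the store."""

    def __init__(self, store):
        self.store = store

    def run(self, raw_texts):
        start = len(self.store.docs)  # keep ids unique across runs
        documents = [
            Document(doc_id=f"doc-{start + i}", text=text.strip())
            for i, text in enumerate(raw_texts)
        ]
        self.store.write(documents)
        return len(documents)


class QueryPipeline:
    """Retrieves the documents most relevant to a question."""

    def __init__(self, store, top_k=1):
        self.store = store
        self.top_k = top_k

    def retrieve(self, question):
        query_tokens = set(tokenize(question))
        scored = [
            (len(query_tokens & set(tokenize(doc.text))), doc)
            for doc in self.store.all()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in scored[: self.top_k] if score > 0]

    def run(self, question):
        # Assumption: a real system would pass these hits to a
        # transformer reader model to extract the answer span.
        return {"question": question, "documents": self.retrieve(question)}


store = InMemoryDocumentStore()
IndexingPipeline(store).run([
    "Ray Serve is a model serving library built on Ray.",
    "Berlin is the capital of Germany.",
])
result = QueryPipeline(store).run("What is the capital of Germany?")
print(result["documents"][0].text)
```

Keeping indexing and querying as separate objects that share one store mirrors the split described in the abstract: indexing can run as a batch job, while the querying pipeline is the part that would sit behind a serving layer.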

About Dmitry

Dmitry Goryunov started his career in 2007 and has since enjoyed building distributed software systems in areas like market research, e-commerce, and health tech. For the last few years, Dmitry has worked as an MLOps engineer, making sure that ML models reach end users. He currently lives in Berlin and works at deepset, where he contributes to a platform for question answering and semantic search systems.

Dmitry Goryunov

MLOps Engineer, deepset