ML Infra + Apps

Running a question-answering system on Ray Serve

Ray Summit 2022

In this talk, we discuss how to run a question-answering system on Ray Serve. To that end, we'll cover the key architectural components used in question-answering systems, such as a document store, an indexing pipeline, and a querying pipeline. Haystack is an open source framework that lets you connect multiple state-of-the-art transformer NLP models into a single pipeline. Using Ray, we'll discuss how to deploy it all for GPU-powered inference. Key learnings include: how to assemble NLP models into inference pipelines; how to run Hugging Face models on Ray Serve; how to deploy an NLP model pipeline using Ray Serve; and how to access persistent storage from code deployed on Ray Serve.

About Tobias

Building open source NLP frameworks and operating them at scale has been the focal point of Tobias's career.
After being one of the core maintainers of the Rasa Open Source framework, he is currently working at deepset where he helps build a semantic search platform for question answering.

Tobias Wochinger

Senior Software Engineer, deepset

Ready to Register?

Come connect with the global community of thinkers and disruptors who are building and deploying the next generation of AI and ML applications.

Save your spot

Join the Conversation

Ready to get involved in the Ray community before the conference? Ask a question in the forums. Open a pull request. Or share why you’re excited with the hashtag #RaySummit on Twitter.