ML Infra + Apps

Scalable feature engineering with Hamilton on Ray

Ray Summit 2022

Hamilton (https://github.com/stitchfix/hamilton) is an open-source, declarative, general-purpose dataflow micro-framework written in Python. It was originally created at Stitch Fix to manage the complexity of scaling a team alongside a time-series feature engineering code base that grew past thousands of features. At a high level, this talk covers what Hamilton is and why it was created, how to use it for feature engineering, and how to scale computation with its out-of-the-box Ray integration. At a low level, through code in the slides and a quick demo, you'll walk away with an understanding of how a Data Science team at Stitch Fix scaled their team and code base with Hamilton, what the declarative API paradigm it prescribes looks like compared with traditional approaches, and how the Ray integration with Hamilton works and how you can use it.
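To make the declarative idea concrete, here is a toy sketch in plain Python. It is not Hamilton's code or API. It assumes one convention: a function's name is the output it produces, and its parameter names are the inputs or upstream outputs it depends on. The feature functions, the `execute` helper, and the sample data are all hypothetical and exist only for illustration.

```python
import inspect


# Hypothetical feature functions. The function name is the output name;
# the parameter names declare what that output depends on.
def spend_mean(spend: list) -> float:
    """Mean of the spend series."""
    return sum(spend) / len(spend)


def spend_zero_mean(spend: list, spend_mean: float) -> list:
    """Spend with its mean subtracted; depends on another function's output."""
    return [s - spend_mean for s in spend]


def spend_per_signup(spend: list, signups: list) -> list:
    """Element-wise spend divided by signups."""
    return [s / n for s, n in zip(spend, signups)]


def execute(functions, inputs, wanted):
    """Toy resolver (an assumption, not Hamilton's implementation).

    Computes each requested output by recursively resolving the
    parameter names of the function that produces it.
    """
    registry = {fn.__name__: fn for fn in functions}
    cache = dict(inputs)  # raw inputs are already "computed"

    def resolve(name):
        if name not in cache:
            fn = registry[name]  # KeyError if name is neither input nor function
            params = inspect.signature(fn).parameters
            cache[name] = fn(**{p: resolve(p) for p in params})
        return cache[name]

    return {name: resolve(name) for name in wanted}


result = execute(
    [spend_mean, spend_zero_mean, spend_per_signup],
    inputs={"spend": [10, 20, 30, 40], "signups": [1, 2, 3, 4]},
    wanted=["spend_zero_mean", "spend_per_signup"],
)
```

Because every dependency is spelled out in a function signature, the caller only asks for outputs and never writes the orchestration by hand. That same explicit graph is what would let an executor hand individual functions to a distributed backend such as Ray; the details of Hamilton's actual Ray integration are covered in the talk and are not reproduced here.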

About Stefan

Stefan Krawczyk loves the stimulus of working at the intersection of design, engineering, and data. He grew up in New Zealand, speaks Polish, and spent formative years at Stanford, LinkedIn, Nextdoor, and Idibon. He currently leads the Model Lifecycle Team at Stitch Fix. Outside of work in a pre-COVID world, Stefan liked to swim, eat tacos, drink beer, and travel; for the past two years, he has instead biked, eaten tacos, and baked sourdough.

Stefan Krawczyk

Mgr. Data Platform, Stitch Fix
