
Ray Summit

Software 2.0 Needs Data 2.0: A New Way of Storing and Managing Data for Efficient Deep Learning

Wednesday, June 23 1:35 PM PDT

90% of the data we generate every day is unstructured. However, the current solutions for storing that data (databases, data lakes, and data warehouses, or the "Data 1.0" minions) are unfit for unstructured data. As a result, data scientists today work with unstructured data the way developers worked with data in the pre-database era. This slows down ML cycles, creates bottlenecks in data access and transfer, and forces data scientists to wrangle data instead of training models.

Creating Software 2.0 requires a new way of working with unstructured data, which we explore in this session. We present Data 2.0, a framework that brings all types of data under one umbrella and represents them in a unified tensorial form native to deep neural networks. Its streaming approach lets machine learning models be trained and deployed, in both compute-bottlenecked and data-bottlenecked operations, as if the data were local to the machine. In addition, it supports version control and collaboration on petabyte-scale datasets, exposed as single NumPy-like arrays in the cloud or locally. Lastly, we describe how we use Ray to improve our workflows.

Speakers

Davit Buniatyan

Founding CEO, Activeloop
