Posts by Clark Zinzow

02 . 14 . 2022

Ray Datasets for large-scale machine learning ingest and scoring

We're happy to introduce Ray Datasets, a data loading and preprocessing library built on Ray. It uses Ray’s task, actor, and object APIs to enable large-scale machine learning ingest, training, and inference within a single Python application.

3rdGen, Tasks and Actors
11 . 30 . 2021

Deep Dive: Data Ingest in a Third Generation ML Architecture

Distributed libraries improve performance by exploiting the full bandwidth of distributed memory, and they offer greater programmability. But how does that actually work? What does the code look like?

In this post, we’ll be looking at a concrete...

02 . 16 . 2021

Data Processing Support in Ray

This blog post highlights two features in the latest Ray 1.2 release: native support for spilling to external storage, and support for libraries from the Python data processing ecosystem, including integrations for PySpark and Dask.