All Posts

Asawari Samant
05 . 11 . 2021

Why I joined Anyscale

A bold vision — to make distributed computing simple and accessible from anywhere. Open source model, with a developer-first mindset. A vibrant and rapidly growing community. And a stellar team, with a track record of building great technology to sol...

05 . 04 . 2021

Why you should build your AI Applications with Ray

Business logic and model inference have traditionally been handled by different systems. This post describes how Ray breaks down silos by supporting both these workloads seamlessly, allows developers to build and scale microservices as if they were a...

04 . 21 . 2021

Attention Nets and More with RLlib's Trajectory View API

In this post, we’re announcing two features that are now stable in RLlib: support for attention networks as custom models, and the “trajectory view API”. RLlib is a popular reinforcement learning library that is part of the open-source Ray project.

03 . 30 . 2021

Online Resource Allocation with Ray at Ant Group

Double 11 has become the largest online shopping event in the world. To support this level of online activity, Ant Group has implemented a flexible, high-performance, stable, and scalable online resource allocation system based on Ray.

03 . 22 . 2021

Executing a distributed shuffle without a MapReduce system

A distributed shuffle is a data-intensive operation that usually calls for a system built specifically for that purpose. In this blog post, we’ll show how a distributed shuffle can be expressed in just a few lines of Python using Ray, a general-purpo...

03 . 03 . 2021

How to Speed Up Pandas with Modin

The pandas library provides easy-to-use data structures like pandas DataFrames as well as tools for data analysis. One issue with pandas is that it can be slow with large amounts of data. It wasn’t designed for analyzing 100 GB or 1 TB datasets. Fort...

03 . 02 . 2021

Getting Started with Distributed Machine Learning with PyTorch and Ray

Ray is a popular framework for distributed Python that can be paired with PyTorch to rapidly scale machine learning applications.

02 . 16 . 2021

Data Processing Support in Ray

This blog post highlights two features in the latest Ray 1.2 release: native support for spilling to external storage, and support for libraries from the Python data processing ecosystem, including integrations for PySpark and Dask.

02 . 10 . 2021

Retrieval Augmented Generation with Huggingface Transformers and Ray

Huggingface Transformers recently added the Retrieval Augmented Generation (RAG) model, a new NLP architecture that leverages external documents (like Wikipedia) to augment its knowledge and achieve state-of-the-art results on knowledge-intensive tas...

02 . 03 . 2021

How to Speed up Scikit-Learn Model Training

This post gives an overview of different ways to speed up your scikit-learn models and discusses some limitations of each approach.