We are excited to announce the release of Ray version 1.6. Release highlights include:
Ray Datasets is a new high-level Ray library that brings native support for large-scale data loading to Ray
Runtime Environments, which was introduced in version 1.4 to enable better dependency packaging for Ray jobs, is now generally available
The Ray Autoscaler now includes support for GCP TPU
Ray Lightning brings fast and easy parallel training to PyTorch Lightning
You can run pip install -U ray
to access these features and more.
Without further ado, let’s dive into these highlights. Do check out the documentation for further details and all the other goodness that launched in this release.
Recently, we’ve seen a growing adoption of Ray for large-scale data processing use cases. Some of those stories (eg. Amazon and Intel) were presented at the recent Ray Summit. Since Ray lacked native support for large-scale data collections, these users either used low-level Ray primitives (eg. task and actors) or 3rd-party library integrations (eg. Spark, Dask, etc) to build and compose their distributed applications. These approaches are not ideal for 2 reasons:
Interoperability
Lack of high-level abstractions makes this less approachable
We are introducing Ray Datasets, a high-level library for loading, exchanging and transforming terabyte-scale datasets, to address this gap. Built on top of Apache Arrow, Ray Datasets brings the power of distributed data processing to all Ray users in a compact and easy-to-use library.
We envision the following use cases for Ray Datasets:
Common format for passing large datasets between application components: users can use the Ray Datasets to pass data between steps of an Ray application as an in-memory object — this is faster and avoids having to write large intermediate results to external storage
Data-intensive ingest and preprocessing for machine learning apps: users can seamlessly convert Ray Datasets into sharded PyTorch or TensorFlow Datasets for distributed training
Basic data loading and transformations: users can use Ray Datasets to apply bulk transformations on semi-structured data, including GPU-based maps and data shuffling
Ray Datasets was introduced as an alpha in version 1.5 and is quickly maturing; we would love your feedback. To learn more about Ray Datasets, visit the Ray documentation. Follow us on Twitter or sign up for the Ray Newsletter to be notified on the deep-dive blog on the topic.
When running Ray jobs, users want to easily package and distribute dependencies so that the worker nodes have all the Python and other OS dependencies they need to successfully run the code. Runtime Environments was introduced in version 1.4 to address this need — it allows you to define the dependencies required for your workload by passing a simple dictionary set parameter when running the job. We are excited to graduate the runtime environment feature to GA status in version 1.6.
Here are some of the use cases that runtime environment simplifies:
Distribute local Python packages and files to workers and actors
Install new Python packages as part of a development workflow
Run tasks/actors needing different or conflicting Python dependencies as part of one Ray application.
Dynamically set the runtime environment of tasks, actors, and jobs so that they can depend on different Python libraries while all running on the same Ray cluster.
Read more about the feature in the documentation. Don’t forget to give it a try and let us know what you think.
TPU, or Tensor Processing Unit, is an application-specific integrated circuits (ASICs) introduced by Google in 2016 to accelerate machine learning workloads, especially deep learning. TPUs power many Google services and products including Translate, Photos, Search, and Gmail. Google Cloud offers TPU based VMs.
In version 1.6, we are adding support for GCP TPU VMs to the Ray Autoscaler. This lowers the barrier to entry for users wanting to run their Ray Clusters on TPUs. TPU pods are currently not supported.
To learn more about this feature, see this example.
PyTorch Lightning is a high-level interface library to PyTorch that abstracts away a lot of the engineering code in PyTorch calls. While it natively supports parallel model training across multiple GPUs, training on multiple nodes on the cloud can be a non-trivial undertaking.
We are excited to introduce Ray Lightning, a simple plugin for PyTorch Lightning, to address these limitations. Ray Lightning provides limitless parallel training to your existing PyTorch Lightning scripts with minimal code changes.
No changes to existing training code.
Scale from 1 GPU to 1000s with 1 parameter tweak
Seamlessly work from Jupyter notebooks
Integrates with Ray Tune for large-scale distributed hyperparameter tuning
To learn more about Ray Lightning, read this deep dive blog from the authors. You can try out Ray Lightning with pip install -U ray_lightning.
That sums up the release highlights. To see all the features and enhancements in this release, visit the release notes.