Anyscale Cuts Multimodal AI Data Processing Costs by 80% with NVIDIA RTX PRO 4500 Blackwell

San Francisco, CA — March 16, 2026 — Anyscale, founded by the creators of Ray, today announced upcoming capabilities in Ray and the Anyscale platform designed to help teams build and deploy AI workloads at production scale. As more teams seek to build differentiated AI, whether fine-tuning vision-language-action models (VLAs) in robotics or scaling enterprise document processing for RAG and search, transforming complex data modalities such as images, video, and documents into AI-ready datasets remains a critical bottleneck in both building and deploying models in production.

To unlock new levels of ROI on AI investments, today we are announcing the integration of Ray Data with NVIDIA cuDF. The integration enables GPU-native multimodal data processing at 80% lower cost with the NVIDIA RTX PRO 4500 Blackwell Server Edition, available soon on AWS EC2.

In addition, as the industry adopts more complex model-development workflows such as large-scale reinforcement learning for LLMs, Anyscale is introducing rack-aware scheduling in Ray for NVIDIA GB300 NVL72 clusters, enabling optimal placement of distributed AI workloads to leverage NVIDIA's high-speed NVLink interconnect technology.

“AI systems are growing in complexity, from reinforcement learning pipelines that combine simulation, data generation, training, and inference, to multimodal data preparation for RAG and robotics,” said Robert Nishihara, Co-Founder of Anyscale. “Ray serves as a unified compute engine across all of these GPU-powered workloads, giving teams programmatic control to place workloads on the hardware best suited for the job, whether that's NVIDIA RTX PRO 4500 Blackwell for data preparation or NVIDIA GB300 NVL72 for large training runs.”

These advancements to Ray reflect Anyscale’s commitment to advancing open-source AI at scale and making it production-ready for every organization. As AI builders expand training, fine-tuning, and reinforcement learning with multimodal data pipelines – where text, documents, images, and video are processed on GPUs – the ability to efficiently orchestrate AI infrastructure at scale is becoming mission-critical to accelerate end-to-end experimentation.

Multimodal Data Processing with cuDF in Ray Data

Modern AI pipelines are no longer training-only workloads. Preparing text, images, video, and multimodal embeddings increasingly relies on GPUs, as these steps often use AI models directly. 

As demand for multimodal data pipelines grows – from processing documents with tables for retrieval and search applications, to preparing logs and images to fine-tune vision-language models (VLMs), to continuously analyzing user activity as part of reinforcement learning systems – inefficient orchestration can quickly limit performance.

To support this shift, Ray is expanding its GPU-native data processing capabilities by adding support for NVIDIA cuDF within Ray Data. Ray Data simplifies distributed multimodal data processing and batch model inference across heterogeneous (CPU and GPU) clusters. With the cuDF integration, teams can run GPU-accelerated structured data processing co-located with training on GPU clusters, including the new GB300. On initial large-scale data deduplication tasks, Ray Data's new capabilities reduce costs by 80% on RTX PRO 4500 Blackwell compared to equivalent CPU-only pipelines. These capabilities enable data preparation and training to operate as a unified distributed system rather than separate infrastructure layers, reducing bottlenecks and improving end-to-end throughput.
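The announcement does not include code, but because cuDF mirrors the pandas API, an exact-deduplication step can be sketched as an ordinary batch transform. The sketch below uses pandas as a CPU stand-in with made-up column names and sample data; swapping `import pandas as pd` for `import cudf as pd` would run the same logic on the GPU.

```python
import pandas as pd  # cuDF mirrors this API: `import cudf as pd` runs the same code on GPU

def dedup_batch(batch: pd.DataFrame) -> pd.DataFrame:
    """Drop rows whose `text` column is an exact duplicate.

    With cuDF, hashing and duplicate elimination execute on the GPU;
    this pandas version has identical semantics on CPU.
    """
    return batch.drop_duplicates(subset=["text"], keep="first")

# Illustrative sample batch.
batch = pd.DataFrame({
    "doc_id": [1, 2, 3, 4],
    "text": ["hello world", "foo bar", "hello world", "baz"],
})
deduped = dedup_batch(batch)
print(len(deduped))  # 3 unique documents remain
```

Inside Ray Data, such a function could be applied across a cluster with `ds.map_batches(dedup_batch, batch_format="pandas")`, letting Ray handle partitioning and GPU placement.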

Rack-Aware Scheduling for Large-Scale AI Workloads

The NVIDIA GB300 NVL72 platform introduces a new class of AI infrastructure, delivering up to 72 GPUs per rack connected by ultra-high-bandwidth, low-latency NVLink interconnects. While a single rack provides exceptional density, advanced AI workloads routinely scale to 100–500+ GPUs spanning multiple racks. At this scale, how workloads map onto the physical topology directly impacts performance and efficiency.

To address this challenge, Ray introduces rack-aware scheduling, enabling distributed workloads to be explicitly mapped to the physical topology of NVIDIA GB300 NVL72 clusters.

With rack-aware scheduling, developers use simple Python APIs to express placement intent for tightly coupled tasks such as distributed training jobs, gradient synchronization, reinforcement learning learners, and GPU-intensive data preprocessing pipelines. Ray automatically coordinates scheduling to keep communication-intensive workloads within the same rack based on user specifications, maximizing intra-rack bandwidth and reducing costly cross-rack traffic.
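The announcement does not show the placement API itself. As a rough illustration of the scheduling intent, the sketch below implements a best-fit rack-selection rule over a hypothetical node inventory (the node and rack names are made up, and this is not Anyscale's implementation). In Ray today, a related intent can already be expressed with placement groups using the `PACK` strategy.

```python
from collections import defaultdict

def pick_rack(nodes, gpus_needed):
    """Pick a single rack that can host every GPU of a tightly coupled
    group, keeping collective communication on intra-rack NVLink.

    `nodes` maps node_id -> (rack_id, free_gpus). Returns the rack with
    the least spare capacity that still fits (best-fit), or None when
    the group has to span racks.
    """
    free_per_rack = defaultdict(int)
    for rack_id, free_gpus in nodes.values():
        free_per_rack[rack_id] += free_gpus
    candidates = [(free, rack) for rack, free in free_per_rack.items()
                  if free >= gpus_needed]
    if not candidates:
        return None  # caller falls back to cross-rack placement
    return min(candidates)[1]

# Hypothetical inventory: two partially occupied NVL72-style racks.
cluster = {
    "node-0": ("rack-a", 16),
    "node-1": ("rack-a", 8),
    "node-2": ("rack-b", 64),
}
print(pick_rack(cluster, 24))  # rack-a: an exact fit beats rack-b's slack
```

Best-fit selection leaves the larger contiguous block of free GPUs in rack-b available for the next big job, which is the kind of trade-off a topology-aware scheduler makes automatically.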

Advancing GPU Utilization at Multi-Rack Scale

Organizations already rely on the Anyscale platform to efficiently operate large NVIDIA H100 and H200 GPU fleets, achieving over 80% GPU utilization in production environments. Rack-aware scheduling extends this foundation to next-generation GB300 systems, helping teams translate rack-scale GPU density into improved workload performance and more effective use of scarce GPU compute. These capabilities complement Anyscale’s broader AI workload orchestration features, including:

  • Priority-aware orchestration for fair sharing of GPU resources across regions or cloud providers.

  • Fine-grained fractional GPU allocation to pack more work onto every GPU.
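Fractional allocation can be pictured as a bin-packing problem. The sketch below is purely illustrative (it is not Anyscale's scheduler) and uses a first-fit rule over whole GPUs; in Ray, the corresponding user-facing knob is a fractional resource request such as `@ray.remote(num_gpus=0.25)`.

```python
def pack_fractional(requests, num_gpus):
    """First-fit packing of fractional GPU requests onto whole GPUs,
    so several light inference or preprocessing actors share a device.

    Returns, for each GPU, the list of fractions assigned to it, or
    None if the requests don't fit.
    """
    free = [1.0] * num_gpus
    placement = [[] for _ in range(num_gpus)]
    for req in sorted(requests, reverse=True):  # largest first reduces fragmentation
        for i in range(num_gpus):
            if free[i] + 1e-9 >= req:
                free[i] -= req
                placement[i].append(req)
                break
        else:
            return None  # this request doesn't fit on any GPU
    return placement

# Eight quarter-GPU actors plus one half-GPU actor fit on three GPUs.
print(pack_fractional([0.25] * 8 + [0.5], num_gpus=3))
```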

By extending intelligent orchestration to larger systems, Anyscale ensures hardware innovation directly translates into measurable efficiency gains for AI teams operating at scale.

These new platform enhancements build on Anyscale’s continued momentum as AI labs, robotics teams, and enterprises standardize on AI-native computing to improve developer velocity, production resilience, and cost efficiency.

Rack-aware scheduling and NVIDIA cuDF support in Ray Data will be available in open-source Ray and the API-compatible Anyscale Runtime.

About Anyscale

Anyscale, founded by the creators of Ray, is pioneering the era of AI-native computing. Its platform enables developers and enterprises to easily build, run, and scale AI workloads — from multimodal data processing to training and inference — optimized for modern accelerators. With Anyscale, AI teams get the fastest, most reliable Ray experience to power the next generation of AI applications and platforms. Learn more at www.anyscale.com.