
Case Study

Bedrock Robotics Builds Autonomous Construction Systems on Anyscale

Bedrock Robotics powers its full suite of AI workloads, from multimodal data processing to model deployment, with Ray on Anyscale.


85x

scale in compute, from 20K to 1.7M compute hours in under 12 months

40%

cost reduction via reliable use of spot instances

250+

engineering hours reclaimed from manual cluster management

Bedrock Robotics is building advanced autonomous systems for specialized heavy machinery, starting with excavators on construction sites. Their technology, the Bedrock Operator, uses large-scale machine learning models to transform everyday equipment with autonomy, enabling safer jobsites with fewer accidents and lower insurance rates, accelerated project schedules through new efficiencies and less rework, and the operational precision needed to fuel the next industrial revolution. 

Rather than relying on hand-crafted heuristics, Bedrock trains models on large, diverse multimodal datasets collected from active jobsites across the United States, enabling general purpose robotics intelligence that adapts to any heavy machinery.

To build advanced models for autonomous construction, Bedrock needed a compute platform that could handle their end-to-end AI pipeline: from the moment sensor data arrives in the cloud, through processing, labeling, and training, all the way to model deployment at scale. They went all in on Ray on Anyscale as the unifying AI platform.

Challenges

By early 2025, Bedrock had proven their approach to general purpose intelligence, but scaling it required solving the infrastructure problem, not just the modeling problem. To scale both their multimodal data processing pipelines and large-scale training runs, they needed infrastructure that could support it all while keeping their teams focused on innovation, not infra plumbing.

Three challenges stood in the way:

  • Multimodal pipelines that existing tools weren't built for. Bedrock's pipelines don't fit neatly into CPU-only or GPU-only frameworks. A typical pipeline might decode video on GPUs, run reduction on CPUs, then run model inference on GPUs again, spanning object detection, segmentation, VLM calls, and scenario classification. Existing large-scale processing engines like Spark weren't designed for this kind of mixed compute, and self-managing heterogeneous clusters introduced significant operational friction for a team that needed to move quickly.

  • A lean platform team supporting a fast-growing company. In 2024, a 4-person platform team was supporting over 30 developers and researchers, with no bandwidth to manage distributed systems on Kubernetes and the tooling required on top of it, including self-service cluster spin up and priority-aware scheduling for dozens of concurrent job requests. As Bedrock's headcount and ambitions grew, so did the pressure on that team. They needed a platform that could scale with the company without scaling the infrastructure burden with it.

  • Cloud costs that would outpace business growth. Building AI for heavy equipment at scale is expensive by nature. Without a cost-efficient way to orchestrate compute across massive data volumes, cloud spend could grow faster than revenue. The team needed a platform that could maximize spot instance coverage and handle interruptions automatically, without requiring ongoing engineering effort to maintain it.
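The mixed-compute pattern in the first challenge (GPU decode, CPU reduction, GPU inference) can be sketched in plain Python. This is an illustrative simulation, not Bedrock's code: the stage names, the `device` tags, and the toy transforms are all invented; in a real Ray pipeline, each stage's resource tag would map to per-task resource requests so a scheduler can place work on the right node pool.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Stage:
    name: str
    device: str                      # "gpu" or "cpu": where this stage would be placed
    fn: Callable[[list], list]       # the per-batch transform

def run_pipeline(stages: List[Stage], batch: list) -> list:
    """Run each stage in order. A real engine would schedule each
    stage onto a node pool matching its `device` tag instead of
    running everything in one process."""
    for stage in stages:
        batch = stage.fn(batch)
    return batch

# Toy version of the pipeline described above:
# GPU decode -> CPU reduction -> GPU inference.
pipeline = [
    Stage("decode", "gpu", lambda frames: [f * 2 for f in frames]),
    Stage("reduce", "cpu", lambda frames: [f for f in frames if f > 2]),
    Stage("infer",  "gpu", lambda frames: [{"frame": f, "label": "excavator"} for f in frames]),
]

result = run_pipeline(pipeline, [1, 2, 3])
```

The point of the sketch is the shape of the problem: each stage declares its own compute needs, so a single logical pipeline spans heterogeneous hardware, which is exactly what CPU-only or GPU-only frameworks struggle to express.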

"Our workloads don't just use CPU or just GPUs. We get the best setup when we can mix and match throughout the same workload. You have a lot of friction to create and maintain those heterogeneous clusters in other systems. That's something Spark just isn't designed for."

Thomas Pelletier | Head of ML Platform & Infrastructure


The Solution

Bedrock evaluated two paths: build and manage their own Ray platform on Kubernetes, or focus on their differentiated work with a production-ready Ray platform through Anyscale. The decision to use Anyscale came down to focus: invest in what differentiates Bedrock, and partner directly with the team that builds Ray rather than maintaining it themselves. This partnership also meant getting the kind of expert support and co-design that only comes from working with the people who created the technology.

With Anyscale, Bedrock is able to:

  • Scale compute for CPU + GPU multimodal data processing. Managing heterogeneous compute across a multi-stage robotics pipeline was one of Bedrock's core requirements from day one. Anyscale is involved from the moment robot data lands in S3: jobs pick up MCAP files, video streams, and LiDAR packets, index and unpack them, then run them through a tiered labeling pipeline covering heuristics, VLM calls, and model inference before feeding curated data into training loops.

  • Support a 1-to-6 platform-to-researcher ratio with smart scheduling. Bedrock needed Anyscale to handle the infrastructure so their platform team could focus on what mattered. Anyscale's fully managed Ray clusters eliminated time spent on provisioning, maintenance, and failure recovery, and Ray's Python-first programming model meant researchers could tap into distributed compute without deep infrastructure knowledge. By 2025, with 12 platform engineers supporting over 100 people across ML research, data science, and operations, Anyscale's priority-aware Workload Scheduler and built-in observability let users self-serve without escalating to the platform team.

  • Scale spot instance usage to 80% without ongoing engineering effort. Cost efficiency is a core requirement for Bedrock's platform team, and runaway cloud costs are a hard problem for any company building AI at scale. Anyscale's built-in spot instance support allowed the team to run the majority of their fleet on spot instances from day one, with no ongoing engineering effort required. Bedrock also runs jobs across multiple AWS regions to maximize spot availability and take advantage of price differences across regions.
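Running jobs across regions to chase spot availability and price, as in the last point, reduces at its core to a selection problem: find the cheapest region that can actually supply the needed capacity. A hedged sketch follows; the region names, prices, and capacities are made up, and a real system would pull live quotes from the cloud provider.

```python
def pick_region(spot_quotes, required_capacity):
    """Pick the cheapest region with enough spot capacity.
    spot_quotes: {region: (hourly_price, available_instances)}
    Returns a region name, or None if no region qualifies."""
    eligible = [
        (price, region)
        for region, (price, capacity) in spot_quotes.items()
        if capacity >= required_capacity
    ]
    return min(eligible)[1] if eligible else None

# Hypothetical quotes -- real numbers would come from the cloud API.
quotes = {
    "us-east-1": (0.38, 120),
    "us-west-2": (0.31, 80),
    "eu-west-1": (0.35, 200),
}
```

For a job needing 100 instances, `us-west-2` is cheapest but lacks capacity, so the sketch falls through to `eu-west-1`; that fallback behavior is the whole reason multi-region placement improves both availability and price.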

"Managing Kubernetes and distributed compute is hard, and Anyscale keeps solving that, so we don't have to. Every hour we're not rebuilding infrastructure is an hour invested in what actually differentiates Bedrock."
Thomas Pelletier | Head of ML Platform & Infrastructure


End-to-End Pipelines on a Single Platform

Bedrock's pipeline spans every stage of model development: ingesting raw robot sensor data, processing and labeling it, running training loops, and validating models. Each stage uses different compute, different Python and ML frameworks, and different levels of resource intensity. Stitching all of that together on a single platform, without rebuilding orchestration logic from scratch, was a core requirement from day one.

Anyscale is involved from the moment robot data lands in S3. Anyscale Jobs pick up MCAP files, video streams, and LiDAR packets, index and unpack them, then run them through a tiered labeling pipeline covering heuristics and vision-language model (VLM) inference calls before feeding curated data into training loops. From raw sensor data to a validated, packaged model ready for deployment on a robot, everything runs on Ray and Anyscale. In under 12 months, Bedrock scaled from 20,000 to 1.7 million vCPU-hours per month, an 85x increase in compute, all absorbed by the same Anyscale deployment they stood up in their first week using Terraform. What started as a test deployment during a demo site visit is still their production environment today.
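The tiered labeling described above, where cheap heuristics run first and expensive VLM calls only touch what survives, can be sketched as a filter funnel. The tier functions below are invented stand-ins for illustration; in production the heuristic tier would be real signal checks and the VLM tier a model inference call.

```python
def tiered_label(frames, heuristic, vlm_infer):
    """Apply a cheap heuristic tier first, then run the expensive
    VLM tier only on the frames the heuristic kept. This keeps
    costly inference off the bulk of the raw data."""
    kept = [f for f in frames if heuristic(f)]
    return [vlm_infer(f) for f in kept]

# Toy tiers: the heuristic drops near-static frames; the "VLM"
# stand-in attaches a label to whatever survives.
frames = [
    {"id": 1, "motion": 0.9},
    {"id": 2, "motion": 0.01},
    {"id": 3, "motion": 0.7},
]
labeled = tiered_label(
    frames,
    heuristic=lambda f: f["motion"] > 0.1,
    vlm_infer=lambda f: {**f, "label": "digging"},
)
```

The economics follow directly from the funnel shape: if the heuristic tier discards most frames, VLM cost scales with the curated subset rather than the raw data volume.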

"We went from 20,000 vCPU per hour to 1.7 million at the end of the year. Anyscale has been able to just grow and follow us through that massive growth, and we're expecting this to keep skyrocketing."

Thomas Pelletier | Head of ML Platform & Infrastructure


Smarter Scheduling at Scale

In 2025, with 12 platform engineers supporting over 100 people across the company and 60+ active job submitters on any given day, Bedrock needed more than a compute platform. They needed a scheduler that could handle priority-aware job queuing across teams, enforce fairness between competing workloads, and give every user visibility into their job status without requiring the platform team to intervene. The tooling around the platform is the part that's often overlooked: where to see logs, how to list running jobs, how many times a job retried. Building all of that internally wasn't an option for a team of that size, which is why the Anyscale Workload Scheduler became central to how Bedrock manages compute across the organization.

Training jobs queue up and execute when resources are available, with users able to see their position in the queue, understand why a job is waiting, and cancel or reprioritize as needed. When a researcher asks why their job isn't running, the platform team points them directly to the Anyscale Scheduler UI, where job logs, queue position, compute utilization, and preemption history are all surfaced in one place. Bedrock's platform team also published an internal skill that teaches coding agents how to start, monitor, debug, and stop Anyscale jobs autonomously, enabling researchers to complete full iteration loops without human intervention.
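A priority-aware queue with a visible queue position, like the scheduler behavior described above, can be modeled with a heap. This is a conceptual sketch only, not the Anyscale Workload Scheduler's implementation; the class and job names are invented.

```python
import heapq
import itertools

class PriorityJobQueue:
    """Jobs with lower priority numbers run first; ties break by
    submission order. Users can query their 1-based queue position,
    mirroring the 'why is my job waiting' visibility described above."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: submission order

    def submit(self, job_id, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), job_id))

    def position(self, job_id):
        """1-based position of job_id in the run order, or None."""
        for pos, (_, _, jid) in enumerate(sorted(self._heap), start=1):
            if jid == job_id:
                return pos
        return None

    def pop_next(self):
        """Dequeue the next job to run."""
        return heapq.heappop(self._heap)[2] if self._heap else None

q = PriorityJobQueue()
q.submit("train-llm", priority=1)
q.submit("ad-hoc-eval", priority=5)
q.submit("nightly-etl", priority=1)
```

Here `ad-hoc-eval` starts at position 3 despite being submitted second, because both priority-1 jobs outrank it; the submission counter keeps ordering stable between equal priorities.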

"We really strongly believe that the Workload Scheduler is a great opportunity for Anyscale to do a lot of optimizations, solve a lot of problems that we're facing as we scale. And we often have a wish list for it, and that's where a lot of our attention as a team lands."

Thomas Pelletier | Head of ML Platform & Infrastructure


Cost Efficiency Without the Engineering Overhead

Robotics AI is compute-intensive by nature. Bedrock generates massive amounts of data per robot every day, and processing all of it through expensive models isn't economically feasible. Runaway cloud costs are a hard problem in general, and especially so for a company building AI for heavy equipment at scale. Using Ray to orchestrate that pipeline efficiently, with intelligent batching and just-in-time filtering, helps Bedrock keep costs in line while still getting strong performance.

On the infrastructure side, Anyscale's spot instance support has been a major lever. Running 80% of their fleet on spot instances, with minimal configuration effort and no ongoing engineering overhead, has significantly reduced cloud spend. In Q4 2025 alone, Anyscale's autoscaler handled 15,000 spot interruptions automatically across 94,000+ compute hours, with zero manual job recovery work required. That translated into a 40% cost reduction and hundreds of reclaimed engineering hours.
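At its simplest, automatic recovery from a spot interruption means re-running interrupted work on fresh capacity until it completes. The sketch below illustrates that retry loop only; the `SpotInterrupted` exception and the job function are hypothetical, and a real autoscaler would also re-provision nodes and restore from checkpoints between attempts rather than restarting from scratch.

```python
class SpotInterrupted(Exception):
    """Raised when the instance running a job is reclaimed."""

def run_with_spot_recovery(job, max_restarts=10):
    """Re-run `job` until it completes or the restart budget is
    exhausted. `job` receives the attempt number so a real
    implementation could resume from its latest checkpoint."""
    for attempt in range(max_restarts + 1):
        try:
            return job(attempt)
        except SpotInterrupted:
            continue  # capacity reclaimed: retry on a fresh instance
    raise RuntimeError("restart budget exhausted")

# Toy job that is interrupted twice before succeeding.
def flaky_job(attempt):
    if attempt < 2:
        raise SpotInterrupted()
    return "model-checkpoint-final"
```

Handling this loop in the platform, rather than in every team's job code, is what lets thousands of interruptions per quarter require zero manual recovery work.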

"We've been running 70–80% spot instance usage across our fleet with no effort on our part. That's pretty incredible from a feature perspective."

Thomas Pelletier | Head of ML Platform & Infrastructure


What's Next

Bedrock is expanding across cloud regions and evaluating multi-cloud options, driven by GPU capacity availability and spot price economics. As the company grows its fleet of autonomous machines, data volumes, model complexity, and compute requirements will all continue to grow with it, and Anyscale will be the platform that scales with them cost-efficiently.

"Anyscale is in a unique position to really tap into the whole vertical of the modern AI and robotics platform. You have the full stack from raw compute all the way to model deployment. It's rare to have a company with such a unique position."

Thomas Pelletier | Head of ML Platform & Infrastructure


"In less than a year, we scaled from 20,000 - 1.7M compute hours per month. Achieving that scale efficiently required Anyscale-powered spot instance usage (80% of our workloads) and the Anyscale Workload Scheduler that manages our job queues across all of research and engineering."

Thomas Pelletier

Head of ML Platform & Infrastructure, Bedrock Robotics
