How Canva Built a Modern AI Platform Using Anyscale

Fulfilling Canva’s AI vision with no limits on scale.

  • 12x model evaluation
  • 100% GPU utilization
  • 50% reduction in cloud costs

“With Anyscale, we have no ceiling on scale, and an incredible opportunity to bring AI features and value to our 170 million users”

Greg Roodt

ML Lead, Canva

Overview

Canva has more than 100 machine learning models in production, powering its products as well as internal operations. Canva’s business is large and growing fast, with more than 170M monthly active users across 190 countries, producing over 20B designs to date.
Canva was early in seizing the AI opportunity, which has helped accelerate its growth over the last year. For Canva, the rapid advancements in AI have given the company the opportunity to reimagine the design process.

AI powers critical offerings like Magic Studio, a comprehensive AI design platform with features such as Magic Switch, which intelligently resizes designs into different specs, and Magic Write, which creates a first draft of copy from a single text prompt. Users have been quick to adopt these features, using them more than 4 billion times already.

Canva also operates non-generative AI workloads, including recommendation engines, image classifiers, and more. With over 100 machine learning models already deployed, Canva saw an opportunity to implement an AI platform to provide a scalable and future-proof foundation for both current and future workloads.

The Challenge

In an era where speed is essential, and quality and innovation can’t be compromised, distributed training was an especially critical capability for the Canva team. 

The team faced challenges scaling its prior solution. Constrained to a single machine, they struggled to process millions of images in a timely and cost-efficient manner. Training on a content library the size of Canva’s requires distributed training across multiple machines, which introduces significant operational complexity.

Beyond that, Canva needed a solution to help power its growing suite of embedded AI tools, both generative and non-generative. Any new solution had to address today’s needs while being future-proofed for Canva’s innovation velocity and AI roadmap.

Our Solution

After a rigorous 3-month evaluation of an array of solutions, Canva chose Anyscale as a key partner to scale its next-generation AI platform. “Anyscale was the obvious choice,” said Greg Roodt, Head of Data Platforms at Canva. “We also wanted the flexibility of not being locked into a single cloud.” Canva’s Anyscale-powered AI platform now delivers:
  • A shared platform for product teams. Product and Operations teams can develop and deploy their own models using the platform, without assistance from the platform team.
  • Distributed training at scale. Canva now easily trains generative and non-generative models across 80 AWS EC2 GPUs with maximized connectivity via AWS' Elastic Fabric Adapter technology, while elastically scaling data ingest and preprocessing across CPU machines.
  • Optionality. Anyscale’s open platform architecture and ecosystem enable Canva to take advantage of different models, deep learning frameworks, accelerators, data stores, or clouds based on its needs.
  • Control. Training their own models gives the Canva team complete control over model breadth and quality, training data, model evolution, and more.
  • Integration. The Canva team can seamlessly scale data ingest from any data source, including Amazon S3 object storage and Snowflake.
“Anyscale’s open source core matters to us. There’s a community. You can see the code. And we have an automatic exit strategy if we need one, which isn’t possible with proprietary software,” said Roodt.

The Impact

Canva’s Anyscale deployment has delivered significant improvements in training speed and hardware efficiency:

  • Up to 12x faster innovation velocity: many models now train 4-6x faster. In one image classification example, training is 12x faster; processing a single epoch previously took 90 minutes and now takes 6 minutes.

  • Hardware efficiency: Canva’s GPUs are now fully saturated at peak load, with no bottlenecks on data ingestion.
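
The idea behind keeping accelerators saturated is to run data preprocessing concurrently with training, so that a prepared batch is always waiting when a training step finishes. The Python sketch below illustrates that general pattern using only the standard library. It is not Canva's or Anyscale's implementation: the names `preprocess`, `train_step`, and `run_pipeline`, the worker count, the queue size, and the simulated timings are all hypothetical, and threads on one machine stand in for what would be separate CPU machines feeding GPUs in a real deployment.

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def preprocess(item: int) -> int:
    """Hypothetical stand-in for CPU-side preparation of one batch."""
    time.sleep(0.01)  # simulated preprocessing cost
    return item * 2


def train_step(batch: int) -> int:
    """Hypothetical stand-in for one accelerator training step."""
    time.sleep(0.005)  # simulated compute cost
    return batch


def run_pipeline(num_batches: int, num_workers: int = 4, prefetch: int = 8) -> list:
    """Overlap preprocessing with training through a bounded prefetch queue."""
    ready = queue.Queue(maxsize=prefetch)
    sentinel = object()  # marks the end of the stream

    def producer() -> None:
        with ThreadPoolExecutor(max_workers=num_workers) as pool:
            # map() yields results in input order while workers run concurrently.
            for batch in pool.map(preprocess, range(num_batches)):
                ready.put(batch)  # blocks while the queue is full
        ready.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()

    results = []
    while True:
        batch = ready.get()  # the consumer waits only if no batch is ready
        if batch is sentinel:
            break
        results.append(train_step(batch))
    return results


if __name__ == "__main__":
    print(run_pipeline(5))
```

In this sketch, adding preprocessing workers shortens the time the training loop spends waiting on `ready.get()`, which is the single-machine analogue of scaling ingest independently of the GPUs.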

“We've been able to innovate quickly and successfully, and user feedback is very positive,” said Roodt. 

More Innovation Ahead

In its desire to deliver the most powerful, comprehensive AI offering in one streamlined platform, Canva plans to continuously expand and improve its generative AI capabilities, both with its own models and partner LLMs. 

The Canva team is in the process of standardizing all AI training on Anyscale for maximum efficiency and velocity while minimizing costs. “We want to shorten the time from training a model to shipping it so we can continuously improve the user experience, and Anyscale is making that possible.”

Canva is constantly evolving its AI platform with ambitious plans for the future. “Our new platform gives us optionality everywhere. Nothing’s off the table and our nimble approach means we can evolve as AI does,” said Roodt. “We have no ceiling on scale, and an incredible opportunity to bring AI features and value to our 170 million users.”