How Canva Built a Modern AI Platform Using Anyscale

TL;DR

Canva is an online visual communication platform with a mission to empower everyone in the world to design anything and publish anywhere. Canva’s market-leading visual communication platform provides a wide range of AI-powered features and services that have helped it grow rapidly to more than $2B USD in revenue. With Anyscale, Canva can test and evolve new machine learning models up to 12x faster than previously possible, scaling from 4 GPUs to 80 GPUs while achieving nearly 100% GPU utilization and reducing cloud costs by 50%. Canva is fulfilling its vision of accelerating AI development with no limits on scale. In the words of its team, “we wanted to go fast, and we wanted to go far.”

Overview

Canva has more than 100 machine learning models in production, powering its products as well as internal operations. Canva’s business is large and growing fast, with more than 170M monthly active users across 190 countries, producing over 20B designs to date.

Canva was early in seizing the AI opportunity, which has helped accelerate its growth over the last year. Rapid advancements in AI have given the company the opportunity to reimagine the design process.

AI powers critical offerings like Magic Studio, a comprehensive AI design platform with features such as Magic Switch, which intelligently resizes designs to different specs, and Magic Write, which creates a first draft of copy from a single text prompt. Users have been quick to adopt these features, using them more than 4 billion times already.

Canva Magic Switch Uses Generative AI to Transform Designs to Alternate Sizes and Formats

Canva also operates non-generative AI workloads, including recommendation engines, image classifiers, and more. With over 100 machine learning models already deployed, Canva saw an opportunity to implement an AI platform to provide a scalable and future-proof foundation for both current and future workloads.

The Challenge

In an era where speed is essential, and quality and innovation can’t be compromised, distributed training was an especially critical capability for the Canva team. 

The team faced challenges scaling its prior solution. Constrained to a single machine, they struggled to process millions of images quickly and cost-effectively. Training on a content library the size of Canva’s requires distributed training across multiple machines, which introduces significant operational complexity.

Beyond that, Canva needed a solution to help power its growing suite of embedded AI tools, both generative and non-generative. Any new solution had to address today’s needs while being future-proofed for Canva’s innovation velocity and AI roadmap.

The Solution

After a rigorous 3-month evaluation of an array of solutions, Canva chose Anyscale as a key partner to scale its next-generation AI platform. “Anyscale was the obvious choice,” said Greg Roodt, Head of Data Platforms at Canva. “We also wanted the flexibility of not being locked into a single cloud.” 

Canva’s Anyscale-powered AI platform now delivers:

  • A shared platform for product teams. Product and Operations teams can develop and deploy their own models using the platform, without assistance from the platform team.

  • Distributed training at scale. Canva now easily trains generative and non-generative models across 80 GPUs while elastically scaling data ingest and preprocessing across CPU machines.

  • Optionality. Anyscale’s open platform architecture and ecosystem enable Canva to take advantage of different models, deep learning frameworks, accelerators, data stores, or clouds based on its needs.

  • Control. Training their own models gives the Canva team complete control over model breadth and quality, training data, model evolution, and more.

  • Integration. The Canva team can seamlessly scale data ingest from any data source including Amazon S3 and Snowflake. 
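The distributed-training point above combines two ideas: the dataset is split across training workers, and preprocessing runs on separate CPU workers so the accelerators are not left waiting for data. As a rough illustration only, and not Canva's or Anyscale's actual implementation, the pattern can be sketched with the Python standard library. Every name here (`shard`, `preprocess`, `batches`, `train_worker`) is hypothetical, the "samples" are plain integers standing in for images, and the workers run one after another rather than on separate GPU machines.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List


def shard(items: List[int], num_workers: int) -> List[List[int]]:
    """Split items round-robin so each training worker gets a near-equal share."""
    return [items[rank::num_workers] for rank in range(num_workers)]


def preprocess(sample: int) -> float:
    """Hypothetical stand-in for CPU work such as decoding or resizing an image."""
    return sample / 255.0


def batches(samples: List[int], batch_size: int, cpu_workers: int) -> Iterator[List[float]]:
    """Preprocess samples on a pool of CPU workers and yield batches in order."""
    with ThreadPoolExecutor(max_workers=cpu_workers) as pool:
        batch: List[float] = []
        # pool.map returns results in input order as they become available.
        for value in pool.map(preprocess, samples):
            batch.append(value)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:  # final partial batch
            yield batch


def train_worker(samples: List[int], batch_size: int = 4, cpu_workers: int = 2) -> int:
    """Consume batches for one shard; returns how many samples were seen."""
    seen = 0
    for batch in batches(samples, batch_size, cpu_workers):
        # A real worker would run a forward and backward pass on the batch here.
        seen += len(batch)
    return seen


if __name__ == "__main__":
    dataset = list(range(20))           # stand-in for a list of image references
    shards = shard(dataset, num_workers=4)
    # In a real cluster each shard would go to a worker on its own GPU.
    print([train_worker(s) for s in shards])
```

In a production setting a distributed framework would place each shard on its own machine and scale the preprocessing pool independently of the number of training workers; the sketch only shows how the two responsibilities are separated.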

“Anyscale’s open source core matters to us. There’s a community. You can see the code. And we have an automatic exit strategy if we need one, which isn’t possible with proprietary software,” said Roodt.

Canva Magic Write Uses Generative AI to Generate Copy from a Text Prompt

The Impact

Canva has seen a positive impact from its Anyscale deployment, reflecting significant improvements along many dimensions.

  • Up to 12x faster innovation velocity - many models now train 4-6x faster. In one image classification example, training is 12x faster: processing a single epoch previously took 90 minutes and now takes 6 minutes.

  • Hardware efficiency - Canva’s GPUs are now fully saturated at peak load, with no bottlenecks on data ingestion.

“We've been able to innovate quickly and successfully, and user feedback is very positive,” said Roodt. 

More Innovation Ahead

In its desire to deliver the most powerful, comprehensive AI offering in one streamlined platform, Canva plans to continuously expand and improve its generative AI capabilities, both with its own models and partner LLMs. 

The Canva team is in the process of standardizing all AI training on Anyscale for maximum efficiency and velocity while minimizing costs. “We want to shorten the time from training a model to shipping it so we can continuously improve the user experience, and Anyscale is making that possible.”

Canva is constantly evolving its AI platform with ambitious plans for the future. “Our new platform gives us optionality everywhere. Nothing’s off the table and our nimble approach means we can evolve as AI does,” said Roodt. “We have no ceiling on scale, and an incredible opportunity to bring AI features and value to our 170 million users.”