Natural Language Processing with Ray

Achieving state-of-the-art natural language processing requires compute at unprecedented scale.

Quickly train deep learning language models at any scale

Beyond data, the compute needed to properly train, tune, and serve an effective NLP model can be massive, and it is growing by more than 5x every year.

Consider that RoBERTa, a robust technique to pretrain NLP models, uses no fewer than 17 hyperparameters. Assuming a minimum of two values per hyperparameter, the search space contains more than 130K configurations (2^17 = 131,072). Exploring a space of that size, even partially, can require vast computational resources.
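The arithmetic behind that figure can be checked with a short sketch. It assumes a plain grid over 17 hyperparameters with exactly two candidate values each. The function names and the 0/1 placeholder values are hypothetical and are not taken from RoBERTa or from any Ray API.

```python
from itertools import islice, product

# Assumptions for illustration only: 17 hyperparameters, two candidate values each.
NUM_HYPERPARAMETERS = 17
VALUES_PER_HYPERPARAMETER = 2


def grid_size(num_params: int, values_per_param: int) -> int:
    """Number of configurations in a full grid search."""
    return values_per_param ** num_params


def iter_grid(num_params: int, values_per_param: int):
    """Lazily yield every configuration as a tuple of value indices."""
    return product(range(values_per_param), repeat=num_params)


total = grid_size(NUM_HYPERPARAMETERS, VALUES_PER_HYPERPARAMETER)

# Peek at the first few configurations without materializing the whole grid.
first_three = list(islice(iter_grid(NUM_HYPERPARAMETERS, VALUES_PER_HYPERPARAMETER), 3))

print(f"{total:,} configurations")
```

Each additional hyperparameter doubles the grid under these assumptions, and allowing three values instead of two would multiply each factor further, which is why exhaustive search quickly becomes impractical on a single machine.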

This is why a scalable compute platform is necessary: it makes it practical to tune NLP models thoroughly enough to deliver better, more efficient results.

The right way to scale for NLP

Train, test, deploy, serve and monitor machine learning models efficiently and quickly with Ray and Anyscale.

  • Scale with a click

    Rely on a robust infrastructure that can scale up machine learning workflows as needed. Scale everything from XGBoost to Python to TensorFlow to Scikit-learn on top of Ray.

  • An open, broad ecosystem

    Get access to the most up-to-date technologies and their communities, with no limits on the libraries or packages you can use for your models. Load data from Snowflake, Databricks, or S3. Track your experiments with Weights & Biases or MLflow. Monitor your production services with Grafana.

“We are using spacy.io to process our NLP pipeline. What would take 40 hours to process 15M documents now takes 3 hours using Anyscale. We did not consider other solutions such as databricks as it would require substantial refactoring and a steep learning curve.”

NLP startup

CTO

“We were able to replace our complex setup using Celery and containers to now just using Ray and Anyscale. Anyscale allows us to auto-scale, reduce our Ops need without major changes to our legacy code.”

Legal analytics company

MLE

The magic of Merlin

Learn how Shopify granted its machine learning platform Merlin the magic to help data scientists and ML engineers streamline their machine learning workflows.

Already using open source Ray?

Experience the power of infinite scale and eliminate complex production operations.