Home BlogBlog Detail

Anyscale Endpoints: JSON Mode, Function calling, New models: Llama Guard and Mistral-7B-OpenOrca

By Endpoints Team | December 12, 2023

Anyscale Endpoints is the first LLM APIs providing a wide range of capabilities to empower developers to build their applications not just from serving and fine tuning LLMs, but also leveraging embedding services and function calling.

Update June 2024: Anyscale Endpoints (Anyscale's LLM API Offering) and Private Endpoints (self-hosted LLMs) are now available as part of the Anyscale Platform. Click here to get started on the Anyscale platform.

Check out the cookbook for examples and common recipes.

LinkJSON Mode and Function Calling with Mistral 7B (public preview)

Unlock the potential of function calling, enabling integration with external tools, APIs, databases, and other LLMs. The JSON mode feature enables interaction with other tools via structured output data.

LinkModel: Llama Guard Model (Public Preview)

Meta’s newest model Llama Guard ensures that your prompts and responses are safe by providing content moderation and guardrails. This model is based on Llama2 and you can get started by clicking here.

LinkModel: Mistral-7B-OpenOrca

We are adding support for Mistral-7B-OpenOrca. This Mistral based 7B mode scores better overall than all other models below 30B. This model demonstrated 98% of Llama-70B-chat's performance. You can get stated by clicking here.

Previous announcements:
Anyscale Endpoints: Embedding endpoint, Llama-2 70B fine-tuning and improved sign-up experience

JSON Mode and Function Calling with Mistral 7B (public preview)
Model: Llama Guard Model (Public Preview)
Model: Mistral-7B-OpenOrca

Sharing

Sign up for product updates

Ray on Alibaba Cloud: Building an ML Platform

An Open Source Stack for AI Compute: Kubernetes + Ray + PyTorch + vLLM

Building Scalable RAG Pipelines with Ray and Anyscale

Ready to try Anyscale?

Access Anyscale today to see how companies using Anyscale and Ray benefit from rapid time-to-market and faster iterations across the entire AI lifecycle.