We are excited to announce a collaboration between Meta and Anyscale to bolster the Llama ecosystem. Learn more from Joe Spisak at Ray Summit.
Llama, and Llama-2 specifically, is a family of LLMs publicly released by Meta, ranging from 7B to 70B parameters, that outperforms other open source language models on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests. However, Llama-2 is far more than just a suite of models. It’s a platform that the AI community has embraced, with hundreds of derivatives and an emerging ecosystem across academia and industry.
Anyscale provides seamless access to the Llama-2 models via Anyscale Endpoints, an OpenAI-compatible LLM inference API for open models. Anyscale Endpoints enables AI application developers to easily swap closed models for the Llama-2 models or to use open models along with closed models in the same application.
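Because the API is OpenAI-compatible, swapping a closed model for Llama-2 amounts to changing the model name (and pointing the client at the Anyscale base URL). The sketch below builds an OpenAI-style chat-completion request body to illustrate the swap; the base URL and model identifier are assumptions for illustration, so check Anyscale's documentation for the exact values.

```python
import json

# Assumed Anyscale Endpoints base URL -- illustrative only;
# consult the Anyscale Endpoints docs for the real value.
API_BASE = "https://api.endpoints.anyscale.com/v1"


def chat_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


# Swapping a closed model for an open one is a one-line change:
closed = chat_payload("gpt-4", "Summarize this support ticket.")
open_model = chat_payload(
    "meta-llama/Llama-2-70b-chat-hf",  # assumed model identifier
    "Summarize this support ticket.",
)
print(json.dumps(open_model, indent=2))
```

The same request body can then be sent to `API_BASE` with any OpenAI-compatible client, which is what lets an application mix open and closed models behind one interface.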
The Llama models, along with Anyscale Endpoints, promise to bring high-quality cost-efficient LLM inference to a broad range of application developers.
Cost efficiency is top of mind for many AI application builders. For applications that do not require large and expensive general-purpose models, smaller fine-tuned models offer a promising path to cost-efficient LLM inference. Recent work on LLM fine-tuning shows that even the smallest Llama-2 model can outperform GPT-4 when fine-tuned on some problems, such as SQL query generation. While the degree of the benefit is problem dependent, fine-tuning helps across the board and suggests that fine-tuning will play an important role in improving model quality while maintaining speed and cost efficiency.
Joe Spisak, who leads the work on generative AI open source and the Llama models at Meta, will be sharing more at the Ray Summit, September 18-20 in San Francisco. Session details are here. We look forward to seeing you in SF!