Meta has announced Llama 3, releasing the first two models in its next generation of Llama language models. The release features pretrained and instruction-fine-tuned models at 8B and 70B parameters.
To develop Llama 3, Meta says it focused on four key aspects: model architecture, pretraining data, scaling up pretraining, and instruction fine-tuning. The models use a standard decoder-only transformer architecture with a 128K token vocabulary and are pretrained on over 15T tokens collected from publicly available sources.
Meta has also developed trust and safety tools, including updated versions of Llama Guard 2 and Cybersec Eval 2, and the introduction of Code Shield, an inference-time guardrail for filtering insecure code produced by LLMs.
The Llama 3 models will be available on major platforms, including cloud providers and model API providers like AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, Nvidia NIM, and Snowflake, Meta said in a statement. Hardware platforms supporting Llama 3 include AMD, AWS, Dell, Intel, Nvidia, and Qualcomm.
Meta’s largest models, currently in training, have over 400B parameters and are expected to offer new capabilities such as multimodality, multilingual conversation, and longer context windows.
Meta has integrated the Llama 3 models into its AI assistant, Meta AI, which is now available in more countries across its apps, including Facebook, Instagram, WhatsApp, Messenger, and the web. The models can be downloaded from the Llama 3 website, and a Getting Started Guide is available for reference.
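For readers who want to try the released checkpoints, a minimal sketch using the Hugging Face `transformers` library might look like the following. The repository id below is an assumption based on Meta's Hugging Face organization, and access requires accepting the Llama 3 license on the Hub; this is an illustrative sketch, not an official quickstart.

```python
# Hypothetical sketch: querying the instruction-tuned Llama 3 8B model
# via Hugging Face `transformers`. The repo id is an assumption; gated
# access must be granted on the Hub before the weights can be fetched.
MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return a text completion for `prompt`."""
    # Import lazily so the sketch can be read and imported without
    # the large model dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads ~16 GB of weights on first run for the 8B model.
    print(generate("Write a haiku about open models."))
```

The same pattern applies to the 70B checkpoint by swapping the repository id, though that model is generally run on multi-GPU hardware.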
[Image courtesy: Meta]
Updated: April 23, 2024