Meta Llama 3 AI Models With 8B and 70B Parameters Launched, Said to Outperform Google’s Gemini 1.5 Pro

Meta introduced the next generation of its artificial intelligence (AI) models, Llama 3 8B and 70B, on Thursday. Short for Large Language Model Meta AI, Llama 3 comes with improved capabilities over its predecessor. The company also adopted new training methods to optimise the efficiency of the models. Interestingly, the largest Llama 2 model had 70 billion parameters, but this time the company said its largest models will contain more than 400 billion parameters. Notably, a report last week said that Meta would unveil its smaller AI models in April and its larger models later in the summer.

Those interested in trying out the new AI models are in luck, as Meta is taking a community-first approach with Llama 3. The new foundation models will be open source, just like the previous models. Meta stated in its blog post, “Llama 3 models will soon be available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, and with support from hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.”

The list covers most major cloud, hosting, and hardware platforms, which should make it easier for enthusiasts to get their hands on the AI models. Meta has also integrated Llama 3 into its own Meta AI assistant, which can be accessed via Facebook Messenger, Instagram, and WhatsApp in supported countries.

Coming to performance, the social media giant shared benchmark scores of Llama 3 for both its pre-trained and instruct models. For reference, a pre-trained model is the base model produced by large-scale training on text, whereas an instruct model is that base model fine-tuned to follow instructions and hold conversations. The pre-trained Llama 3 70B model outscored Google’s Gemini 1.0 Pro in the MMLU (79.5 vs 71.8), BIG-Bench Hard (81.3 vs 75.0), and DROP (79.7 vs 74.1) benchmarks, whereas the 70B Instruct model outscored the Gemini 1.5 Pro model in the MMLU, HumanEval, and GSM-8K benchmarks, based on data shared by the company.

Meta has opted for a decoder-only transformer architecture for the new AI models but has made several improvements over the predecessor. Llama 3 now uses a tokeniser with a vocabulary of 128K tokens, and the company has adopted grouped query attention (GQA) to improve inference efficiency. In GQA, several query heads share a single set of key and value heads, which reduces the amount of key and value data the model has to store and read while generating text. The social media giant has pre-trained the models on more than 15T tokens, which it claims to have sourced from publicly available data.
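
To give a rough sense of where the inference-efficiency gain from grouped query attention comes from, here is a minimal Python sketch. It compares the size of the key/value cache when every query head has its own key/value head with the size when groups of query heads share one. The helper names and all the numbers (layer count, head counts, head size, sequence length, 2 bytes per value) are illustrative assumptions, not figures from Meta's announcement.

```python
def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Return the index of the key/value head that a query head shares."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be a multiple of num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size in bytes of the key/value cache for a single sequence.

    The leading factor of 2 counts one tensor for keys and one for values.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative values only (assumptions, not taken from the article).
NUM_LAYERS = 32
NUM_QUERY_HEADS = 32
NUM_KV_HEADS = 8      # grouped query attention: 4 query heads per key/value head
HEAD_DIM = 128
SEQ_LEN = 8192

# Without grouping, every query head keeps its own keys and values.
full_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# With grouping, only NUM_KV_HEADS sets of keys and values are stored.
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)

# Which key/value head each query head reads from.
mapping = [
    kv_head_for_query_head(h, NUM_QUERY_HEADS, NUM_KV_HEADS)
    for h in range(NUM_QUERY_HEADS)
]

print(f"Cache without grouping: {full_bytes / 1024**3:.1f} GiB")
print(f"Cache with grouping:    {gqa_bytes / 1024**3:.1f} GiB")
print(f"Query heads 0-7 use key/value heads: {mapping[:8]}")
```

Under these assumed numbers, the cache shrinks in proportion to the ratio of query heads to key/value heads, which is the saving that makes long prompts cheaper to serve.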

