Llama on a field.
Editorial

Meta’s AI Trilogy: Llama 3.1 Packs a 405B Punch

3 minute read
Pierre DeBois avatar
By
SAVED
Meta’s Llama 3.1 is here, and it’s a 405B behemoth. See how it could change the AI game.

The Gist

  • Factuality focus. Meta designed Llama 3.1 to reduce AI hallucinations, enhancing the model's reliability.
  • Steerability upgrade. Llama 3.1 offers developers more control, ensuring AI models meet specific needs.
  • Token capacity boost. Expanded 128k tokens in Llama 3.1 improve document handling and response precision.

Meta has worked to keep the darting AI eyes on its AI model Llama. I previously reported on Llama 2. I also followed the LLama story as Meta announced LLama 3 in April, featuring two variants, the 8B and 70B models.

Well, everyone loves a movie trilogy. Meta’s “third movie” comes in the form of Llama 3.1, a follow-up AI model offering the capability and features promised in Llama 2 development testing. The story includes the most significant upgrade in Llama, the launch of 405B. Lllama 3.1 405B is the largest model ever released, placing Meta in a unique leading position against Google’s Gemini and OpenAI’s ChatGPT.

What Meta Brought to Llama 3’s Development

Think of 405B as an analogy to a big block V-8 in a small compact car. Meta’s blog proclaims Llama 3.1 405B as being in “a class of its own, with unmatched flexibility, control, and state-of-the-art capabilities that rival the best closed-source models. Our new model will enable the community to unlock new workflows, such as synthetic data generation and model distillation.”

Performance improvements were added to all the LLama 3.1 models so that developers can experience a similar model quality regardless of the size chosen.

Meta released a white paper explaining how it created the improvements in Llama 3.1 and tested it against other foundation models.

All of the Llama 3.1 models received the same training methodology. 

LLama 3.1 was trained on a dataset of web data, with great care applied for deduplication and for removing Personal Identifiable Information (PII). Special efforts were made to ensure performance characteristics such as math and reasoning, code creation and multilingualism were robust.

Related Article: Did Meta Just Top ChatGPT With Its Release of Llama 3's Meta AI?

Enhancing Factuality and Steerability for Marketers Building Custom Models

The Meta research team also focused on two model characteristics, factuality and steerability, that marketers planning to build an AI model should pay particular attention. Marketers should also note Llama 3.1 comes with expanded tokens.

Meta’s Approach to Reducing AI Hallucinations through Factuality

Factuality is meant to reduce the possibility of hallucinations. Meta approached this by developing a knowledge probing technique with a bias toward what the model “knows.” The post-training was designed such that the model aligns with relying on “knowing what it knows” before seeking additional information to refine a response. This is a kind of sanity check before adding information that changes a good response into one containing a hallucinated truth.

How Meta Enhances Steerability for Custom AI Model Development

Steerability is the ability to direct a model’s actions and outcomes so that they match to a given developer and user specifications. In simple terms, it means the foundation model in Llama should act as a basic template for the training data developers add. The added training data instructs the model for the developer’s purpose, much like a training manual instructs a smart but unskilled employee who is ready to work. Focusing on steerability was Meta’s way to ensure that Llama 3.1 is “maximally steerable” to serve the different downstream use cases for which developers want to build AI models.

Expanded Token Capacity Enhances Llama 3.1’s Precision

Responding to the feedback on version 3, Meta added an expanded 128k token capacity, which increases the ability to manage documents and media alongside prompts and craft precise responses to those prompts.

The white paper also reveals evaluation results of comparisons between Llama3 and various other models of comparable sizes. The results indicated better model performance in a number of benchmark categories such as code creation, commonsense reasoning, long context, problem-solving and adversarial metrics. The pre-training results, for example, reveal that Llama 3 8B outperforms competing models in virtually every category.

Learning Opportunities

Related Article: How Meta's Llama 2 Shifts Marketing's Relationship With AI

The AI World Opens up for Developers and Marketers

Llama 3.1 can spark imagination for what use cases AI can be applied and explored. Llama 3.1 is available on a number of AI development platforms, such as Groq, HuggingChat and AWS. This gives developers a number of options to explore model development according to their needs. Users can explore analysis that leads to insights by uploading a dataset, analyzing it and prompting graphs that can explain the discovered insights. Multilingual applications can help translate large documents quickly.

Meta believes the Llama 3.1 release will usher in a new ecosystem of developers who can create AI applications. This potential also appeals to marketing managers who want to develop customer-centric AI solutions but want their development based upon a reliable framework. Given that Meta has contributed to many popular open-source programming language frameworks such as React, the choice to open-source Llama is the next chapter of that contribution policy.

The arrival of open-source availability is a chapter marketers should read very closely in the near future.

fa-solid fa-hand-paper Learn how you can join our contributor community.

About the Author
Pierre DeBois

Pierre DeBois is the founder and CEO of Zimana, an analytics services firm that helps organizations achieve improvements in marketing, website development, and business operations. Zimana has provided analysis services using Google Analytics, R Programming, Python, JavaScript and other technologies where data and metrics abide. Connect with Pierre DeBois:

Main image: andreanord80
Featured Research