Llama Family
The Family for Llama Models, Technologies, and Enthusiasts
Let’s Build the Llama Community Together!
Partner
Mission
Promoting the Development of Artificial General Intelligence through Open Source
The open-sourcing of the Llama model has significantly accelerated the advancement of large-model technology. We are dedicated to building an open platform where developers and technology enthusiasts from all backgrounds can collaborate to create an open-source ecosystem for Llama. From large models to small ones, from text to multimodal capabilities, and from software to hardware algorithm optimizations, we hope that open source can bring the benefits of AI to all of humanity. In this era of technological explosion, join the Llama Family, progress alongside the technology, move forward with the community, and together stride towards AGI (Artificial General Intelligence)!
Activity
Compute
GPU Source
GeForce RTX 30 Series
GeForce RTX 40 Series (coming soon)
NVIDIA H100 Tensor Core GPU (coming soon)
NVIDIA A100 Tensor Core GPU (coming soon)
...
Computing
Cooperation
Model
Meta Llama
The Llama models, open-sourced by Meta, are currently the most widely used large models in both industry and academia. Llama 2 was trained on 2.0T tokens, with parameter sizes of 7B, 13B, 34B, and 70B; Llama 3 was trained on 15.0T tokens, with parameter sizes of 8B, 70B, and 405B. Both base models and instruction-fine-tuned models are available.
Model: LLaMA
Training Data: English CommonCrawl, C4, GitHub, Wikipedia, Gutenberg and Books3, ArXiv, Stack Exchange
Params (Tokens): 7B (1.0T), 13B (1.0T), 33B (1.4T), 65B (1.4T)
Model: Llama 2
Training Data: A new mix of publicly available online data
Params (Tokens): 7B (2.0T), 13B (2.0T), 34B (2.0T), 70B (2.0T)
Model: Llama 3
Training Data: Collected from publicly available sources; over 5% of the Llama 3 pretraining dataset consists of high-quality non-English data covering over 30 languages
Params (Tokens): 8B (15.0T), 70B (15.0T), 405B (15.0T)
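To make the scale of these figures easier to compare, the following is a minimal Python sketch that computes how many training tokens each model saw per parameter. The parameter and token counts are copied from the figures listed above; the names `LLAMA_MODELS`, `tokens_per_parameter`, and `summarize` are hypothetical helpers introduced only for this illustration and are not part of any Llama tooling.

```python
# Parameter counts (in billions) mapped to training tokens (in trillions),
# copied from the figures listed above. The dictionary layout is an
# illustrative assumption, not an official data format.
LLAMA_MODELS = {
    "LLaMA": {7: 1.0, 13: 1.0, 33: 1.4, 65: 1.4},
    "Llama 2": {7: 2.0, 13: 2.0, 34: 2.0, 70: 2.0},
    "Llama 3": {8: 15.0, 70: 15.0, 405: 15.0},
}


def tokens_per_parameter(params_billions: float, tokens_trillions: float) -> float:
    """Return the number of training tokens per model parameter."""
    return (tokens_trillions * 1e12) / (params_billions * 1e9)


def summarize(models=LLAMA_MODELS):
    """Return (family, params in billions, tokens per parameter) rows."""
    rows = []
    for family, sizes in models.items():
        for params_b, tokens_t in sizes.items():
            ratio = round(tokens_per_parameter(params_b, tokens_t), 1)
            rows.append((family, params_b, ratio))
    return rows


if __name__ == "__main__":
    for family, params_b, ratio in summarize():
        print(f"{family:8s} {params_b:>4}B  {ratio:>7.1f} tokens/param")
```

Under these figures, the smaller models in each generation see far more tokens per parameter than the larger ones, and the ratio grows substantially from LLaMA to Llama 3.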
Code Llama
Code Llama is trained on top of Llama 2 using code data and comes in three variants: Base Model, Python Model, and Instruct Model, with parameter sizes of 7B, 13B, 34B, and 70B. It supports code completion, code infilling, and instruction-based programming.
Model: Code Llama
Training Data: Based on Llama 2, trained using a public code dataset of 500B tokens. To help the model retain natural language understanding skills, 8% of the sample data comes from natural language datasets related to code, containing discussions about code and snippets within natural language questions or answers.
Params: 7B, 13B, 34B, 70B
Type: Base Model, a foundational model for code generation tasks
Type: Python, a version specialized for Python
Type: Instruct, a fine-tuned version with human instructions and self-instruct code synthesis data
Atom
Atom, developed jointly by AtomEcho and Llama Family, is based on the Llama architecture. It is trained on 2.7T tokens of Chinese and multilingual corpora, with parameter sizes of 1B, 7B, and 13B. Atom significantly enhances the Chinese language capabilities of the Llama model.
Model: Atom
Training Data: Chinese and multilingual encyclopedias, books, blogs, news, novels, financial data, legal data, medical data, code, papers, Chinese natural language processing competition datasets, etc.
Params (Tokens): 1B (2.7T), 7B (2.7T), 13B (2.7T)