
Llama 3.1 8B

Parameters: 8B
Context length: 128K (131,072 tokens)
Modality: Text
Architecture: Dense
License: Llama 3.1 Community License
Release date: 23 Jul 2024
Knowledge cutoff: Dec 2023

Technical Specifications

Attention structure: Grouped-Query Attention
Hidden dimension size: 4096
Layers: 32
Attention heads: 32
Key-value heads: 8
Activation function: SiLU (Swish)
Normalization: RMS Normalization
Position embedding: RoPE

System Requirements

VRAM requirements vary with the quantization method used for the model weights and with the context size; a rough estimate is sketched below.
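As a rough guide, weight memory scales with the bits per weight, and the key-value cache grows linearly with context length. The following sketch estimates both from the specifications listed above; the ~8.03B exact parameter count is an assumption, and activation buffers and framework overhead are deliberately ignored.

```python
# Back-of-the-envelope VRAM estimate for Llama 3.1 8B.
# Illustrative only: real usage adds framework overhead, activation
# buffers, and allocator fragmentation on top of these figures.

N_PARAMS = 8.03e9        # assumed weight count (~8B)
N_LAYERS = 32
N_KV_HEADS = 8           # grouped-query attention
HEAD_DIM = 4096 // 32    # hidden size / attention heads = 128

def weight_gib(bits_per_weight: float) -> float:
    """Memory for the model weights alone, in GiB."""
    return N_PARAMS * bits_per_weight / 8 / 2**30

def kv_cache_gib(context_tokens: int, bytes_per_value: int = 2) -> float:
    """KV cache: 2 (K and V) x layers x kv_heads x head_dim per token."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_value
    return context_tokens * per_token / 2**30

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: weights ~{weight_gib(bits):.1f} GiB")

for ctx in (1_024, 65_536, 131_072):
    print(f"{ctx:>7} tokens: KV cache ~{kv_cache_gib(ctx):.2f} GiB (FP16)")
```

At FP16 this works out to roughly 15 GiB for the weights plus about 16 GiB of KV cache at the full 131,072-token context, which is why quantized weights and shorter contexts are common on single consumer GPUs.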

Llama 3.1 8B

The Llama 3.1 8B model is a component of the Meta Llama 3.1 series, a collection of large language models developed by Meta. This model variant, featuring 8 billion parameters, is engineered to serve a range of natural language understanding and generation tasks. Its design prioritizes efficiency and responsiveness, making it suitable for deployment in environments with computational constraints. The model is optimized for dialogue applications and is designed to adhere to complex instructions, supporting its utility in conversational agents and virtual assistant systems.

Architecturally, Llama 3.1 8B is built upon an optimized transformer framework with a dense network configuration. A notable feature is Grouped-Query Attention (GQA), in which groups of query heads share key-value heads, reducing the memory footprint of the KV cache and improving inference scalability. The internal mechanics incorporate the SiLU (Swish) activation function and RMSNorm for normalization across its layers. Positional information is encoded with Rotary Position Embedding (RoPE), and the architecture leverages Flash Attention to improve processing speed. Training involved a substantial dataset of approximately 15 trillion tokens from publicly available sources, followed by supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align outputs with helpfulness and safety criteria. A significant enhancement in this iteration is the expanded context length of 128,000 tokens.
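To make the GQA layout concrete, here is a minimal PyTorch sketch using the dimensions listed above (hidden size 4096, 32 query heads, 8 key-value heads, head dimension 128). It illustrates the mechanism only and is not Meta's implementation; RoPE, the output projection, and KV caching are omitted.

```python
# Minimal grouped-query attention: 32 query heads share 8 KV heads.
import torch
import torch.nn.functional as F

d_model, n_heads, n_kv_heads = 4096, 32, 8
head_dim = d_model // n_heads            # 128
group = n_heads // n_kv_heads            # 4 query heads per KV head

w_q = torch.nn.Linear(d_model, n_heads * head_dim, bias=False)
w_k = torch.nn.Linear(d_model, n_kv_heads * head_dim, bias=False)
w_v = torch.nn.Linear(d_model, n_kv_heads * head_dim, bias=False)

def gqa(x: torch.Tensor) -> torch.Tensor:
    b, t, _ = x.shape
    q = w_q(x).view(b, t, n_heads, head_dim).transpose(1, 2)
    k = w_k(x).view(b, t, n_kv_heads, head_dim).transpose(1, 2)
    v = w_v(x).view(b, t, n_kv_heads, head_dim).transpose(1, 2)
    # Broadcast each KV head to its group of 4 query heads.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    return out.transpose(1, 2).reshape(b, t, d_model)

x = torch.randn(1, 16, d_model)
print(gqa(x).shape)  # torch.Size([1, 16, 4096])
```

Because only 8 KV heads are projected and cached instead of 32, the KV cache shrinks by a factor of four relative to standard multi-head attention, which is the main inference-time benefit of GQA.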

Regarding capabilities and applications, Llama 3.1 8B is proficient at tasks such as text summarization, text classification, and sentiment analysis, particularly in scenarios demanding low-latency inference. It supports eight languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, facilitating its use in diverse linguistic contexts. The model also supports advanced workflows, including long-form text summarization, and can be used in processes such as synthetic data generation and model distillation to refine smaller language models.
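A minimal way to try these tasks locally is via Hugging Face Transformers. The sketch below assumes access to the gated meta-llama/Llama-3.1-8B-Instruct checkpoint, a recent transformers release whose text-generation pipeline accepts chat messages, and accelerate installed for device_map="auto".

```python
# Sketch: summarization with the instruct variant via Transformers.
# Assumes the gated meta-llama/Llama-3.1-8B-Instruct checkpoint and
# a GPU with enough memory for the bf16 weights (~16 GB).
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You summarize text concisely."},
    {"role": "user", "content": "Summarize: Llama 3.1 8B is a dense "
        "8B-parameter transformer with a 128K-token context window."},
]
# The pipeline returns the full conversation; the last message is the reply.
print(generator(messages, max_new_tokens=128)[0]["generated_text"][-1])
```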

About Llama 3.1

Llama 3.1 is Meta's advanced large language model family, building upon Llama 3. It features an optimized decoder-only transformer architecture, available in 8B, 70B, and 405B parameter versions. Significant enhancements include an expanded 128K-token context window and improved multilingual capabilities across eight languages, refined through better data and post-training procedures.



Evaluation Benchmarks

Rankings apply to local LLMs.

Overall rank: #53
Coding rank: #45

Benchmark categories ranked: Graduate-Level QA (GPQA), Professional Knowledge (MMLU Pro), and General Knowledge (MMLU).
