| Specification | Value |
|---|---|
| Parameters | 70B |
| Context Length | 128K |
| Modality | Text |
| Architecture | Dense |
| License | Llama 3.3 Community License |
| Release Date | 7 Dec 2024 |
| Knowledge Cutoff | Dec 2023 |
| Attention Structure | Grouped-Query Attention |
| Hidden Dimension Size | 8192 |
| Number of Layers | 80 |
| Attention Heads | 64 |
| Key-Value Heads | 8 |
| Activation Function | SwiGLU |
| Normalization | RMS Normalization |
| Position Embedding | RoPE |
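To make the attention configuration above concrete, here is a minimal NumPy sketch (not Meta's implementation) of how grouped-query attention lets the 64 query heads share the 8 key-value heads; the toy sequence length and random tensors are purely illustrative.

```python
import numpy as np

# Hyperparameters from the spec table above.
hidden, n_heads, n_kv_heads = 8192, 64, 8
head_dim = hidden // n_heads       # 128
group = n_heads // n_kv_heads      # 8 query heads per KV head

seq = 16                           # toy sequence length for illustration
q = np.random.randn(n_heads, seq, head_dim)
k = np.random.randn(n_kv_heads, seq, head_dim)
v = np.random.randn(n_kv_heads, seq, head_dim)

# Each KV head is shared by `group` consecutive query heads.
k = np.repeat(k, group, axis=0)    # (64, seq, head_dim)
v = np.repeat(v, group, axis=0)

scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)   # (64, seq, seq)
weights = np.exp(scores - scores.max(-1, keepdims=True))
weights /= weights.sum(-1, keepdims=True)
out = weights @ v                  # (64, seq, head_dim)
print(out.shape)                   # (64, 16, 128)
```

Because only 8 KV heads are cached instead of 64, the KV cache shrinks by 8x relative to standard multi-head attention, which is the main inference-efficiency benefit of GQA.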
VRAM requirements for different quantization methods and context sizes
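As a rough guide, the back-of-envelope estimator below combines the weight footprint with the KV-cache size implied by the hyperparameters above (80 layers, 8 KV heads, head dimension 128). It ignores activation memory and runtime overhead, keeps the KV cache in FP16, and uses approximate bytes-per-weight figures for INT8 and 4-bit, so treat the results as ballpark numbers only.

```python
def estimate_vram_gb(params_b=70, bytes_per_weight=2.0,
                     n_layers=80, n_kv_heads=8, head_dim=128,
                     context_len=131_072, kv_bytes=2.0):
    """Back-of-envelope VRAM estimate: weights + KV cache, overhead ignored."""
    weights = params_b * 1e9 * bytes_per_weight
    # K and V caches: 2 tensors * layers * KV heads * head_dim * context * bytes
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bytes
    return (weights + kv_cache) / 1024**3

for label, bpw in [("FP16", 2.0), ("INT8", 1.0), ("Q4", 0.5)]:
    for ctx in (8_192, 131_072):
        gb = estimate_vram_gb(bytes_per_weight=bpw, context_len=ctx)
        print(f"{label:>4} weights, {ctx:>7} tokens of context: ~{gb:.0f} GB")
```

Under these assumptions, the FP16 weights alone come to roughly 130 GB, and a full 128K-token KV cache adds about 40 GB more, which is why quantized weights and reduced context windows are the usual levers for fitting the model on smaller GPU configurations.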
Meta Llama 3.3 70B is a large language model engineered for text-based generative applications. It is a dense Transformer with an optimized architectural design. This variant is instruction-tuned for dialogue and performs well in multilingual chat, code assistance, and synthetic data generation. It was pretrained on approximately 15 trillion tokens drawn from publicly available online data.
From an architectural perspective, Llama 3.3 70B integrates Grouped-Query Attention (GQA) to enhance inference scalability and efficiency. Its training regimen includes supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), which align its outputs with human preferences for helpfulness and safety. A notable feature is its extended context window of up to 128,000 tokens, enabling the processing and generation of longer text sequences for advanced use cases such as long-form summarization and complex multi-turn conversations.
The model is equipped with capabilities for multilingual inputs and outputs, encompassing languages such as English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. Furthermore, it supports tool-use, providing developers with the ability to extend its functionality via custom function definitions and integration with third-party services. This design emphasizes efficiency and aims to reduce hardware requirements, thereby increasing the accessibility of high-quality AI for various applications.
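As an illustration of the tool-use support mentioned above, the sketch below assumes the model is served behind an OpenAI-compatible endpoint (for example via vLLM); the base URL, served model name, and the get_weather function are illustrative assumptions rather than part of any official interface.

```python
from openai import OpenAI

# Assumed local OpenAI-compatible server; adjust base_url and model name for your setup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration only
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",  # assumed served model name
    messages=[{"role": "user", "content": "What's the weather in Paris right now?"}],
    tools=tools,
)

# If the model decides to call the tool, the call arrives as structured arguments
# that the application executes before returning the result in a follow-up message.
print(resp.choices[0].message.tool_calls)
```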
Meta's Llama 3.3 is a 70 billion parameter, multilingual large language model. It utilizes an optimized transformer architecture, incorporating Grouped-Query Attention for enhanced inference efficiency. The model features an extended 128k token context window and is designed to support quantization, facilitating deployment on varied hardware configurations.
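A minimal sketch of quantized deployment, assuming the checkpoint is available as the Hugging Face repository meta-llama/Llama-3.3-70B-Instruct and that the transformers, bitsandbytes, and accelerate packages are installed; 4-bit NF4 loading here stands in for whichever quantization method a given deployment actually uses.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed public Hugging Face repo id for this checkpoint.
model_id = "meta-llama/Llama-3.3-70B-Instruct"

# Load weights in 4-bit NF4 with bfloat16 compute to reduce the VRAM footprint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs
)

messages = [{"role": "user", "content": "Summarize grouped-query attention in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```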
Rankings apply to local LLMs.
Rank: #27
| Category | Benchmark | Score | Rank |
|---|---|---|---|
| Refactoring | Aider Refactoring | 0.59 | 6 |
| StackEval | ProLLM Stack Eval | 0.85 | 9 |
| Coding | Aider Coding | 0.59 | 10 |
| QA Assistant | ProLLM QA Assistant | 0.90 | 11 |
| Summarization | ProLLM Summarization | 0.68 | 11 |
| Professional Knowledge | MMLU Pro | 0.69 | 12 |
| Graduate-Level QA | GPQA | 0.51 | 14 |
| Coding | LiveBench Coding | 0.52 | 16 |
| Data Analysis | LiveBench Data Analysis | 0.49 | 22 |
| General Knowledge | MMLU | 0.51 | 22 |
| Reasoning | LiveBench Reasoning | 0.33 | 23 |
| Mathematics | LiveBench Mathematics | 0.41 | 23 |