Active Parameters: 52B
Context Length: 30K
Modality: Text
Architecture: Mixture of Experts (MoE)
License: Tencent Hunyuan Community License Agreement
Release Date: 10 Jun 2024
Knowledge Cutoff: -
Total Expert Parameters: 389.0B
Number of Experts: 17
Active Experts: 2
Attention Structure: Grouped-Query Attention (GQA)
Hidden Dimension Size: 6400
Number of Layers: 64
Attention Heads: 80
Key-Value Heads: 8
Activation Function: SwiGLU
Normalization: -
Position Embedding: Rotary Position Embedding (RoPE)
VRAM requirements for different quantization methods and context sizes
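As a rough illustration of how such requirements can be estimated from the figures in the table above, the minimal Python sketch below applies simple per-parameter byte widths for the weights and the standard GQA KV-cache formula for the cache. The byte widths, the assumption that all 389B expert parameters stay resident in memory, and the omission of activation and runtime overheads are simplifying assumptions, not official guidance.

```python
# Back-of-the-envelope VRAM estimate from the specification table above.
# Simplifying assumptions: all 389B parameters (not just the 52B active
# ones) must be resident, per-parameter byte widths are idealized, and
# activation/runtime overheads are ignored.

GIB = 1024 ** 3

# Figures from the specification table.
TOTAL_PARAMS = 389e9          # all expert parameters must be loaded
NUM_LAYERS = 64
KV_HEADS = 8
HEAD_DIM = 6400 // 80         # hidden size / attention heads = 80

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(quant: str) -> float:
    """Approximate memory needed just to hold the model weights."""
    return TOTAL_PARAMS * BYTES_PER_PARAM[quant] / GIB


def kv_cache_gib(context_tokens: int, kv_bytes: float = 2.0) -> float:
    """Approximate KV-cache size with grouped-query attention:
    2 (K and V) * layers * kv_heads * head_dim * context * bytes.
    Cross-layer attention (CLA) would shrink this further by sharing
    the cache across layers; that reduction is not modelled here."""
    return 2 * NUM_LAYERS * KV_HEADS * HEAD_DIM * context_tokens * kv_bytes / GIB


for q in BYTES_PER_PARAM:
    print(f"{q:>5}: weights ~ {weight_memory_gib(q):7.1f} GiB")
for ctx in (1024, 32_000, 256_000):
    print(f"context {ctx:>7,} tokens: KV cache ~ {kv_cache_gib(ctx):6.2f} GiB")
```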
Tencent Hunyuan-Large, identified as Hunyuan-MoE-A52B, is a large Transformer-based Mixture-of-Experts (MoE) model developed and open-sourced by Tencent. The model addresses the computational challenges associated with extensive parameter counts in large language models by employing a dynamic routing strategy. It is engineered to deliver high performance across a spectrum of natural language processing tasks while optimizing resource utilization through its sparse activation mechanism. The model's design facilitates its application in diverse intelligent systems, supporting advancements in AI research and deployment.
The technical architecture of Hunyuan-Large incorporates a total of 389 billion parameters, with only 52 billion parameters actively utilized during inference, a characteristic of its Mixture-of-Experts design. The model structure includes one shared expert and 16 specialized experts, with one specialized expert activated per token in addition to the continuously active shared expert. Positional encoding is managed using Rotary Position Embedding (RoPE), and the activation function is SwiGLU. To enhance inference efficiency and mitigate the memory footprint of the KV cache, Hunyuan-Large integrates Grouped-Query Attention (GQA) and Cross-Layer Attention (CLA), leading to a substantial reduction in KV cache memory consumption. The training regimen also benefits from high-quality synthetic data, an expert-specific learning rate scaling methodology, and the integration of Flash Attention for accelerated training.
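The shared-plus-routed expert pattern described above can be sketched as follows. This is a minimal PyTorch-style illustration with toy dimensions and a plain linear top-1 router, not Tencent's implementation; it omits load balancing, the expert-specific learning-rate scaling, and the attention stack.

```python
# Minimal sketch of a Hunyuan-Large-style MoE feed-forward block:
# one always-active shared expert plus top-1 routing over 16 specialized
# experts, each a SwiGLU feed-forward network. Dimensions, naming, and
# routing details are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUExpert(nn.Module):
    """Feed-forward expert with a SwiGLU activation: (SiLU(x W_g) * x W_u) W_d."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_ff, bias=False)
        self.w_up = nn.Linear(d_model, d_ff, bias=False)
        self.w_down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


class SharedPlusTop1MoE(nn.Module):
    """One shared expert applied to every token, plus the single
    highest-scoring specialized expert chosen per token by a linear router."""

    def __init__(self, d_model: int, d_ff: int, n_specialized: int = 16):
        super().__init__()
        self.shared = SwiGLUExpert(d_model, d_ff)
        self.experts = nn.ModuleList(
            SwiGLUExpert(d_model, d_ff) for _ in range(n_specialized)
        )
        self.router = nn.Linear(d_model, n_specialized, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten tokens for routing.
        tokens = x.reshape(-1, x.shape[-1])
        out = self.shared(tokens)                      # shared expert, every token
        scores = self.router(tokens).softmax(dim=-1)   # routing probabilities
        weight, idx = scores.max(dim=-1)               # top-1 specialized expert
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = out[mask] + weight[mask].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape(x.shape)


# Toy usage with small dimensions (the real model uses a hidden size of 6400).
layer = SharedPlusTop1MoE(d_model=128, d_ff=512)
y = layer(torch.randn(2, 16, 128))
print(y.shape)  # torch.Size([2, 16, 128])
```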
Hunyuan-Large supports an extensive context window of up to 256,000 tokens in its pre-trained variant, enabling the processing and comprehension of lengthy textual inputs for applications such as detailed document analysis and extensive codebases. The model has demonstrated competitive performance across various benchmarks in both English and Chinese, including MMLU, MMLU-Pro, CMMLU, GSM8K, and MATH, frequently exceeding the performance of dense models and other MoE models with comparable active parameter sizes. These capabilities position Hunyuan-Large as a suitable solution for demanding tasks requiring advanced reasoning, comprehensive content generation, and sophisticated understanding of long-form text.
Tencent Hunyuan is a family of large language models with various capabilities.
No evaluation benchmarks are available for Hunyuan Standard.