Total Parameters
300B
Context Length
131,072 tokens
Modality
Text
Architecture
Mixture of Experts (MoE)
License
Apache 2.0
Release Date
30 Jun 2025
Knowledge Cutoff
-
Active Parameters per Token
47B
Number of Experts
64
Active Experts
8
Attention Structure
Grouped-Query Attention
Hidden Dimension Size
-
Number of Layers
54
Attention Heads
64
Key-Value Heads
8
Activation Function
-
Normalization
-
Position Embedding
Rotary Position Embedding (RoPE)
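As a rough illustration of the attention configuration listed above (Grouped-Query Attention with 64 query heads and 8 key-value heads across 54 layers), the sketch below compares the key-value cache footprint of GQA against a hypothetical full multi-head layout at the 131,072-token context length. The head dimension and the bf16 cache precision are assumptions made purely for illustration; they are not published figures for this model.

```python
# Back-of-the-envelope KV-cache comparison based on the attention layout above:
# 54 layers, 64 query heads, 8 key-value heads. The head dimension is not listed
# in the spec table, so 128 is assumed here purely for illustration.
LAYERS, Q_HEADS, KV_HEADS, HEAD_DIM = 54, 64, 8, 128
BYTES_PER_ELEM = 2  # assuming a bf16 KV cache

def kv_cache_gib(context_len: int, kv_heads: int) -> float:
    # Two cached tensors (K and V) per layer, one HEAD_DIM vector per head per token.
    return 2 * LAYERS * kv_heads * HEAD_DIM * context_len * BYTES_PER_ELEM / 1024**3

FULL_CONTEXT = 131_072
print(f"GQA, 8 KV heads:  ~{kv_cache_gib(FULL_CONTEXT, KV_HEADS):.0f} GiB")
print(f"MHA, 64 KV heads: ~{kv_cache_gib(FULL_CONTEXT, Q_HEADS):.0f} GiB")  # 8x larger
```

Under these assumptions, sharing 8 key-value heads across the 64 query heads cuts the full-context cache from roughly 216 GiB to roughly 27 GiB per sequence.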
ERNIE-4.5-300B-A47B is a foundational language model within Baidu's ERNIE 4.5 family, designed to support advanced natural language processing tasks. While the broader ERNIE 4.5 series encompasses multimodal capabilities, this specific variant focuses on text-only applications, optimizing its architecture for efficient and robust language understanding and generation. Its primary purpose is to serve as a high-performance solution for general-purpose textual analysis and creation, including complex reasoning and knowledge-intensive tasks. The model supports text generation in both English and Chinese.
The model's technical foundation is a Mixture-of-Experts (MoE) architecture with 300 billion total parameters, of which 47 billion are activated per token during inference. The overarching ERNIE 4.5 MoE design uses a heterogeneous structure that shares parameters across modalities while also reserving dedicated parameters for each, optimizing multimodal understanding without compromising text performance. Key architectural enhancements include FlashMask dynamic attention masking and modality-isolated routing. The model is trained with Baidu's PaddlePaddle deep learning framework, using techniques such as intra-node expert parallelism, memory-efficient pipeline scheduling, and FP8 mixed-precision training to achieve high pre-training throughput.
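As a minimal sketch of how a top-k MoE router keeps only a fraction of the parameters active per token, the generic example below selects 8 of 64 experts per token and renormalizes their gate weights. This is not Baidu's routing code: it omits ERNIE 4.5 specifics such as shared experts, load-balancing losses, and modality-isolated routing, and the hidden size is a placeholder.

```python
import torch
import torch.nn.functional as F

# Sizes taken from the spec table above: 64 routed experts, 8 active per token.
NUM_EXPERTS = 64
TOP_K = 8
HIDDEN = 1024  # placeholder hidden size for illustration only

def route_tokens(hidden_states: torch.Tensor, router_weight: torch.Tensor):
    """Return the top-k expert indices and normalized gate weights per token."""
    logits = hidden_states @ router_weight             # (tokens, NUM_EXPERTS)
    probs = F.softmax(logits, dim=-1)
    gate, experts = torch.topk(probs, TOP_K, dim=-1)   # keep only 8 of 64 experts
    gate = gate / gate.sum(dim=-1, keepdim=True)       # renormalize selected gates
    return experts, gate

tokens = torch.randn(4, HIDDEN)                        # a toy batch of 4 token states
router = torch.randn(HIDDEN, NUM_EXPERTS)
experts, gate = route_tokens(tokens, router)
print(experts.shape, gate.shape)                       # torch.Size([4, 8]) twice
```

Because each token is processed by only 8 experts plus the shared layers, the per-token compute corresponds to the 47B active parameters rather than the full 300B.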
For deployment and operational efficiency, ERNIE-4.5-300B-A47B supports efficient inference through multi-expert parallel collaboration and convolutional code quantization, which enable near-lossless 4-bit and 2-bit quantization for diverse hardware configurations. Its 131,072-token context length allows it to process extensive inputs and generate coherent, long-form text. The model can be fine-tuned with the ERNIEKit toolkit and served with FastDeploy, and it is available for commercial and research use under the Apache 2.0 license.
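The back-of-the-envelope estimate below shows why low-bit quantization matters at this scale: it computes approximate weight memory for a 300B-parameter checkpoint at several bit-widths. Treat these as rough lower bounds; real deployments also need KV cache, activations, unquantized tensors, and runtime overhead.

```python
# Rough weight-memory estimate for a 300B-parameter checkpoint at different bit-widths.
TOTAL_PARAMS = 300e9

for name, bits in [("BF16", 16), ("INT8", 8), ("4-bit", 4), ("2-bit", 2)]:
    gib = TOTAL_PARAMS * bits / 8 / 1024**3
    print(f"{name:>5}: ~{gib:,.0f} GiB of weights")
# Approximate output: BF16 ~559 GiB, INT8 ~279 GiB, 4-bit ~140 GiB, 2-bit ~70 GiB
```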
The Baidu ERNIE 4.5 family consists of ten large-scale multimodal models. They utilize a heterogeneous Mixture-of-Experts (MoE) architecture, which enables parameter sharing across modalities while also employing dedicated parameters for specific modalities, supporting efficient language and multimodal processing.
Ranking is for Local LLMs. No evaluation benchmarks for ERNIE-4.5-300B-A47B are available yet, so no overall or coding rank is listed.