ERNIE-4.5-300B-A47B-Base: Specifications and GPU VRAM Requirements

Open Source

Open Weights

Total Parameters

300B

Context Length

131,072 tokens

Modality

Text

Architecture

Mixture of Experts (MoE)

License

Apache 2.0

Release Date

30 Jun 2025

Knowledge Cutoff

Jun 2025

Technical Specifications

Active Parameters

47.0B

Number of Experts

Active Experts

Attention Structure

Grouped-Query Attention

Hidden Dimension Size

Number of Layers

Attention Heads

Key-Value Heads

Activation Function

GELU

Normalization

Layer Normalization

Position Embedding

Absolute Position Embedding

System Requirements

VRAM requirements for different quantization methods and context sizes
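
As a rough guide to how quantization changes the memory needed for the weights alone, the sketch below multiplies the total parameter count by the bits stored per weight. It assumes the 300B figure from the model name, and the bit widths listed are illustrative assumptions rather than formats confirmed for this model. In a Mixture-of-Experts model every expert has to be resident in memory, so the estimate uses the total parameter count rather than the 47B active per token. KV cache, activations, and runtime overhead are not included.

```python
# Rough weight-memory estimate. Ignores KV cache, activations, and runtime overhead.
TOTAL_PARAMS = 300e9  # total parameters, taken from the "300B" in the model name

# Bits per weight for a few common formats (illustrative assumptions;
# real quantized formats add some per-block scale metadata on top).
BITS_PER_WEIGHT = {"FP16/BF16": 16, "INT8": 8, "INT4": 4}


def weight_memory_gb(num_params: float, bits: int) -> float:
    """Return the memory needed to hold the weights, in decimal gigabytes."""
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, bits):.0f} GB")
```

Under these assumptions the weights need roughly 600 GB at 16 bits, 300 GB at 8 bits, and 150 GB at 4 bits, so even the 4-bit case has to be spread across several GPUs.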

ERNIE-4.5-300B-A47B-Base

The ERNIE 4.5 model family, developed by Baidu, is a set of ten large-scale foundation model variants. Across the family, the models accept text, image, and video inputs and primarily generate text. ERNIE-4.5-300B-A47B-Base is a text-only large language model in this family, aimed at reasoning and text generation, and it supports general language understanding and generation across a broad range of applications.

ERNIE 4.5 is built on a multimodal heterogeneous Mixture-of-Experts (MoE) structure. Some parameters, including self-attention parameters and expert parameters, are shared across modalities, while others are dedicated to a single modality such as text or vision. The design aims to improve multimodal understanding without degrading performance on text-only tasks. Key innovations include "FlashMask" Dynamic Attention Masking and a modality-isolated routing technique, both intended to improve efficiency and performance. The models are trained with the PaddlePaddle deep learning framework, using intra-node expert parallelism, memory-efficient pipeline scheduling, FP8 mixed-precision training, and fine-grained recomputation to improve training efficiency.
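
To make the idea of routing with modality-specific and shared experts more concrete, here is a toy sketch of top-k expert selection in which a token may only be sent to experts of its own modality or to shared experts. It is not Baidu's implementation: the function `route_token`, the expert layout, the router scores, and the value of `k` are all hypothetical and chosen only for illustration.

```python
import math


def route_token(scores, expert_modality, token_modality, k=2):
    """Pick the top-k allowed experts and softmax-normalise their gate weights.

    scores: router logits, one per expert (hypothetical values).
    expert_modality: "text", "vision", or "shared" for each expert.
    A token is only routed to experts of its own modality or to shared ones.
    """
    allowed = [i for i, m in enumerate(expert_modality)
               if m in (token_modality, "shared")]
    # Highest-scoring k experts among those the token is allowed to use.
    top = sorted(allowed, key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected experts only (shifted by the max for stability).
    peak = max(scores[i] for i in top)
    exp = {i: math.exp(scores[i] - peak) for i in top}
    total = sum(exp.values())
    return {i: exp[i] / total for i in top}


# Hypothetical layout: two text experts, two vision experts, one shared expert.
expert_modality = ["text", "text", "vision", "vision", "shared"]
scores = [0.1, 2.0, 3.0, 0.5, 1.0]
gates = route_token(scores, expert_modality, "text", k=2)
print(gates)
```

In this example the vision expert at index 2 has the highest score, but a text token never reaches it; the token is split between the text expert at index 1 and the shared expert at index 4.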

ERNIE-4.5-300B-A47B-Base supports long-context processing with sequence lengths of up to 131,072 tokens, which lets it handle long textual inputs for complex reasoning and generation tasks. Its Mixture-of-Experts architecture is designed for efficient scaling and high-throughput inference across a range of hardware configurations. The model suits general-purpose large language model applications that need strong reasoning and fast processing. Developers can adapt and fine-tune it with associated toolkits such as ERNIEKit, which supports Supervised Fine-Tuning (SFT), Low-Rank Adaptation (LoRA), and Direct Preference Optimization (DPO).

About ERNIE 4.5

The Baidu ERNIE 4.5 family consists of ten large-scale multimodal models. They use a heterogeneous Mixture-of-Experts (MoE) architecture that shares parameters across modalities while keeping dedicated parameters for specific modalities, supporting efficient language and multimodal processing.

Evaluation Benchmarks

No evaluation benchmarks are available for ERNIE-4.5-300B-A47B-Base.

GPU Requirements
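
Context size affects VRAM mainly through the key-value (KV) cache, which grows linearly with the number of tokens. The sketch below uses the standard estimate for attention with a separate set of key-value heads, as in Grouped-Query Attention. The layer count, KV head count, and head dimension in the example are hypothetical placeholders, not the published values for this model, and should be replaced with the figures from the model's configuration.

```python
def kv_cache_gb(seq_len, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV cache size in decimal gigabytes for one sequence.

    The factor 2 accounts for storing one key and one value tensor per layer.
    bytes_per_value=2 corresponds to a 16-bit cache.
    """
    values = 2 * num_layers * num_kv_heads * head_dim * seq_len
    return values * bytes_per_value / 1e9


if __name__ == "__main__":
    # Hypothetical placeholder dimensions, NOT this model's real configuration.
    example = dict(num_layers=48, num_kv_heads=8, head_dim=128)
    for seq_len in (1024, 65536, 131072):
        print(f"{seq_len} tokens: ~{kv_cache_gb(seq_len, **example):.2f} GB")
```

Because the estimate is linear in `seq_len`, moving from 1,024 tokens to the full 131,072-token context multiplies the cache size by 128, which is why long contexts can add substantially to the memory already needed for the weights.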
