Parameters
300M
Context Length
131,072 tokens
Modality
Text
Architecture
Dense
License
Apache 2.0
Release Date
30 Jun 2025
Training Data Cutoff
Dec 2024
Attention Structure
Grouped-Query Attention
Hidden Dimension Size
1024
Number of Layers
18
Attention Heads
16
Key-Value Heads
2
Activation Function
Swish (SiLU)
Normalization
RMS Normalization
Position Embedding
Absolute Position Embedding
The ERNIE-4.5-0.3B model is a high-efficiency transformer designed to serve as the compact entry point of Baidu's ERNIE 4.5 model family. Engineered for low-latency inference and high-throughput environments, this model prioritizes linguistic proficiency in both Chinese and English while minimizing the computational overhead typical of large-scale foundation models. Its design philosophy balances the need for deep language understanding with the operational realities of edge computing and mobile deployment, providing a versatile solution for real-time text processing.
Technically, ERNIE-4.5-0.3B utilizes a dense transformer architecture featuring 18 layers and a hidden dimension size of 1024. Unlike its larger Mixture-of-Experts counterparts in the same family, this variant activates all its parameters for every token, ensuring consistent performance characteristics and simplified deployment workflows. The model incorporates Grouped-Query Attention (GQA) with 16 query heads and 2 key-value heads to optimize memory usage and speed during long-context generation. It supports an expansive context window of 131,072 tokens, allowing it to process substantial documents and maintain coherence over long-range sequences.
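The memory benefit of the GQA layout described above can be quantified with a short back-of-envelope sketch. The layer count, hidden size, and head counts come from the spec; the fp16 cache precision and the even head-dimension split (1024 / 16 = 64) are assumptions for illustration:

```python
# Estimate per-token KV-cache memory for ERNIE-4.5-0.3B's grouped-query
# attention (2 KV heads) vs. a hypothetical full multi-head cache (16 heads).
# Dimensions are from the spec sheet; fp16 storage (2 bytes/value) is assumed.

layers = 18
hidden = 1024
query_heads = 16
kv_heads = 2
head_dim = hidden // query_heads  # 64, assuming an even split
bytes_per_value = 2               # fp16

def kv_cache_bytes_per_token(num_kv_heads: int) -> int:
    # Keys + values (factor of 2), for every layer, for one token.
    return layers * 2 * num_kv_heads * head_dim * bytes_per_value

mha = kv_cache_bytes_per_token(query_heads)  # full-MHA caching: 73,728 B/token
gqa = kv_cache_bytes_per_token(kv_heads)     # actual GQA caching: 9,216 B/token

print(f"MHA: {mha} B/token, GQA: {gqa} B/token, savings: {mha // gqa}x")
```

At the full 131,072-token context, the GQA cache works out to roughly 1.1 GiB in fp16 rather than about 9 GiB with full multi-head caching, which is what makes the long context practical on constrained hardware.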
From a performance perspective, ERNIE-4.5-0.3B is optimized for high-speed text completion, sentiment analysis, and on-device conversational agents. It integrates advanced training methodologies from the broader ERNIE 4.5 project, including RMS Normalization and the Swish (SiLU) activation function, which contribute to its training stability and representational power. The model is fully compatible with modern inference engines like vLLM and FastDeploy, and it is released under the Apache 2.0 license to facilitate both academic research and commercial application development within the open-source ecosystem.
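Since the page names vLLM compatibility, serving the model could look like the following sketch. The Hugging Face model ID `baidu/ERNIE-4.5-0.3B-PT` and the flag values are assumptions not confirmed by this page; check the official model card before use:

```shell
# Hypothetical config: launch vLLM's OpenAI-compatible server for the model.
# Model ID and flag values are assumptions; verify against the model card.
pip install vllm
vllm serve baidu/ERNIE-4.5-0.3B-PT \
    --max-model-len 131072 \
    --dtype bfloat16
```

Once running, the server exposes the standard `/v1/chat/completions` endpoint, so existing OpenAI-client code can target it by changing only the base URL.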
The Baidu ERNIE 4.5 family consists of ten large-scale multimodal models. They utilize a heterogeneous Mixture-of-Experts (MoE) architecture, which enables parameter sharing across modalities while also employing dedicated parameters for specific modalities, supporting efficient language and multimodal processing.
No evaluation benchmarks are available for ERNIE-4.5-0.3B.