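| Specification | Value |
|---|---|
| Parameters | 52B |
| Context length | 32K |
| Modality | Text |
| Architecture | Dense |
| License | - |
| Release date | 16 Jul 2025 |
| Training data cutoff | Dec 2024 |
| Attention structure | Multi-Head Attention |
| Hidden dimension size | 5120 |
| Number of layers | 128 |
| Attention heads | 64 |
| Key-value heads | 8 |
| Activation function | SwiGLU |
| Normalization | RMS Normalization |
| Position embedding | Absolute Position Embedding |
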
Tencent Hunyuan-TurboS is a large language model designed to balance computational efficiency against complex reasoning. An adaptive long-short Chain-of-Thought (CoT) mechanism lets the model adjust how much reasoning it performs per query: a rapid "fast-thinking" mode handles intuitive queries, and a more rigorous analytical mode handles intricate tasks. This dual-path approach lets the model respond quickly in general interactions while keeping the logical depth required for STEM, coding, and mathematical problem-solving.
Architecturally, Hunyuan-TurboS uses a hybrid Transformer-Mamba2 Mixture of Experts (MoE) framework. The structure consists of 128 layers organized in an interleaved pattern of AMF (Attention-Mamba2-FFN) and MF (Mamba2-FFN) blocks. The Mamba2 layers scale linearly with sequence length, while the attention layers use Grouped-Query Attention (GQA) to reduce the KV-cache memory footprint. The Feed-Forward Networks (FFN) use an MoE design with 32 experts, where each token activates one shared expert and two specialized experts, keeping capacity high while limiting per-token compute.
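The expert activation described above (one shared expert plus two specialized experts per token) can be illustrated with a toy router. This is a minimal sketch, not Tencent's implementation: it assumes the 32 experts are all routed "specialized" experts with the shared expert kept separate, it assumes softmax weighting over the two selected experts, and the names `route`, `moe_forward`, and the scalar toy experts are hypothetical.

```python
import math

NUM_SPECIALIZED = 32  # assumption: the 32 experts are the routed (specialized) ones
TOP_K = 2             # two specialized experts per token, as stated in the text


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Weights are a softmax over the selected logits only (an assumption;
    real routers differ in how they normalize).
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:top_k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, router_logits, shared_expert, specialized_experts):
    """Shared expert always runs; the routed experts add a weighted contribution."""
    out = shared_expert(x)
    for idx, weight in route(router_logits):
        out += weight * specialized_experts[idx](x)
    return out


# Toy experts: expert i multiplies its input by (i + 1); the shared expert is identity.
experts = [lambda x, s=i + 1: s * x for i in range(NUM_SPECIALIZED)]
shared = lambda x: x

# Hypothetical router scores: experts 5 and 9 tie for the highest score.
logits = [0.0] * NUM_SPECIALIZED
logits[5] = 2.0
logits[9] = 2.0

selected = route(logits)
y = moe_forward(1.0, logits, shared, experts)
print(selected)
print(y)
```

Only 3 of the 33 expert functions run for this token, which is the sense in which an MoE layer keeps total capacity high while limiting per-token compute.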
The model supports a context window of 256,000 tokens and was pre-trained on a corpus of 16 trillion high-quality tokens. Post-training includes supervised fine-tuning on 3 million instructions and a multi-stage reinforcement learning process focused on STEM accuracy and general instruction following. These characteristics suit Hunyuan-TurboS to high-throughput applications such as real-time conversational agents, large-scale document analysis, and reasoning tasks where latency and cost-efficiency matter.
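To give a rough sense of why the number of key-value heads matters at a 256,000-token context, the following back-of-the-envelope sketch estimates KV-cache size for a single attention layer, using the hidden size (5120), attention heads (64), and key-value heads (8) from the specification list. It assumes the head dimension equals hidden size divided by attention heads and that cached values are stored in 16 bits; neither is confirmed by the source. The number of attention layers among the 128 layers is not stated, so the estimate is per attention layer only.

```python
HIDDEN_SIZE = 5120        # from the specification list
NUM_HEADS = 64            # from the specification list
NUM_KV_HEADS = 8          # from the specification list
CONTEXT_TOKENS = 256_000  # context window stated in the description

HEAD_DIM = HIDDEN_SIZE // NUM_HEADS  # assumption: 5120 / 64 = 80
BYTES_PER_VALUE = 2                  # assumption: fp16 or bf16 cache


def kv_cache_bytes(tokens, kv_heads, head_dim=HEAD_DIM,
                   bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to cache keys and values for one attention layer."""
    return 2 * tokens * kv_heads * head_dim * bytes_per_value  # 2 = keys + values


gqa_bytes = kv_cache_bytes(CONTEXT_TOKENS, NUM_KV_HEADS)
mha_bytes = kv_cache_bytes(CONTEXT_TOKENS, NUM_HEADS)

print(f"8 KV heads:  {gqa_bytes / 2**30:.2f} GiB per attention layer")
print(f"64 KV heads: {mha_bytes / 2**30:.2f} GiB per attention layer")
print(f"Reduction:   {mha_bytes // gqa_bytes}x")
```

Under these assumptions the ratio depends only on the head counts (64 / 8), so the 8x reduction holds regardless of context length or cache precision.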
Tencent Hunyuan large language models with various capabilities.
Rank
#27
| Benchmark | Score | Rank |
|---|---|---|
| Web Development (WebDev Arena) | 1383 | 23 |