Hunyuan TurboS

Parameters: 52B
Context length: 32K
Modality: Text
Architecture: Dense
License: -
Release date: 16 Jul 2025
Training data cutoff: Dec 2024

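Technical specifications

Attention structure: Multi-Head Attention
Hidden dimension size: 5120
Number of layers: 128
Attention heads: 64
Key-value heads: 8
Activation function: SwiGLU
Normalization: RMS Normalization
Position embedding: Absolute Position Embedding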
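To make the listed fields concrete, here is a minimal Python sketch of the quantities they imply and of the two named operations. It assumes the query projection width equals the hidden dimension size (so the per-head dimension is simply hidden size divided by attention heads), which the page does not state. The names `ListedSpec`, `rms_norm`, `silu`, and `swiglu` are illustrative and not taken from any Hunyuan code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ListedSpec:
    """Values copied from the specification list above."""
    hidden_size: int = 5120
    num_layers: int = 128
    num_attention_heads: int = 64
    num_kv_heads: int = 8

    @property
    def head_dim(self) -> int:
        # Assumption: query width == hidden size, so each head gets an equal slice.
        return self.hidden_size // self.num_attention_heads

    @property
    def queries_per_kv_head(self) -> int:
        # How many query heads share one key-value head.
        return self.num_attention_heads // self.num_kv_heads


def rms_norm(x, weight, eps=1e-6):
    """RMS normalization: rescale by the root mean square, then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]


def silu(z):
    """SiLU (swish) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def swiglu(gate, up):
    """SwiGLU gating: SiLU of the gate projection times the up projection, elementwise."""
    return [silu(g) * u for g, u in zip(gate, up)]


spec = ListedSpec()
print("head_dim:", spec.head_dim)
print("query heads per KV head:", spec.queries_per_kv_head)
```

With 64 attention heads and 8 key-value heads, eight query heads share each key-value head under these assumptions.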

Hunyuan TurboS

Tencent Hunyuan-TurboS is a high-performance large language model designed to optimize the trade-off between computational efficiency and complex reasoning. By integrating an adaptive long-short Chain-of-Thought (CoT) mechanism, the model dynamically adjusts its cognitive overhead, employing a rapid "fast-thinking" mode for intuitive queries and a more rigorous analytical mode for intricate tasks. This dual-path approach allows the model to deliver near-instantaneous responses for general interactions while maintaining the logical depth required for STEM, coding, and mathematical problem-solving.

Architecturally, Hunyuan-TurboS uses a hybrid Transformer-Mamba2 Mixture of Experts (MoE) framework. The structure consists of 128 layers organized in an interleaved pattern of AMF (Attention-Mamba2-FFN) and MF (Mamba2-FFN) blocks. The Mamba2 layers scale linearly with sequence length, while the attention layers use Grouped-Query Attention (GQA) to reduce the KV-cache memory footprint. The model's Feed-Forward Networks (FFN) use an MoE design with 32 experts, where each token activates one shared expert and two specialized experts, keeping capacity high while limiting per-token compute.
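The expert activation pattern described here (one shared expert that always runs, plus two specialized experts chosen per token) can be sketched in a few lines of Python. This is a toy illustration, not Tencent's implementation: it assumes the 32 experts are all specialized and the shared expert is counted separately, it assumes softmax gating with the top-2 weights renormalized to sum to 1, and it uses scalar inputs and hypothetical function names (`route`, `moe_ffn`) for brevity.

```python
import math

NUM_SPECIALIZED = 32  # assumption: the shared expert is not one of the 32
TOP_K = 2             # two specialized experts per token, per the description


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Pick the top_k experts by gate probability and renormalize their weights."""
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_ffn(x, router_logits, shared_expert, specialized_experts):
    """Shared expert always contributes; routed experts add a weighted contribution."""
    out = shared_expert(x)
    for idx, weight in route(router_logits):
        out += weight * specialized_experts[idx](x)
    return out


# Toy experts: expert k multiplies its input by k; the shared expert is the identity.
specialized = [lambda x, k=k: k * x for k in range(NUM_SPECIALIZED)]
shared = lambda x: x

logits = [0.0] * NUM_SPECIALIZED
logits[3] = 2.0
logits[7] = 2.0

print(route(logits))
print(moe_ffn(2.0, logits, shared, specialized))
```

Only 3 of the 33 expert functions run for this token, which is the sense in which an MoE layer keeps total capacity high while limiting per-token compute.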

The model supports a context window of 256,000 tokens and was pre-trained on a corpus of 16 trillion tokens. Post-training includes supervised fine-tuning on 3 million instructions and a multi-stage reinforcement learning process focused on STEM accuracy and general instruction following. These characteristics make Hunyuan-TurboS well suited to high-throughput applications such as real-time conversational agents, large-scale document analysis, and reasoning tasks where latency and cost matter.
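A rough back-of-the-envelope calculation shows why the number of key-value heads matters at a 256,000-token context. The sketch below compares the KV-cache size for 8 key-value heads (as listed in the specifications) against a hypothetical layout where all 64 attention heads keep their own keys and values. Several inputs are assumptions rather than facts from this page: a head dimension of 5120 / 64 = 80, 2 bytes per cached value (16-bit precision), and a result reported per attention layer, since the page does not say how many of the 128 layers contain attention.

```python
def kv_cache_bytes(tokens, kv_heads, head_dim, attention_layers=1, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * tokens * kv_heads * head_dim * attention_layers * bytes_per_value


HEAD_DIM = 5120 // 64   # assumption: hidden size split evenly across 64 heads
CONTEXT = 256_000       # context window stated in the description

gqa = kv_cache_bytes(CONTEXT, kv_heads=8, head_dim=HEAD_DIM)
mha = kv_cache_bytes(CONTEXT, kv_heads=64, head_dim=HEAD_DIM)

print(f"8 KV heads:  {gqa / 2**30:.2f} GiB per attention layer")
print(f"64 KV heads: {mha / 2**30:.2f} GiB per attention layer")
print(f"reduction:   {mha // gqa}x")
```

Multiply by the actual number of attention layers (passed as `attention_layers`) to get a whole-model estimate; Mamba2 layers keep a fixed-size state instead and are not covered by this calculation.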

About Hunyuan

Hunyuan is Tencent's family of large language models, covering a range of capabilities.



Evaluation benchmarks

Rank: #27

Benchmark scores and ranks

WebDev Arena (Web Development): score 1383, rank 23


Coding rank: #32