
Hunyuan TurboS

Parameters: 52B
Context Length: 32K
Modality: Text

Architecture: Hybrid Transformer-Mamba2 MoE

License: -
Release Date: 16 Jul 2025
Knowledge Cutoff: Dec 2024

Technical Specifications

Attention Structure: Grouped-Query Attention

Hidden Dimension Size: 5120
Number of Layers: 128
Attention Heads: 64
Key-Value Heads: 8
Activation Function: SwiGLU

Normalization: RMS Normalization
Position Embedding: Absolute Position Embedding
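The ratio of 64 attention heads to 8 key-value heads implies grouped-query attention: each group of 8 query heads shares one KV head. A minimal shapes-only sketch (NumPy; function and weight names are illustrative, not the model's actual implementation):

```python
import numpy as np

def gqa(x, wq, wk, wv, n_heads=64, n_kv_heads=8):
    """Grouped-query attention: n_heads query heads share n_kv_heads KV heads."""
    seq, d_model = x.shape           # d_model = 5120 in the spec above
    head_dim = d_model // n_heads    # 5120 / 64 = 80
    group = n_heads // n_kv_heads    # 8 query heads per KV head

    q = (x @ wq).reshape(seq, n_heads, head_dim)
    k = (x @ wk).reshape(seq, n_kv_heads, head_dim)
    v = (x @ wv).reshape(seq, n_kv_heads, head_dim)

    # Broadcast each KV head across its group of query heads.
    k = np.repeat(k, group, axis=1)
    v = np.repeat(v, group, axis=1)

    scores = np.einsum('qhd,khd->hqk', q, k) / np.sqrt(head_dim)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out = np.einsum('hqk,khd->qhd', weights, v)
    return out.reshape(seq, d_model)
```

Only K and V shrink by the 8x grouping factor; queries and outputs keep the full 5120-dimensional width, which is why GQA cuts cache size with little quality loss.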

Hunyuan TurboS

Tencent Hunyuan-TurboS is a high-performance large language model designed to optimize the trade-off between computational efficiency and complex reasoning. By integrating an adaptive long-short Chain-of-Thought (CoT) mechanism, the model dynamically adjusts its cognitive overhead, employing a rapid "fast-thinking" mode for intuitive queries and a more rigorous analytical mode for intricate tasks. This dual-path approach allows the model to deliver near-instantaneous responses for general interactions while maintaining the logical depth required for STEM, coding, and mathematical problem-solving.
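The dual-path idea can be caricatured as a per-query router that picks a reasoning budget. The heuristic below is a toy stand-in invented for illustration; the production model learns this decision rather than matching hand-written keywords:

```python
# Hypothetical router for an adaptive long/short chain-of-thought policy.
# Marker list and thresholds are invented; they only illustrate the concept.
REASONING_MARKERS = ("prove", "step by step", "derive", "calculate", "debug")

def choose_cot_mode(prompt: str) -> str:
    """Return 'short' for intuitive queries, 'long' for analytical ones."""
    text = prompt.lower()
    if any(marker in text for marker in REASONING_MARKERS):
        return "long"   # rigorous multi-step reasoning path
    return "short"      # fast, low-latency response path

print(choose_cot_mode("What's the capital of France?"))    # short
print(choose_cot_mode("Prove that sqrt(2) is irrational"))  # long
```

The payoff is that latency and token cost are only paid where the query actually demands multi-step reasoning.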

Architecturally, Hunyuan-TurboS introduces a hybrid Transformer-Mamba2 Mixture of Experts (MoE) framework, representing an advancement in large-scale state-space model integration. The structure consists of 128 layers organized in an interleaved AMF (Attention-Mamba2-FFN) and MF (Mamba2-FFN) block pattern. This fusion leverages Mamba2 layers to achieve linear scaling for long sequences while utilizing Grouped-Query Attention (GQA) to minimize KV-Cache memory footprints. The model's Feed-Forward Networks (FFN) employ an MoE design with 32 experts, where each token activates a single shared expert and two specialized experts to maintain high capacity with optimized compute.
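The per-token expert routing described above can be sketched as follows: one always-on shared expert plus the top two of 32 specialized experts, with gate weights renormalized over the selected pair. Gating details are assumed and all names are hypothetical:

```python
import numpy as np

N_EXPERTS, TOP_K = 32, 2  # 32 specialized experts, 2 routed per token

def moe_forward(token, router_w, experts, shared_expert):
    """Per-token MoE FFN: shared expert output plus a gated top-2 mixture.

    Expert counts mirror the description above; the softmax-over-top-k
    gating is a common convention, assumed here rather than confirmed.
    """
    logits = router_w @ token                          # (32,) routing scores
    top = np.argsort(logits)[-TOP_K:]                  # indices of top-2 experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                               # renormalized softmax
    routed = sum(g * experts[i](token) for g, i in zip(gates, top))
    return shared_expert(token) + routed
```

Only 3 of 33 expert FFNs run per token, which is how the design keeps total capacity high while activated compute stays close to a much smaller dense model.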

Built for enterprise-grade scalability, the model supports an ultra-long context window of 256,000 tokens and was pre-trained on a massive corpus of 16 trillion high-quality tokens. Its post-training regime includes supervised fine-tuning on 3 million instructions and a multi-stage reinforcement learning process focused on STEM accuracy and general instruction following. These characteristics make Hunyuan-TurboS well-suited for high-throughput applications such as real-time conversational agents, large-scale document analysis, and sophisticated reasoning tasks where latency and cost-efficiency are paramount.
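The memory stakes at a 256K context are easy to see with back-of-envelope arithmetic from the spec values (128 layers, 8 KV heads, head dimension 5120/64 = 80, fp16 assumed). This treats every layer as an attention layer, so it is an upper bound; the hybrid design replaces most layers with Mamba2 blocks, which keep constant-size state instead of a KV cache:

```python
# Upper-bound KV-cache size under GQA, using the spec values above.
# Assumes fp16 and attention in all 128 layers; the Transformer-Mamba2
# hybrid caches far less in practice.
layers, kv_heads = 128, 8
head_dim = 5120 // 64            # hidden size / attention heads = 80
context, bytes_fp16 = 256_000, 2

per_token = 2 * layers * kv_heads * head_dim * bytes_fp16  # K and V
total_gb = per_token * context / 1024**3
print(f"{per_token} bytes/token, {total_gb:.1f} GiB at full context")
# -> 327680 bytes/token, 78.1 GiB at full context
```

Without GQA (64 KV heads instead of 8) the same bound would be 8x larger, which is why the combination of GQA and Mamba2 layers matters for serving long contexts economically.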

About Hunyuan

Hunyuan is Tencent's family of large language models, spanning a range of sizes and capabilities.



Evaluation Benchmarks

Overall Rank: #27

Benchmark: WebDev Arena (Web Development)
Score: 1383
Rank: #23

Rankings

Overall Rank: #27
Coding Rank: #32