Hunyuan TurboS

Parameters

560B total (56B active)

Context Length

32K

Modality

Text

Architecture

Mixture of Experts (MoE)

License

-

Release Date

16 Jul 2025

Knowledge Cutoff

Dec 2024

Technical Specifications

Attention

Attention Structure

Grouped-Query Attention

Attention Heads

64

Key-Value Heads

8

Attention Head Dimension

-

Position Embedding

Absolute Position Embedding

RoPE Theta

-

Sliding Window Attention

-

Sliding Window Size

-

Normalization

RMS Normalization

Activation Function

SwiGLU

Dimensions

Hidden Dimension Size

5,120

Number of Layers

128

FFN Intermediate Size (Dense)

-

Multi-Token Prediction Heads

-

Tokenizer

Vocabulary Size

-

Hunyuan TurboS

Tencent Hunyuan-TurboS is a high-performance large language model designed to optimize the trade-off between computational efficiency and complex reasoning. By integrating an adaptive long-short Chain-of-Thought (CoT) mechanism, the model dynamically adjusts its cognitive overhead, employing a rapid "fast-thinking" mode for intuitive queries and a more rigorous analytical mode for intricate tasks. This dual-path approach allows the model to deliver near-instantaneous responses for general interactions while maintaining the logical depth required for STEM, coding, and mathematical problem-solving.

Architecturally, Hunyuan-TurboS introduces a hybrid Transformer-Mamba2 Mixture of Experts (MoE) framework, representing an advancement in large-scale state-space model integration. The structure consists of 128 layers organized in an interleaved AMF (Attention-Mamba2-FFN) and MF (Mamba2-FFN) block pattern. This fusion leverages Mamba2 layers to achieve linear scaling for long sequences while utilizing Grouped-Query Attention (GQA) to minimize KV-Cache memory footprints. The model's Feed-Forward Networks (FFN) employ an MoE design with 32 experts, where each token activates a single shared expert and two specialized experts to maintain high capacity with optimized compute.
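The expert selection described above (one shared expert plus two specialized experts per token) can be illustrated with a minimal sketch. It assumes the 32 experts are all routed experts with the shared expert counted separately, and it assumes a softmax router with top-2 selection and renormalized weights. The description does not state either detail, and names such as `route_token` and `SHARED_EXPERT` are hypothetical.

```python
import math

NUM_EXPERTS = 32          # specialized experts, per the description above
TOP_K = 2                 # specialized experts activated per token
SHARED_EXPERT = "shared"  # hypothetical label for the always-active shared expert


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Pick the experts for one token.

    Returns (experts, weights): the shared expert plus the top_k specialized
    experts, and the renormalized mixing weights of the specialized ones.
    The softmax + top-k rule is an assumption, not the documented router.
    """
    if len(router_logits) != NUM_EXPERTS:
        raise ValueError(f"expected {NUM_EXPERTS} router logits")
    probs = softmax(router_logits)
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    weights = {i: probs[i] / norm for i in chosen}
    return [SHARED_EXPERT] + chosen, weights


if __name__ == "__main__":
    logits = [0.0] * NUM_EXPERTS
    logits[5], logits[17] = 3.0, 2.0
    experts, weights = route_token(logits)
    print(experts, weights)
```

Under these assumptions only 3 expert FFNs run per token regardless of how many experts exist, which is how the design keeps per-token compute well below what the total parameter count would suggest.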

Built for enterprise-grade scalability, the model supports an ultra-long context window of 256,000 tokens and was pre-trained on a massive corpus of 16 trillion high-quality tokens. Its post-training regime includes supervised fine-tuning on 3 million instructions and a multi-stage reinforcement learning process focused on STEM accuracy and general instruction following. These characteristics make Hunyuan-TurboS well-suited for high-throughput applications such as real-time conversational agents, large-scale document analysis, and sophisticated reasoning tasks where latency and cost-efficiency are paramount.

About Hunyuan

The Tencent Hunyuan family comprises large language models with various capabilities.


Evaluation Benchmarks

Rank

#31

Benchmark                        Score   Rank
Web Development: WebDev Arena    1383    31

Rankings

Overall Rank

#31

Coding Rank

#43

Model Integrity

Total Score

B

66 / 100

Hunyuan TurboS Model Integrity Report

Audit Note

Hunyuan-TurboS demonstrates impressive technical transparency regarding its hybrid architecture and parameter distribution, providing more detail than many proprietary peers. However, it remains opaque concerning training compute resources and specific dataset proportions. The restrictive, geographically-limited license and the lack of full evaluation code for its primary benchmarks represent significant barriers to independent verification and global adoption.

Upstream

22.0 / 30

Architectural Provenance

8.5 / 10

Tencent provides a high level of architectural detail in the official technical report (arXiv:2505.23076). The model is explicitly described as a hybrid Transformer-Mamba2 Mixture of Experts (MoE) model. It specifies a 128-layer structure with a precise interleaved pattern: 57 Mamba2 layers, 7 Attention layers (using Grouped-Query Attention), and 64 FFN layers. The report details the 'AMF' (Attention-Mamba2-FFN) and 'MF' (Mamba2-FFN) block patterns, providing a level of transparency rarely seen in proprietary models.
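To make the effect of Grouped-Query Attention concrete, the following rough sketch compares KV-cache size with 8 key-value heads against a hypothetical multi-head baseline with 64. The 7 attention layers and the 64/8 head counts come from this page; the head dimension of 128 and FP16 storage are assumptions, since the head dimension is not disclosed here. Mamba2 layers keep a fixed-size state rather than a per-token cache, so they are left out.

```python
def kv_cache_bytes(seq_len, attn_layers, kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * seq_len * attn_layers * kv_heads * head_dim * bytes_per_value


ATTN_LAYERS = 7      # attention layers, per the technical report summary
QUERY_HEADS = 64     # attention heads listed in the specifications
KV_HEADS = 8         # key-value heads listed in the specifications
HEAD_DIM = 128       # ASSUMPTION: not disclosed on this page
SEQ_LEN = 256_000    # advertised context window

gqa_bytes = kv_cache_bytes(SEQ_LEN, ATTN_LAYERS, KV_HEADS, HEAD_DIM)
mha_bytes = kv_cache_bytes(SEQ_LEN, ATTN_LAYERS, QUERY_HEADS, HEAD_DIM)

if __name__ == "__main__":
    print(f"GQA cache: {gqa_bytes / 1e9:.2f} GB")
    print(f"MHA cache: {mha_bytes / 1e9:.2f} GB")
    print(f"reduction: {mha_bytes // gqa_bytes}x")
```

Whatever the true head dimension is, the ratio between the two figures depends only on the head counts (64 / 8 = 8), so the 8x reduction holds under any value substituted for `HEAD_DIM`.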

Dataset Composition

4.5 / 10

The model was pre-trained on a massive corpus of 16 trillion tokens. While the total token count and the use of 3 million instructions for supervised fine-tuning are disclosed, the specific breakdown of the 16T tokens (e.g., percentage of web, code, books) is not provided in detail. The documentation mentions 'high-quality tokens' and 'STEM-specific data' but lacks a granular composition table or public access to data samples, which is common for large-scale industrial models.

Tokenizer Integrity

9.0 / 10

The tokenizer is well-documented and consistent with the Hunyuan-Large model. It features a vocabulary of 128K tokens, consisting of 100K tokens from the tiktoken (OpenAI) base and 28K additional tokens specifically optimized for Chinese language support. Technical metrics such as compression rates (3.13 characters per token) are publicly disclosed, and the tokenizer is accessible via the official GitHub repository for the Hunyuan family.
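As a rough illustration of what the disclosed compression rate implies, the sketch below converts character counts into approximate token counts. It treats 3.13 characters per token as a flat average, which is an assumption: the real ratio varies with language and content. The helper names are hypothetical, and "100K" and "28K" are taken as round figures.

```python
import math

BASE_VOCAB = 100_000       # tokens inherited from the tiktoken base (approximate)
CHINESE_EXTRA = 28_000     # additional Chinese-oriented tokens (approximate)
CHARS_PER_TOKEN = 3.13     # disclosed average compression rate


def estimate_tokens(num_chars, chars_per_token=CHARS_PER_TOKEN):
    """Approximate token count for a text of num_chars characters."""
    return math.ceil(num_chars / chars_per_token)


def fits_context(num_chars, context_tokens=256_000):
    """Rough check of whether a text fits in a context window of context_tokens."""
    return estimate_tokens(num_chars) <= context_tokens


if __name__ == "__main__":
    print("vocabulary size:", BASE_VOCAB + CHINESE_EXTRA)
    print("tokens for 1M characters:", estimate_tokens(1_000_000))
    print("800k characters fit in 256K tokens:", fits_context(800_000))
```

An estimate like this is only suitable for capacity planning; exact counts require running the actual tokenizer on the text.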

Model

26.5 / 40

Parameter Density

8.0 / 10

Tencent is transparent about the model's scale, disclosing both total and active parameters. Hunyuan-TurboS has 560 billion total parameters, of which 56 billion are active per token. The MoE structure is detailed as having 32 experts, with a routing strategy that activates 1 shared expert and 2 specialized experts per token. This clear distinction between total and active parameter counts prevents the common 'parameter inflation' marketing trap.

Training Compute

3.5 / 10

Information regarding training compute is limited to high-level infrastructure descriptions. While Tencent mentions its 'Xingmai' high-performance network and the ability to support clusters of over 100,000 GPUs, it does not disclose the specific GPU hours, hardware type (e.g., H100 vs. H800), or the carbon footprint associated with training Hunyuan-TurboS specifically. This lack of specific resource disclosure is a significant gap.

Benchmark Reproducibility

6.0 / 10

The model's performance is documented across 23 automated benchmarks with an average score of 77.9%. While the technical report lists scores for MMLU, GSM8K, and HumanEval, and the model is ranked on the LMSYS Chatbot Arena (#8 globally), the exact evaluation code and full prompt sets used for internal benchmarking are not fully public. However, the release of the 'C3-Bench' and 'ArtifactsBench' by the same team provides some reproducible evaluation frameworks for the broader community.

Identity Consistency

9.0 / 10

The model maintains a consistent identity as part of the Tencent Hunyuan family. It correctly identifies its version (e.g., Hunyuan-TurboS-20250416) and its specific 'fast-thinking' vs. 'slow-thinking' (Hunyuan-T1) capabilities. There are no documented cases of the model claiming to be a competitor's product or misrepresenting its origin during standard interactions.

Downstream

17.0 / 30

License Clarity

5.0 / 10

The licensing situation is complex and restrictive. While some components are released under the 'Tencent Hunyuan Community License,' it contains significant geographic restrictions (e.g., not applicable in the EU, UK, or South Korea) and requires explicit permission for entities with over 100 million monthly active users. This is not a standard open-source license and creates ambiguity for global commercial use.

Hardware Footprint

6.5 / 10

Basic hardware requirements are available through Tencent Cloud documentation and community guides. The model's 56B active parameters suggest a high VRAM requirement (an estimated ~112 GB in FP16 for the active parameters alone, before counting the remaining experts, which typically must also be resident for serving), and the use of GQA and Mamba2 layers is explicitly noted as a strategy to reduce KV-cache memory footprint. However, official documentation lacks a comprehensive quantization-to-accuracy tradeoff table for consumer-grade hardware.
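The ~112 GB estimate can be reproduced with simple arithmetic: parameter count times bytes per parameter. The sketch below applies the same formula to the 56B active and 560B total parameter counts quoted in this report. It covers weights only (no KV cache, activations, or runtime overhead), and the INT8 and INT4 rows are hypothetical precisions included for comparison, not configurations documented by Tencent.

```python
ACTIVE_PARAMS = 56e9    # active parameters per token, per this report
TOTAL_PARAMS = 560e9    # total parameters, per this report

# Bytes per parameter; INT8 and INT4 are illustrative assumptions.
BYTES_PER_PARAM = {"fp16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(num_params, bytes_per_param):
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, width in BYTES_PER_PARAM.items():
        active = weight_memory_gb(ACTIVE_PARAMS, width)
        total = weight_memory_gb(TOTAL_PARAMS, width)
        print(f"{name}: active {active:.0f} GB, total {total:.0f} GB")
```

The active-parameter figure reflects per-token compute more than resident memory. Because any expert may be selected for the next token, a deployment generally has to keep all expert weights loaded, so the total-parameter figure is the more conservative sizing bound.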

Versioning Drift

5.5 / 10

Tencent uses date-based versioning (e.g., 20250416) and provides high-level changelogs during major updates (e.g., May 2025 upgrade). However, as a primarily API-driven model, silent updates and behavioral drift are difficult for users to track independently. There is no public, granular version history that allows users to pin specific weights for long-term consistency.
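One way a client could guard against silent drift is to insist on dated identifiers and record the snapshot date alongside its results. The sketch below parses identifiers of the form `Hunyuan-TurboS-20250416`, the format quoted in this report. Whether the API accepts exactly these strings is an assumption, and the helper names are hypothetical.

```python
from datetime import date, datetime


def parse_version(model_id):
    """Split 'Name-YYYYMMDD' into (name, date).

    Raises ValueError if the suffix is not a valid date stamp.
    """
    name, _, stamp = model_id.rpartition("-")
    return name, datetime.strptime(stamp, "%Y%m%d").date()


def is_dated(model_id):
    """True if the identifier ends in a valid YYYYMMDD stamp."""
    try:
        parse_version(model_id)
    except ValueError:
        return False
    return True


def newest(model_ids):
    """Return the identifier with the most recent date stamp."""
    return max(model_ids, key=lambda m: parse_version(m)[1])


if __name__ == "__main__":
    print(parse_version("Hunyuan-TurboS-20250416"))
    print(is_dated("Hunyuan-TurboS"))
```

Logging the parsed date with every evaluation run at least makes it possible to tell afterwards which snapshot produced a given result, even though it cannot prevent changes behind an undated alias.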

Hunyuan TurboS: Model Specifications and Details