Parameters
70B
Context Length
32K
Modality
Text
Architecture
Dense
License
-
Release Date
22 Aug 2025
Knowledge Cutoff
Dec 2024
Attention
Attention Structure
Multi-Head Attention
Attention Heads
-
Key-Value Heads
-
Attention Head Dimension
-
Position Embedding
Absolute Position Embedding
RoPE Theta
-
Sliding Window Attention
-
Sliding Window Size
-
Normalization
RMS Normalization
Activation Function
SwiGLU
Dimensions
Hidden Dimension Size
-
Number of Layers
128
FFN Intermediate Size (Dense)
-
Multi-Token Prediction Heads
-
Tokenizer
Vocabulary Size
-
Tencent Hunyuan T1 is a high-performance reasoning model engineered for deep analytical tasks, logical problem-solving, and advanced scientific inquiry. It serves as the primary 'slow-thinking' reasoning engine within the Hunyuan ecosystem, designed to compete with state-of-the-art models by prioritizing structured logic and long-form consistency. The model is built upon the TurboS base, which represents a significant architectural shift toward integrating state-space models into large-scale production environments for enhanced computational efficiency.
The technical foundation of Hunyuan T1 is a Hybrid-Transformer-Mamba Mixture of Experts (MoE) architecture. This design combines Transformer blocks, which provide global contextual awareness, with Mamba-2 state-space layers, which scale linearly with sequence length and use less memory for sequence modeling. The model uses 16 experts with dynamic routing, so that approximately 52 billion parameters are active per token. The hybrid approach is intended to mitigate the quadratic complexity of traditional attention mechanisms, allowing the model to handle context lengths of up to 256,000 tokens while decoding approximately twice as fast as comparable dense Transformer models.
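The scaling argument above can be illustrated with a rough sketch. It counts only how work grows with sequence length and ignores all constant factors, hidden sizes, and the actual mix of layers, none of which are given here. The function names `attention_cost` and `ssm_cost` are hypothetical, and the 32K and 256,000-token lengths are the two context figures quoted on this page.

```python
def attention_cost(seq_len: int) -> int:
    """Pairwise token interactions in full self-attention: grows as n^2."""
    return seq_len * seq_len


def ssm_cost(seq_len: int) -> int:
    """Sequential state updates in a state-space layer: grows as n."""
    return seq_len


short_ctx, long_ctx = 32_000, 256_000      # context lengths quoted on this page
length_growth = long_ctx // short_ctx      # how much longer the input gets

attn_growth = attention_cost(long_ctx) // attention_cost(short_ctx)
ssm_growth = ssm_cost(long_ctx) // ssm_cost(short_ctx)

print(f"{length_growth}x longer context: "
      f"attention work x{attn_growth}, state-space work x{ssm_growth}")
```

Under these simplifications, an 8x longer input multiplies attention work by 64 but state-space work only by 8, which is the motivation for replacing part of the attention stack with Mamba-2 layers.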
Operationally, Hunyuan T1 is optimized through a post-training regimen that heavily emphasizes large-scale reinforcement learning, which accounts for 96.7% of post-training compute. It employs curriculum learning to incrementally scale reasoning complexity and uses Cross-Layer Attention (CLA), which shares key-value caches across adjacent layers, to further reduce memory overhead during inference. These techniques make it well suited to enterprise-level tasks such as complex code generation, mathematical theorem proving, and multi-step logical deduction, where high precision and reduced context loss are paramount.
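To make the memory effect of cross-layer key-value sharing concrete, here is a minimal sketch of a KV-cache size estimate. The layer count, head count, and head dimension below are hypothetical placeholders, since this page does not list them for the attention layers. The helper `kv_cache_bytes` and its `share_factor` parameter are likewise illustrative names, not part of any Hunyuan API.

```python
import math


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int, seq_len: int,
                   bytes_per_value: int = 2, share_factor: int = 1) -> int:
    """Bytes needed to cache keys and values for one sequence.

    share_factor is the number of adjacent layers that reuse one KV cache
    (1 means no cross-layer sharing).
    """
    kv_layers = math.ceil(layers / share_factor)  # layers that store their own K/V
    # The leading 2 accounts for storing both keys and values.
    return 2 * kv_layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical shape values, chosen only for illustration.
config = dict(layers=64, kv_heads=8, head_dim=128, seq_len=256_000)

baseline = kv_cache_bytes(**config)
shared = kv_cache_bytes(**config, share_factor=2)
print(f"no sharing: {baseline / 1e9:.1f} GB, "
      f"sharing across 2 layers: {shared / 1e9:.1f} GB")
```

Whatever the real dimensions are, sharing one cache between every two attention layers halves this term, and the saving grows with context length.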
Tencent Hunyuan large language models with various capabilities.
Rank
#30
| Benchmark | Score | Rank |
|---|---|---|
| Web Development: WebDev Arena | 1387 | 27 |
Overall Rank
#30
Coding Rank
#41
Total Score
62 / 100
Hunyuan T1 exhibits a strong technical foundation with significant transparency regarding its hybrid MoE-Mamba architecture and parameter density. However, the profile is weakened by a restrictive custom license and a lack of granular detail concerning training data composition and total compute resources. While benchmark performance is well-documented, the reliance on proprietary evaluation frameworks limits full independent reproducibility.
Architectural Provenance
Tencent provides a detailed technical description of the 'TurboS' base architecture, identifying it as a Hybrid-Transformer-Mamba Mixture of Experts (MoE) model. Documentation specifies the integration of Mamba-2 state-space layers with Transformer blocks to achieve linear scaling for its 256,000 token context window. While the high-level design is well-documented in technical reports and blog posts, specific layer-by-layer configurations and the exact 'TurboS' pre-training recipe remain partially proprietary.
Dataset Composition
Disclosure is limited to general categories such as mathematics, logic, coding, and PhD-level scientific problems. While Tencent mentions a 'curriculum learning' approach and the use of 'ground-truth feedback' for reinforcement learning, it lacks a granular percentage breakdown of the training corpus. There is no public access to sample data or a detailed audit of the filtering and cleaning methodologies used for the T1 variant specifically.
Tokenizer Integrity
The model utilizes a tokenizer based on the tiktoken framework with a vocabulary of approximately 129,000 tokens (100K standard plus 29K Chinese-specific tokens). Technical documentation provides compression rate comparisons (3.13 characters/token) and confirms alignment with the model's multilingual capabilities. The tokenizer code is accessible via official GitHub repositories for the Hunyuan family, allowing for verification of its implementation.
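The figures above support a quick back-of-the-envelope estimate. The sketch below assumes the quoted 3.13 characters/token rate holds for the text being measured, although real compression varies by language and content. It does not call the actual tokenizer, and the names `estimate_tokens` and `estimate_chars` are hypothetical.

```python
BASE_VOCAB = 100_000         # standard tokens, as quoted above
CHINESE_VOCAB = 29_000       # Chinese-specific tokens, as quoted above
VOCAB_SIZE = BASE_VOCAB + CHINESE_VOCAB

CHARS_PER_TOKEN = 3.13       # quoted compression rate; varies with the text


def estimate_tokens(num_chars: int) -> int:
    """Rough token count for a text of num_chars characters."""
    return round(num_chars / CHARS_PER_TOKEN)


def estimate_chars(num_tokens: int) -> int:
    """Rough character capacity of a budget of num_tokens tokens."""
    return round(num_tokens * CHARS_PER_TOKEN)


print(f"vocabulary size: {VOCAB_SIZE}")
print(f"a 256,000-token context holds roughly {estimate_chars(256_000)} characters")
```

For an exact count, the text would need to be encoded with the tokenizer from the official repositories rather than estimated this way.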
Parameter Density
Tencent is transparent about the model's sparse architecture, disclosing a total parameter count of approximately 389 billion with 52 billion active parameters per token. The use of 16 specialized experts and 1 shared expert is clearly documented. However, the '70B' designation in some marketing materials can be confusing because it matches neither the 389B total nor the 52B active count; the technical reports clarify the distinction.
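As a small worked example of what these two counts imply, the sketch below computes the share of parameters active per token. It uses only the approximate figures quoted above, and the variable names are illustrative.

```python
TOTAL_PARAMS_B = 389    # approximate total parameters, in billions
ACTIVE_PARAMS_B = 52    # approximate active parameters per token, in billions

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"active per token: {active_fraction:.1%} of all parameters")
```

Roughly one parameter in seven or eight participates in each forward pass, which is why per-token compute resembles a much smaller dense model even though every expert still has to be stored.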
Training Compute
Information is sparse regarding the total compute budget. While Tencent discloses that 96.7% of post-training compute was dedicated to reinforcement learning, it does not provide the absolute number of GPU/TPU hours, specific hardware cluster sizes, or the total carbon footprint. This lack of absolute metrics makes it impossible to independently verify the environmental impact or total resource investment.
Benchmark Reproducibility
Tencent provides scores for standard benchmarks (MMLU-Pro: 87.2, MATH-500: 96.2, GPQA-Diamond: 69.3) and has released 'AutoCodeBench' to the community. However, the exact prompts and full evaluation code for the T1 reasoning chains are not fully public, and some results rely on internal human evaluation datasets that cannot be independently verified.
Identity Consistency
The model demonstrates strong identity consistency, correctly identifying itself as part of the Hunyuan T1 family and distinguishing its 'slow-thinking' reasoning capabilities from the 'fast-thinking' TurboS base. There are no documented instances of the model claiming a competitor's identity or misrepresenting its origin during standard interactions.
License Clarity
The model is governed by the 'Tencent Hunyuan Community License Agreement,' which is a custom license rather than a standard open-source license like Apache 2.0. It includes significant geographic restrictions (excluding the UK, EU, and South Korea) and specific terms regarding 'Model Derivatives' and commercial use. The lack of a standard OSI-approved license creates ambiguity for global developers.
Hardware Footprint
Tencent provides general guidance on VRAM requirements, noting that the model is optimized for efficiency and decodes roughly twice as fast as comparable dense Transformers. Third-party documentation and community guides provide VRAM estimates for various quantization levels (e.g., FP8, INT4), but official, comprehensive hardware requirement tables for the full T1 reasoning deployment are not collected in a single technical model card.
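In the absence of an official table, a first-order estimate of weight memory can be sketched from the parameter count alone. The example below assumes the roughly 389 billion total parameters cited in this profile and counts weights only. It ignores the KV cache, activations, and runtime overhead, so real requirements will be higher. The helper `weight_memory_gb` is a hypothetical name.

```python
TOTAL_PARAMS = 389e9   # total parameter count cited in this profile (approximate)

# Storage per parameter at each precision, in bytes.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, precision: str) -> float:
    """Memory for model weights alone, in decimal gigabytes."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(TOTAL_PARAMS, precision):.1f} GB")
```

In an MoE model every expert must be resident in memory even though only a subset is active per token, so the total count rather than the active count drives this estimate.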
Versioning Drift
Tencent maintains a versioning history (e.g., T1-Preview to T1 Official), but the changelogs are primarily high-level marketing summaries rather than detailed technical diffs. There is no public mechanism to pin specific sub-versions or track subtle behavioral drift resulting from the continuous reinforcement learning updates mentioned in official communications.