Kimi K2.5: Specifications and GPU VRAM Requirements

Kimi K2.5

Open source

Open weights

Active parameters

Context length

256K

Modality

Text

Architecture

Mixture of Experts (MoE)

License

Modified MIT License

Release date

5 Feb 2026

Training data cutoff

Oct 2025

Technical Specifications

Total expert parameters

968.0B

Number of experts

384

Active experts

Attention structure

Multi-head Latent Attention (MLA)

Hidden dimension size

7168

Number of layers

Attention heads

Key-value heads

Activation function

SwiGLU

Normalization

RMS Normalization

Position embedding

Absolute Position Embedding

Kimi K2.5

Kimi K2.5 is a Mixture-of-Experts (MoE) large language model developed by Moonshot AI for complex reasoning and multimodal tasks. It has about 1 trillion total parameters but activates only 32 billion per forward pass (roughly 3.2% of the total), which keeps per-token compute low while retaining a large representational capacity. The model is natively multimodal: its vision and language components are co-trained from the start of pre-training on approximately 15 trillion tokens, so visual and textual inputs are processed by a single model.
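The sparse activation described above can be illustrated with a generic top-k router. This is a minimal sketch of the general MoE routing idea, not Moonshot AI's implementation: the function names are hypothetical, the router scores are random, and `TOP_K = 8` is an assumed value chosen only for illustration (this page does not state how many experts are active per token).

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized over the selected experts only, so they sum to 1.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


NUM_EXPERTS = 384  # expert count listed on this page
TOP_K = 8          # ASSUMPTION: illustrative value, not taken from this page

# Random router scores stand in for a learned gating network.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_top_k(logits, TOP_K)

# Parameter counts quoted in the description.
TOTAL_PARAMS = 1_000_000_000_000
ACTIVE_PARAMS = 32_000_000_000
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"experts used for this token: {len(selected)} of {NUM_EXPERTS}")
print(f"active parameter fraction: {active_fraction:.1%}")  # 3.2%
```

Only the selected experts run for a given token, which is why the active parameter count stays far below the total.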

Architecturally, Kimi K2.5 combines Multi-head Latent Attention (MLA) with a 384-expert MoE structure. The attention mechanism is optimized for high-throughput inference and long-context use, supporting context windows of up to 256,000 tokens. The model also introduces an "Agent Swarm" paradigm, a self-directed multi-agent orchestration system trained with Parallel Agent Reinforcement Learning (PARL). With it, the model decomposes a complex objective into independent sub-tasks that are executed by up to 100 parallel sub-agents, which reduces serial execution latency in tool-heavy workflows.
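The latency benefit of running independent sub-tasks in parallel can be sketched with a plain thread pool. This is only an illustration of the fan-out pattern, not the Agent Swarm or PARL implementation: `run_sub_agent` and `run_swarm` are hypothetical names, and each "sub-agent" simply sleeps to mimic a tool call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_SUB_AGENTS = 100   # upper bound quoted in the text
TOOL_CALL_SECONDS = 0.05  # ASSUMPTION: stand-in latency for one tool call


def run_sub_agent(task):
    """Hypothetical sub-agent: sleeps to mimic a tool call, then reports back."""
    time.sleep(TOOL_CALL_SECONDS)
    return f"done: {task}"


def run_swarm(tasks, max_workers=MAX_SUB_AGENTS):
    """Run independent sub-tasks concurrently and return results in task order."""
    if not tasks:
        return []
    workers = min(max_workers, len(tasks))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_sub_agent, tasks))


tasks = [f"sub-task {i}" for i in range(20)]

start = time.perf_counter()
results = run_swarm(tasks)
parallel_seconds = time.perf_counter() - start

# Running the same tasks one after another would take at least this long.
serial_seconds = TOOL_CALL_SECONDS * len(tasks)

print(f"parallel: {parallel_seconds:.2f}s, serial lower bound: {serial_seconds:.2f}s")
```

When the sub-tasks are truly independent, wall-clock time approaches the duration of the slowest sub-task rather than the sum of all of them.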

In practice, Kimi K2.5 is used for advanced coding, document synthesis, and automated reasoning. It offers four operational modes (Instant, Thinking, Agent, and Agent Swarm) that let users trade response speed against reasoning depth. Its native visual coding capabilities translate UI designs and video workflows directly into functional code, and its long context window supports the analysis of large codebases and lengthy technical documentation. Training stability at the trillion-parameter scale is attributed to the MuonClip optimizer, which mitigates the loss spikes common in sparse architectures.

About Kimi K2

Moonshot AI's Kimi K2 is a Mixture-of-Experts model featuring one trillion total parameters, activating 32 billion per token. Designed for agentic intelligence, it utilizes a sparse architecture with 384 experts and the MuonClip optimizer for training stability, supporting a 128K token context window.

Other Kimi K2 Models

Benchmarks

Rankings

Category	Benchmark	Score	Rank
Mathematics	LiveBench Mathematics	0.85	7
Reasoning	LiveBench Reasoning	0.76	11

Rank

#2 🥈

Coding Rank

Model Transparency

Total score

69 / 100

Upstream

22.0 / 30

Model

26.5 / 40

Downstream

20.5 / 30

GPU Requirements

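As a rough guide, the VRAM needed just to hold the model weights follows from the parameter count and the bits stored per weight. The sketch below is a back-of-the-envelope estimate under stated assumptions: it uses the approximate 1 trillion total parameters from the description, counts decimal gigabytes, and ignores the KV cache, activations, and runtime overhead, so real requirements are higher. The 80 GB per-GPU capacity is a hypothetical figure used only for illustration.

```python
import math

TOTAL_PARAMS = 1_000_000_000_000  # ~1 trillion, as quoted in the description
GPU_MEMORY_GB = 80                # ASSUMPTION: hypothetical per-GPU capacity


def weight_memory_gb(num_params, bits_per_weight):
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def min_gpus_for_weights(memory_gb, gpu_memory_gb=GPU_MEMORY_GB):
    """Lower bound on GPU count: weights only, no KV cache or overhead."""
    return math.ceil(memory_gb / gpu_memory_gb)


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(TOTAL_PARAMS, bits)
    print(f"{label}: {gb:,.0f} GB of weights, "
          f"at least {min_gpus_for_weights(gb)} x {GPU_MEMORY_GB} GB GPUs")
```

Although only 32 billion parameters are active per token, all experts must stay resident in memory, so the estimate uses the total parameter count rather than the active one.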

Resources

Official documentation, release notes, weights download, source code
