Kimi K2-Instruct: Specifications and GPU VRAM Requirements

Kimi K2-Instruct

Open source

Open weights

Active parameters

32B

Context length

128K

Modality

Text

Architecture

Mixture of Experts (MoE)

License

Modified MIT License

Release date

11 Jul 2025

Knowledge cutoff

Technical specifications

Total expert parameters

32.0B

Number of experts

384

Active experts

8

Attention structure

Multi-head Latent Attention (MLA)

Hidden dimension size

7168

Layers

61

Attention heads

64

Key-value heads

Activation function

SwiGLU

Normalization

Position embedding

RoPE

System requirements

VRAM requirements for different quantization methods and context sizes

Kimi K2-Instruct

Kimi K2-Instruct is a Mixture-of-Experts (MoE) language model developed by Moonshot AI. It has 1 trillion total parameters, of which approximately 32 billion are activated for each token. The model targets agentic use: tool use, code generation, and autonomous multi-step problem solving. As the post-trained, instruction-following variant of Kimi K2, it is tuned for general-purpose chat and agentic workflows. Moonshot AI describes it as a reflex-grade model, meaning it answers directly without an extended reasoning phase.
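To put the two parameter counts in proportion, the short sketch below computes the share of weights used for a single token. It relies only on the rounded figures quoted in this section (1 trillion total, about 32 billion active); the function and constant names are illustrative.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Return the share of parameters used for a single token."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


TOTAL_PARAMS = 1e12   # about 1 trillion, rounded figure from the text
ACTIVE_PARAMS = 32e9  # about 32 billion, rounded figure from the text

fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active share per token: {fraction:.1%}")
```

With these rounded inputs, roughly 3% of the weights take part in computing any one token, which is what makes the model sparse.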

Kimi K2-Instruct uses a Mixture-of-Experts architecture with 384 experts, of which 8 are selected per token during inference. The model has 61 layers and uses Multi-head Latent Attention (MLA) with 64 attention heads. It was trained on 15.5 trillion tokens with the MuonClip optimizer, which Moonshot AI developed to keep training stable at that scale. The context window is 128,000 tokens. The activation function is SwiGLU, and positions are encoded with Rotary Position Embeddings (RoPE).
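The per-token expert selection described above can be sketched as top-k routing. This is a minimal illustration, not Moonshot AI's implementation: the expert counts (384 and 8) come from the text, while the random router logits, the softmax over the selected experts, and the name `route_token` are assumptions made for the example.

```python
import math
import random

NUM_EXPERTS = 384  # experts available, from the text
TOP_K = 8          # experts selected per token, from the text


def route_token(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and normalize their weights.

    Assumption: a softmax over the selected logits. The gating function
    actually used by Kimi K2 may differ.
    """
    if len(router_logits) < k:
        raise ValueError("need at least k router logits")
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected experts.
    peak = router_logits[top[0]]
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


rng = random.Random(0)
# Stand-in for the router's output for one token (hypothetical values).
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)
for expert_id, weight in selected:
    print(f"expert {expert_id:3d}  weight {weight:.3f}")
```

Only the 8 selected experts run for the token, and their outputs would be combined using the printed weights; the other 376 experts are skipped for that token.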

Kimi K2-Instruct is intended for multi-step reasoning and analytical workflows, for code generation ranging from short scripts to larger software development and debugging tasks, and for multilingual applications. Its tool-calling support lets it interpret a user's request and call external tools and APIs to complete it. Practical use cases include automating development workflows, generating data analysis reports, and interactive task planning that combines several external services.
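On the application side, tool calling usually means that the model emits a structured request and the host program executes it. The sketch below shows one way that dispatch step might look. It assumes an OpenAI-style tool-call message layout, which this page does not specify; the `get_weather` tool, the `call_0` id, and the example call are hypothetical and written by hand rather than produced by the model.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical tool: returns canned data instead of calling a real API."""
    return {"city": city, "forecast": "sunny"}


# Registry mapping tool names to the callables that implement them.
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(tool_call: dict) -> dict:
    """Run one tool call and wrap the result as a 'tool' role message."""
    name = tool_call["function"]["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    # In this assumed layout, arguments arrive as a JSON-encoded string.
    arguments = json.loads(tool_call["function"]["arguments"])
    result = TOOLS[name](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }


# Hand-written stand-in for a tool call emitted by the model.
example_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
}
reply = dispatch_tool_call(example_call)
print(reply["content"])
```

In a full agent loop the returned message would be appended to the conversation and sent back to the model, which then either calls another tool or writes its final answer.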

About Kimi K2

Moonshot AI's Kimi K2 is a Mixture-of-Experts model featuring one trillion total parameters, activating 32 billion per token. Designed for agentic intelligence, it utilizes a sparse architecture with 384 experts and the MuonClip optimizer for training stability, supporting a 128K token context window.

Evaluation benchmarks

Rankings are relative to other local LLMs.

Rankings

Category	Benchmark	Score	Rank
QA Assistant	ProLLM QA Assistant	0.98	1
Coding	LiveBench Coding	0.72	3
Graduate-Level QA	GPQA	0.75	3
Agentic Coding	LiveBench Agentic	0.20	4
Professional Knowledge	MMLU Pro	0.81	4
General Knowledge	MMLU	0.75	7
Mathematics	LiveBench Mathematics	0.74	8
Reasoning	LiveBench Reasoning	0.63	10
Data Analysis	LiveBench Data Analysis	0.63	11

Overall rank

#2

Coding rank

#12

GPU requirements

Required VRAM depends on the quantization method used for the model weights and on the context size (1,024 to 125k tokens).
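As a rough guide to how quantization affects memory, the sketch below estimates the space needed for the weights alone at a few bit widths. It assumes the rounded 1 trillion parameter count from the description, uses the total rather than the active count because any expert may be selected for a given token, and ignores the KV cache, activations, and quantization metadata, so real requirements are higher. The bit widths and the function name are illustrative.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Memory needed to hold the weights alone, in gigabytes (10**9 bytes)."""
    if num_params < 0 or bits_per_weight <= 0:
        raise ValueError("num_params must be >= 0 and bits_per_weight > 0")
    return num_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 1e12  # about 1 trillion parameters (rounded)

# Illustrative bit widths; real quantization formats add per-block overhead.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(TOTAL_PARAMS, bits):,.0f} GB")
```

Under these assumptions the weights need on the order of 2 TB at 16 bits, 1 TB at 8 bits, and 0.5 TB at 4 bits, before any memory for context is added.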
