
Kimi-Dev-72B

Parameters: 72B
Context Length: 131,072 tokens
Modality: Text
Architecture: Dense
License: MIT License
Release Date: 16 Jun 2025
Training Data Cutoff: -

Technical Specifications

Attention Structure: Multi-Head Attention
Hidden Dimension Size: -
Number of Layers: -
Attention Heads: -
Key-Value Heads: -
Activation Function: SwiGLU
Normalization: RMS Normalization

Position Embedding: Rotary Position Embedding (RoPE)

Kimi-Dev-72B

Kimi-Dev-72B is a specialized large language model developed by Moonshot AI, engineered specifically for autonomous software engineering and complex issue resolution. Built upon the Qwen2.5-72B foundational architecture, the model undergoes a multi-stage training process designed to instill structured skill priors for software development tasks. This process includes a large-scale mid-training phase using approximately 150 billion tokens of high-quality, real-world data from GitHub issues and pull request commits, enabling the model to internalize the reasoning patterns and technical workflows employed by human developers. Unlike general-purpose coding assistants, Kimi-Dev-72B is optimized to function as an autonomous agent capable of repository-level file localization and precise code editing.

The model's core innovation lies in its two-role framework, comprising specialized "BugFixer" and "TestWriter" behaviors. This architecture facilitates a two-step operational cycle: first, the model identifies the relevant files within a repository (File Localization), and second, it generates the necessary code modifications or unit tests (Code Edits). The training methodology leverages large-scale reinforcement learning (RL) with outcome-based rewards, where the model receives positive reinforcement only when its proposed patches successfully pass the entire test suite within a containerized Docker environment. This rigorous verification loop ensures that the generated solutions are functionally correct and adhere to production-grade standards.
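The all-or-nothing outcome reward described above can be sketched as a simple verification loop: apply the candidate patch, run the full test suite, and grant reward 1.0 only if everything passes. This is an illustrative sketch, not Moonshot AI's actual training harness; the command strings and function name are assumptions, and a real setup would execute inside a Docker container.

```python
import subprocess

def outcome_reward(patch_cmd: str, test_cmd: str, timeout: int = 600) -> float:
    """Outcome-based reward sketch: 1.0 only if the patch applies
    cleanly AND the entire test suite passes; 0.0 otherwise.

    `patch_cmd` and `test_cmd` are placeholder shell commands
    (e.g. `git apply fix.patch` and `pytest -q` run inside a sandbox).
    """
    try:
        # Step 1: apply the candidate patch; failure means zero reward.
        applied = subprocess.run(patch_cmd, shell=True, timeout=timeout)
        if applied.returncode != 0:
            return 0.0
        # Step 2: run the whole suite; any failing test means zero reward.
        tests = subprocess.run(test_cmd, shell=True, timeout=timeout)
        return 1.0 if tests.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        # Hanging suites are treated as failures.
        return 0.0
```

The sparse, binary signal is deliberate: partial credit for patches that pass only some tests would reward superficially plausible but incorrect edits.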

Kimi-Dev-72B is designed for seamless integration into modern software development lifecycles, supporting tasks such as automated bug fixing, unit test generation, and comprehensive code reviews. By employing a test-time self-play mechanism, the model iteratively refines its outputs, making it highly effective for resolving complex issues in large-scale codebases. Its dense 72-billion-parameter architecture provides a robust balance between reasoning capability and computational efficiency, while its 131,072-token context window allows it to maintain a deep understanding of extensive project structures and cross-file dependencies. The model is released under the MIT license, providing the community with open access to its weights and source code for further research and development.
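The test-time self-play mechanism can be understood as cross-verification between the two roles: BugFixer proposes candidate patches, TestWriter proposes unit tests, and the patch that passes the most generated tests is selected. The sketch below is an assumed formulation of that selection step; the `passes` verifier callable is a placeholder for sandboxed test execution, not an actual API.

```python
def select_patch(patches, tests, passes):
    """Cross-verify candidate patches against generated unit tests.

    patches -- candidate code edits from the BugFixer role
    tests   -- unit tests from the TestWriter role
    passes  -- assumed callable passes(patch, test) -> bool, e.g.
               backed by execution in a Docker sandbox

    Returns the patch that passes the largest number of tests.
    """
    return max(patches, key=lambda p: sum(passes(p, t) for t in tests))
```

Ranking by agreement between independently generated patches and tests lets the model refine its answer at inference time without any ground-truth labels.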

About Kimi

Moonshot AI's Kimi model family, exemplified by Kimi K2, employs a Mixture-of-Experts architecture with one trillion total parameters. Designed for natural language generation and agentic capabilities, it features a 128K token context window. The models are open-weight and optimized with the Muon optimizer for stable training.


Other Kimi Models
  • No related models

Evaluation Benchmarks

No evaluation benchmarks available for Kimi-Dev-72B.

Rankings

Overall Ranking: -
Coding Ranking: -

Model Transparency

Overall Score: B (67 / 100)
