Kimi-Dev-72B

Parameters: 72B
Context Length: 131,072 tokens
Modality: Text
Architecture: Dense
License: MIT License
Release Date: 16 Jun 2025
Knowledge Cutoff: -

Technical Specifications

Attention Structure: Multi-Head Attention
Hidden Dimension Size: -
Number of Layers: -
Attention Heads: -
Key-Value Heads: -
Activation Function: SwiGLU
Normalization: -
Position Embedding: Absolute Position Embedding

System Requirements

VRAM requirements for different quantization methods and context sizes
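
As a rough rule of thumb, the memory needed for the weights alone scales with the parameter count times the bits used per weight. The sketch below illustrates that arithmetic for a 72B dense model; it is only an approximation and ignores the KV cache, activations, and runtime overhead that a full calculator would also account for.

```python
def estimate_weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate VRAM needed just to hold the model weights, in GB.

    Rule of thumb: bytes = parameters * bits_per_weight / 8.
    KV cache, activations, and runtime overhead are not included.
    """
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


# Illustrative estimates for a 72B dense model at common quantization levels.
for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{estimate_weight_vram_gb(72, bits):.0f} GB for weights alone")
# Prints roughly: FP16 ~144 GB, INT8 ~72 GB, INT4 ~36 GB
```

Longer contexts add KV-cache memory on top of these figures, which is why context size matters alongside the quantization method.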

Kimi-Dev-72B

Kimi-Dev-72B is a specialized large language model developed by Moonshot AI, designed for advanced software engineering tasks. This 72-billion-parameter model focuses on automating and assisting the software development lifecycle, including bug fixing, code generation, and unit test creation. Its primary objective is to enhance developer productivity by streamlining repetitive coding tasks and improving the reliability of generated code. The model accepts natural language prompts and coding-related queries through a standard chat interface.

The model's architecture is transformer-based, building upon the Qwen 2.5-72B foundational model. Its optimization involves large-scale reinforcement learning (RL), employing a dataset of approximately 150 billion tokens derived from high-quality, real-world data, including GitHub issues and pull request commits. A notable innovation in its design is the "BugFixer" and "TestWriter" duo, which facilitates a two-stage process: initial file localization followed by precise code editing. The training methodology emphasizes outcome-based rewards, where the model is rewarded only upon successful resolution of issues that pass comprehensive test suites within Docker environments. This approach ensures the generation of robust and verifiable solutions. Furthermore, Kimi-Dev-72B incorporates a test-time self-play mechanism to iteratively refine its outputs.
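
As an illustration of the outcome-based reward, the sketch below treats a candidate patch as successful only when the repository's full test suite passes inside a Docker container. This is a minimal, hypothetical rendering of the idea rather than Moonshot AI's actual training harness; the function name, container invocation, and pytest entry point are assumptions for the example.

```python
import subprocess


def outcome_reward(repo_dir: str, image: str, timeout_s: int = 1800) -> float:
    """Hypothetical outcome-based reward: 1.0 only if the patched repository's
    full test suite passes inside a Docker container, otherwise 0.0.

    `image` is assumed to already contain the project's dependencies
    (including pytest); building such images is outside this sketch.
    """
    result = subprocess.run(
        [
            "docker", "run", "--rm",
            "-v", f"{repo_dir}:/workspace",   # mount the patched repository
            "-w", "/workspace",
            image,
            "python", "-m", "pytest", "-q",   # run the complete test suite
        ],
        capture_output=True,
        timeout=timeout_s,
    )
    # No partial credit: the reward is granted only when the whole suite passes.
    return 1.0 if result.returncode == 0 else 0.0
```

Training against a binary, end-to-end signal like this favors verifiable fixes over edits that merely look plausible.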

Kimi-Dev-72B demonstrates proficiency in autonomous code patching within Docker environments, verifying solutions against complete test suites. This characteristic makes it suitable for integration into continuous integration and continuous deployment (CI/CD) pipelines and other production-oriented development workflows. Its use cases extend to automated code review, the implementation of new features, and the generation of technical documentation. The model is capable of producing well-structured, functional code that adheres to established best practices, including the inclusion of type hints and docstrings. Kimi-Dev-72B is available for download and deployment on Hugging Face and GitHub.
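
For local experimentation, the model can be driven through the standard Hugging Face transformers chat-template workflow. The snippet below is a minimal sketch, assuming the repository id moonshotai/Kimi-Dev-72B and sufficient GPU memory for the chosen precision; it is not an official usage example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "moonshotai/Kimi-Dev-72B"  # assumed Hugging Face repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # shard across available GPUs
)

messages = [
    {"role": "system", "content": "You are a careful software engineering assistant."},
    {"role": "user", "content": "Write a Python function that parses an ISO 8601 "
                                "date string. Include type hints and a docstring."},
]

# Build the prompt with the model's chat template, then generate a completion.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```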

About Kimi

Moonshot AI's Kimi model family, exemplified by Kimi K2, employs a Mixture-of-Experts architecture with one trillion total parameters. Designed for natural language generation and agentic capabilities, it features a 128K token context window. The models are open-weight and optimized with the Muon optimizer for stable training.


Other Kimi Models
  • No related models

Evaluation Benchmarks

Rankings apply to local LLMs.

No evaluation benchmarks are available for Kimi-Dev-72B.

Rankings

Ranking: -
Coding Ranking: -
