Kimi-Dev-72B: Specifications and GPU VRAM Requirements

Kimi-Dev-72B

开源

开放权重

参数

72B

上下文长度

131.072K

模态

Text

架构

Dense

许可证

MIT License

发布日期

16 Jun 2025

训练数据截止日期

技术规格

注意力结构

Multi-Head Attention

隐藏维度大小

层数

注意力头

键值头

激活函数

SwigLU

归一化

位置嵌入

Absolute Position Embedding

系统要求

不同量化方法和上下文大小的显存要求

Kimi-Dev-72B

Kimi-Dev-72B is a specialized large language model developed by Moonshot AI, explicitly designed for advanced software engineering tasks. This 72-billion-parameter model focuses on automating and assisting in the software development lifecycle, encompassing capabilities such as bug fixing, code generation, and the creation of unit tests. Its primary objective is to enhance developer productivity by streamlining repetitive coding tasks and improving the reliability of generated code. The model accepts natural language prompts and coding-related queries through a standard chat interface.

The model's architecture is transformer-based, building upon the Qwen 2.5-72B foundational model. Its optimization involves large-scale reinforcement learning (RL), employing a dataset of approximately 150 billion tokens derived from high-quality, real-world data, including GitHub issues and pull request commits. A notable innovation in its design is the "BugFixer" and "TestWriter" duo, which facilitates a two-stage process: initial file localization followed by precise code editing. The training methodology emphasizes outcome-based rewards, where the model is rewarded only upon successful resolution of issues that pass comprehensive test suites within Docker environments. This approach ensures the generation of robust and verifiable solutions. Furthermore, Kimi-Dev-72B incorporates a test-time self-play mechanism to iteratively refine its outputs.

Kimi-Dev-72B demonstrates proficiency in autonomous code patching within Docker environments, verifying solutions against complete test suites. This characteristic makes it suitable for integration into continuous integration and continuous deployment (CI/CD) pipelines and other production-oriented development workflows. Its use cases extend to automated code review, the implementation of new features, and the generation of technical documentation. The model is capable of producing well-structured, functional code that adheres to established best practices, including the inclusion of type hints and docstrings. Kimi-Dev-72B is available for download and deployment on Hugging Face and GitHub.

关于 Kimi

Moonshot AI's Kimi model family, exemplified by Kimi K2, employs a Mixture-of-Experts architecture with one trillion total parameters. Designed for natural language generation and agentic capabilities, it features a 128K token context window. The models are open-weight and optimized with the Muon optimizer for stable training.

其他 Kimi 模型

没有相关模型

评估基准

排名适用于本地LLM。

没有可用的 Kimi-Dev-72B 评估基准。

排名

编程排名

GPU 要求

完整计算器

量化

选择模型权重的量化方法

上下文大小：1024 个令牌

64k

128k

所需显存:

资源

官方文档发布说明下载权重源代码

Kimi-Dev-72B

技术规格

系统要求

Kimi-Dev-72B

关于 Kimi

其他 Kimi 模型

评估基准

排名

GPU 要求

所需显存:

推荐 GPU

资源