
Kimi-Dev-72B

Parameters

72B

Context Length

131,072 tokens (128K)

Modality

Text

Architecture

Dense

License

MIT License

Release Date

16 Jun 2025

Knowledge Cutoff

-

Technical Specifications

Attention Structure

Grouped-Query Attention

Hidden Dimension Size

-

Number of Layers

-

Attention Heads

-

Key-Value Heads

-

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

Rotary Position Embedding (RoPE)

Kimi-Dev-72B

Kimi-Dev-72B is a specialized large language model developed by Moonshot AI, engineered for autonomous software engineering and complex issue resolution. Built upon the Qwen2.5-72B foundational architecture, the model undergoes a multi-stage training process designed to instill structured skill priors for software development tasks. This process includes a large-scale mid-training phase using approximately 150 billion tokens of high-quality, real-world data from GitHub issues and pull request commits, enabling the model to internalize the reasoning patterns and technical workflows employed by human developers. Unlike general-purpose coding assistants, Kimi-Dev-72B is optimized to operate as an autonomous agent that can localize the relevant files in a repository and apply precise code edits.

The model's core innovation is a two-stage framework built around specialized "BugFixer" and "TestWriter" roles. This design drives a two-step operational cycle: first, the model identifies the relevant files within a repository (file localization), and second, it generates the necessary code modifications or unit tests (code edits). The training methodology leverages large-scale reinforcement learning (RL) with outcome-based rewards: the model receives positive reinforcement only when its proposed patch passes the entire test suite within a containerized Docker environment. This rigorous verification loop ensures that generated solutions are functionally correct and adhere to production-grade standards.
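The outcome-based reward described above is all-or-nothing: a patch earns credit only when the full suite passes inside the container. A minimal sketch of that reward rule (names and types here are hypothetical illustrations, not Moonshot AI's training code):

```python
# Hedged sketch of an outcome-based reward: a candidate patch is rewarded
# only if *every* test in the suite passes. In the real pipeline the tests
# run inside a Docker container; here we assume results are already known.
from dataclasses import dataclass


@dataclass
class TestResult:
    name: str
    passed: bool


def outcome_reward(results: list[TestResult]) -> float:
    """Binary reward: 1.0 only when the entire test suite passes."""
    if not results:  # an empty suite yields no learning signal
        return 0.0
    return 1.0 if all(r.passed for r in results) else 0.0
```

Because the reward is binary at the suite level, a patch that fixes nine of ten failing tests scores the same as one that fixes none, which pushes the model toward complete, verifiable solutions.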

Kimi-Dev-72B is designed for seamless integration into modern software development lifecycles, supporting tasks such as automated bug fixing, unit test generation, and comprehensive code reviews. By employing a test-time self-play mechanism, the model iteratively refines its outputs, making it highly effective for resolving complex issues in large-scale codebases. Its dense 72-billion-parameter architecture provides a robust balance between reasoning capability and computational efficiency, while its 131,072-token context window allows it to maintain a deep understanding of extensive project structures and cross-file dependencies. The model is released under the MIT license, providing the community with open access to its weights and source code for further research and development.
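One common way such test-time self-play is realized is cross-validation between the two roles: candidate patches from the BugFixer are scored against unit tests produced by the TestWriter, and the best-scoring patch is kept. A minimal sketch under that assumption (the real system executes tests in containers rather than consulting a boolean matrix):

```python
# Hedged sketch of self-play patch selection: pass_matrix[i][j] records
# whether candidate patch i passes generated test j. The patch passing
# the most tests wins. Names are illustrative, not Moonshot AI's API.
def select_patch(pass_matrix: list[list[bool]]) -> int:
    """Return the index of the candidate patch with the most passing tests."""
    if not pass_matrix:
        raise ValueError("no candidate patches to select from")
    scores = [sum(row) for row in pass_matrix]
    return max(range(len(scores)), key=scores.__getitem__)
```

Iterating this loop (generate candidates, generate tests, select, retry on the survivors) is what lets the model refine its output without human feedback at inference time.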

About Kimi

Moonshot AI's Kimi model family, exemplified by Kimi K2, employs a Mixture-of-Experts architecture with one trillion total parameters. Designed for natural language generation and agentic capabilities, it features a 128K token context window. The models are open-weight and optimized with the Muon optimizer for stable training.


Other Kimi Models
  • No related models available

Evaluation Benchmarks

No evaluation benchmarks for Kimi-Dev-72B available.

Rankings

Overall Rank

-

Coding Rank

-

GPU Requirements

VRAM requirements scale with the chosen weight quantization and the context size (from 1K up to the full 128K tokens).
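A rough lower bound on serving memory comes from the weights alone: parameter count times bytes per parameter. The sketch below applies that rule of thumb to Kimi-Dev-72B at common quantization levels; it deliberately ignores KV cache and activations, which add substantially more at long context.

```python
# Back-of-the-envelope VRAM estimate for a dense model's weights only.
# KV cache and activation memory are excluded, so treat these figures
# as lower bounds, not vendor-validated requirements.
def weight_vram_gib(params_billion: float, bits_per_weight: float) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1024**3  # convert bytes to GiB

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_vram_gib(72, bits):.0f} GiB")
```

At 16-bit precision the 72B weights alone need roughly 134 GiB, which is why multi-GPU setups or 4-/8-bit quantization are the practical serving options for this model.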
