Active Parameters: 358B
Context Length: 200K
Modality: Text
Architecture: Mixture of Experts (MoE)
License: MIT
Release Date: 8 Jan 2026
Knowledge Cutoff: Sep 2024
Total Expert Parameters: -
Number of Experts: -
Active Experts: -
Attention Structure: Multi-Head Attention
Hidden Dimension Size: -
Number of Layers: -
Attention Heads: -
Key-Value Heads: -
Activation Function: -
Normalization: -
Position Embedding: Absolute Position Embedding
VRAM requirements depend on the quantization method chosen for the model weights and on the context size held in the KV cache.
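As a back-of-the-envelope guide, the minimal sketch below estimates that footprint. It uses the 358B parameter figure listed above and made-up values for the unpublished architectural details (layers, KV heads, head dimension, listed as "-" above); note that an MoE model keeps its total parameters resident, not only the active subset, so treat the output as illustrative only.

```python
# Rough VRAM estimate: quantized weights plus a KV cache for the context.
# All architectural values below are placeholders; this page lists them as "-".

def estimate_vram_gib(params_b: float, bits_per_weight: float, context_tokens: int,
                      num_layers: int, kv_heads: int, head_dim: int,
                      kv_bits: float = 16) -> float:
    weight_bytes = params_b * 1e9 * bits_per_weight / 8
    # Keys and values (factor of 2), per layer, per KV head, per token.
    kv_bytes = 2 * num_layers * kv_heads * head_dim * context_tokens * kv_bits / 8
    return (weight_bytes + kv_bytes) / 1024**3

# Example: 358B parameters at 4-bit weights with a 32,768-token context,
# using assumed values for layers, KV heads, and head dimension.
print(f"{estimate_vram_gib(358, 4, 32_768, num_layers=92, kv_heads=8, head_dim=128):.0f} GiB")
```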
GLM-4.7 is a large bilingual Mixture of Experts (MoE) model from Z.ai, designed for agentic coding and complex reasoning tasks. It is an iteration of the GLM-4 series, building on its predecessors to improve multi-language programming and terminal-based workflows. The model uses a three-tier thinking architecture: Interleaved Thinking reasons before each response and tool invocation to improve instruction adherence and generation quality; Preserved Thinking carries reasoning across multi-turn conversations, minimizing information loss on long-horizon tasks; and Turn-level Thinking gives per-interaction control over reasoning depth to balance latency against compute cost.
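For illustration, the sketch below shows how turn-level control of reasoning depth might look from client code. It assumes an OpenAI-compatible chat endpoint and a per-request thinking field; the base URL, model name, and field name are assumptions rather than documented GLM-4.7 API details.

```python
# Hypothetical turn-level thinking control; the endpoint, model name, and the
# `thinking` field are assumptions, not confirmed GLM-4.7 API details.
from openai import OpenAI

client = OpenAI(base_url="https://api.z.ai/api/paas/v4", api_key="YOUR_API_KEY")

# Latency-sensitive turn: skip extended reasoning.
quick = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Rename the variable x to total_count."}],
    extra_body={"thinking": {"type": "disabled"}},
)

# Harder turn in the same session: allow full reasoning.
deep = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Find and fix the circular import in this package."}],
    extra_body={"thinking": {"type": "enabled"}},
)
print(quick.choices[0].message.content, deep.choices[0].message.content, sep="\n")
```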
This architecture is built for agent-based applications, enabling more stable and controllable execution of complex operations. The model handles diverse programming challenges, including agentic workflows that span multiple files and turns. It also aims to produce more natural conversational output and better front-end and user interface code, with cleaner, more modern web pages and improved presentation layouts.
GLM-4.7 also improves tool integration, allowing robust interaction with external toolsets. Its capabilities extend to intricate reasoning, including mathematical problem-solving and general analytical tasks, and its design emphasizes adaptability and efficiency across development and automation scenarios.
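As a sketch of what such tool integration typically looks like from client code, the example below assumes an OpenAI-compatible tools interface; the endpoint, model name, and the run_shell tool are illustrative assumptions, not details taken from this page.

```python
# Minimal tool-calling round trip; the endpoint, model name, and the
# run_shell tool schema are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.z.ai/api/paas/v4", api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "run_shell",  # hypothetical terminal tool
        "description": "Run a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "List the Python files in the current directory."}],
    tools=tools,
)

# If the model chooses to call the tool, its arguments arrive as a JSON string.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```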
GLM-4 is a series of bilingual (English and Chinese) language models developed by Zhipu AI. The models feature extended context windows, superior coding performance, advanced reasoning capabilities, and strong agent functionalities. GLM-4.6 offers improvements in tool use and search-based agents.
No evaluation benchmarks are available for GLM-4.7.