Total Parameters
355B
Context Length
128K
Modality
Multimodal
Architecture
Mixture of Experts (MoE)
License
MIT License
Release Date
28 Jul 2025
Knowledge Cutoff
Jan 2025
Active Parameters
32.0B
Number of Experts
-
Active Experts
-
Attention Structure
Grouped-Query Attention (GQA)
Hidden Dimension Size
5120
Number of Layers
96
Attention Heads
96
Key-Value Heads
-
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
Absolute Position Embedding
GLM-4.5 is a flagship multimodal large language model developed by Z.ai that combines reasoning, software engineering, and agentic capabilities in a single model. It uses a Mixture-of-Experts (MoE) design with 355 billion total parameters, of which only 32 billion are activated during a forward pass. The model offers two execution modes: a higher-latency 'Thinking Mode' for multi-step reasoning and planning, and a 'Non-Thinking Mode' that responds immediately in standard interactive tasks.
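To make the parameter-efficiency claim concrete, the short sketch below computes the share of parameters used per forward pass from the two figures quoted above. The function name `active_fraction` is illustrative only and does not come from any GLM tooling.

```python
# Figures taken from the model card text above.
TOTAL_PARAMS = 355e9   # total parameters
ACTIVE_PARAMS = 32e9   # parameters activated during a forward pass


def active_fraction(total: float, active: float) -> float:
    """Return the share of parameters used for a single forward pass."""
    return active / total


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active share per forward pass: {fraction:.1%}")  # 9.0%
```

Roughly one parameter in eleven participates in any given forward pass, although all 355B parameters still have to be stored.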
GLM-4.5 favors architectural depth over width, a choice intended to improve logical deduction and mathematical reasoning. The model uses Grouped-Query Attention (GQA) with 96 attention heads and a hidden dimension of 5120. Its MoE layers use sigmoid-gated routing to select experts, and QK-Norm is applied to keep attention logits in a stable range. The model was trained on a 23-trillion-token corpus, including 7 trillion tokens of code and reasoning data, followed by reinforcement learning on the custom-built 'slime' infrastructure to refine agentic decision-making.
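As an illustration of sigmoid-gated routing in general, the sketch below scores each expert with an independent sigmoid, keeps the top-k, and renormalizes the selected weights. It is not GLM-4.5's actual router. The expert count (8), `top_k=2`, the logits, and the renormalization step are all assumptions made for the example, since the spec list above does not give the number of experts or active experts.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def route_token(router_logits: list[float], top_k: int) -> dict[int, float]:
    """Select top_k experts by sigmoid score and renormalize their weights.

    Unlike softmax gating, each expert's score is computed independently,
    so raising one logit does not lower the raw scores of the others.
    """
    scores = [sigmoid(z) for z in router_logits]
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in chosen)
    # Renormalize so the selected experts' weights sum to 1 (an assumption).
    return {i: scores[i] / total for i in chosen}


# Hypothetical router logits for one token over 8 experts.
logits = [0.2, -1.3, 2.1, 0.0, 1.7, -0.4, 0.9, -2.0]
weights = route_token(logits, top_k=2)
print(weights)  # experts 2 and 4 are selected
```

The token's output would then be the weighted sum of the selected experts' outputs, using these weights.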
Designed for production-grade agent applications, GLM-4.5 supports native function calling and multi-step web browsing. It has a 128,000-token context window and a maximum output limit of 96,000 tokens, which makes it suitable for long-document analysis and full-stack software development. The weights are released openly under the MIT License, permitting use in both research and commercial settings.
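The context and output limits interact when planning long requests. The sketch below assumes that the prompt and the generated output share the same 128,000-token window, which is a common convention but is not stated in the text above. The helper `max_prompt_tokens` is hypothetical.

```python
# Limits quoted in the model card text above.
CONTEXT_WINDOW = 128_000  # tokens
MAX_OUTPUT = 96_000       # tokens


def max_prompt_tokens(requested_output: int) -> int:
    """Return the prompt budget left after reserving room for the output.

    Assumes prompt and output tokens share one context window.
    """
    if not 0 < requested_output <= MAX_OUTPUT:
        raise ValueError(f"requested_output must be in 1..{MAX_OUTPUT}")
    return CONTEXT_WINDOW - requested_output


print(max_prompt_tokens(96_000))  # 32000
print(max_prompt_tokens(8_000))   # 120000
```

Under this assumption, requesting the full 96,000-token output leaves 32,000 tokens for the prompt, so long-document analysis and very long generation compete for the same budget.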
General Language Models from Z.ai
Rank
#32
| Category | Benchmark | Score | Rank |
|---|---|---|---|
| Professional Knowledge | MMLU Pro | 0.85 | 🥇 1 |
| Web Development | WebDev Arena | 1410 | 16 |
| Graduate-Level QA | GPQA | 0.79 | 20 |
Coding Rank
#25