Total Parameters: 744B
Active Parameters: 40.0B
Context Length: 204.8K tokens
Modality: Multimodal
Architecture: Mixture of Experts (MoE)
License: MIT
Release Date: 12 Feb 2026
Knowledge Cutoff: -
Number of Experts: -
Active Experts: -
Attention Structure: Multi-Head Attention
Hidden Dimension Size: -
Number of Layers: -
Attention Heads: -
Key-Value Heads: -
Activation Function: -
Normalization: -
Position Embedding: Absolute Position Embedding
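The table lists GLM-5 as a Mixture of Experts but leaves the expert count and active-expert count unspecified. The sketch below is purely illustrative, with made-up expert numbers, and only shows why top-k routing keeps the active parameter count (40B) far below the total (744B).

```python
# Toy illustration of MoE top-k routing. NOT GLM-5's actual configuration:
# the expert count and top-k below are made up, since neither is published
# in the table above.
import random

NUM_EXPERTS = 64  # hypothetical
TOP_K = 4         # hypothetical number of experts activated per token

def route(gate_scores):
    """Return the indices of the top-k experts for one token."""
    ranked = sorted(range(NUM_EXPERTS), key=lambda e: gate_scores[e], reverse=True)
    return ranked[:TOP_K]

# One fake token: only TOP_K of NUM_EXPERTS experts run for it, which is why
# the active parameter count (40B) is a small fraction of the total (744B)
# even though every expert must stay loaded in memory.
scores = [random.random() for _ in range(NUM_EXPERTS)]
print("experts selected for this token:", route(scores))
print(f"active / total parameters: {40 / 744:.1%}")
```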
VRAM requirements for different quantization methods and context sizes
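The interactive calculator does not translate to text, but the weight portion of a VRAM estimate follows directly from the parameter count and the bytes per weight of each quantization method. The sketch below is a rough estimate under assumed bit widths and an assumed 20% overhead factor, not official sizing guidance; a full estimate would also need the unpublished layer and head counts to size the KV cache at a given context length.

```python
# Rough VRAM estimate for holding GLM-5's weights under common quantization
# schemes. Back-of-the-envelope only: the KV cache is excluded because layer
# and head counts are not published above, and the 20% overhead factor for
# activations/buffers is an assumption.

TOTAL_PARAMS = 744e9  # all 744B MoE parameters must reside in memory

BYTES_PER_PARAM = {
    "FP16/BF16": 2.0,
    "INT8": 1.0,
    "INT4": 0.5,
}

OVERHEAD = 1.20  # assumed 20% extra for activations, buffers, fragmentation

for method, nbytes in BYTES_PER_PARAM.items():
    vram_gb = TOTAL_PARAMS * nbytes * OVERHEAD / 1e9
    print(f"{method:>9}: ~{vram_gb:,.0f} GB")
```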
GLM-5 is a state-of-the-art multimodal model from Z.ai (Zhipu AI), released in February 2026. Built on a 744-billion-parameter Mixture-of-Experts (MoE) architecture with 40 billion active parameters, it integrates DeepSeek Sparse Attention (DSA) to keep inference efficient while maintaining a long context window (200K+ tokens). Released under the MIT License, GLM-5 targets complex systems-engineering and long-horizon agentic tasks, and reports top-tier performance on SWE-bench and Vending Bench 2.
GLM-5 is the fifth generation of the General Language Model (GLM) family developed by Z.ai. It marks a significant step up in multimodal foundation capabilities, combining advanced reasoning with long-horizon agentic performance across diverse systems-engineering tasks.
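As an MIT-licensed model, GLM-5 can be self-hosted or served behind any OpenAI-compatible gateway. The snippet below is a minimal sketch of such a call; the base URL and the "glm-5" model identifier are assumptions, not confirmed values.

```python
# Minimal sketch of querying GLM-5 through an OpenAI-compatible endpoint.
# ASSUMPTIONS: the base_url and the "glm-5" model identifier are placeholders,
# not values confirmed by Z.ai; substitute whatever your provider or local
# inference server documents.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="glm-5",  # assumed identifier
    messages=[
        {"role": "system", "content": "You are a careful software engineer."},
        {"role": "user", "content": "Plan a refactor of a legacy billing service into modules."},
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
```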
No evaluation benchmark results are available for GLM-5 yet.
Overall Rank: -
Coding Rank: -