
GLM-5

Total Parameters

744B

Context Length

204.8K

Modality

Multimodal

Architecture

Mixture of Experts (MoE)

License

MIT

Release Date

12 Feb 2026

Knowledge Cutoff

-

Technical Specifications

Active Parameters

40.0B

Number of Experts

-

Active Experts

-

Attention Structure

Multi-Head Attention

Hidden Dimension Size

-

Number of Layers

-

Attention Heads

-

Key-Value Heads

-

Activation Function

-

Normalization

-

Position Embedding

Absolute Position Embedding

System Requirements

VRAM requirements for different quantization methods and context sizes
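
The interactive calculator's output is not reproduced here, but the arithmetic behind such estimates can be sketched. The snippet below is a rough, illustrative estimate only: the 744B total-parameter figure comes from the specification above, while the layer count, key-value head count, and head dimension are placeholder assumptions (the specification lists them as "-"), and the calculation ignores activation memory and any KV-cache savings from DeepSeek Sparse Attention.

```python
# Rough VRAM estimate for GLM-5. Only the 744B total-parameter count is taken
# from the spec sheet above; the KV-cache hyperparameters below are placeholder
# assumptions, since the sheet does not publish them.

TOTAL_PARAMS = 744e9  # all MoE experts must be resident in memory, not just the 40B active

BYTES_PER_PARAM = {
    "FP16/BF16": 2.0,
    "INT8": 1.0,
    "INT4": 0.5,
}

# Assumed (unpublished) architecture values, for illustration only.
ASSUMED_LAYERS = 90
ASSUMED_KV_HEADS = 8
ASSUMED_HEAD_DIM = 128


def weight_vram_gb(bytes_per_param: float) -> float:
    """VRAM needed just to hold the quantized weights."""
    return TOTAL_PARAMS * bytes_per_param / 1e9


def kv_cache_gb(context_tokens: int, bytes_per_value: float = 2.0) -> float:
    """Rough KV-cache size: 2 (K and V) x layers x kv_heads x head_dim x tokens."""
    per_token = 2 * ASSUMED_LAYERS * ASSUMED_KV_HEADS * ASSUMED_HEAD_DIM * bytes_per_value
    return per_token * context_tokens / 1e9


if __name__ == "__main__":
    for quant, bpp in BYTES_PER_PARAM.items():
        print(f"{quant:>9}: weights ~ {weight_vram_gb(bpp):,.0f} GB")
    for ctx in (1_024, 102_400, 204_800):
        print(f"KV cache @ {ctx:>7,} tokens ~ {kv_cache_gb(ctx):,.1f} GB (assumed dims)")
```

At FP16 the weights alone come to roughly 1.5 TB, which is why multi-GPU deployment and aggressive quantization dominate the practical serving options.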

GLM-5

GLM-5 is a state-of-the-art multimodal model from Z.ai (Zhipu AI), released in February 2026. Built on a 744-billion-parameter Mixture-of-Experts (MoE) architecture with 40 billion active parameters, it integrates DeepSeek Sparse Attention (DSA) to improve inference efficiency while maintaining long-context capacity (200K+ tokens). Released under the MIT License, GLM-5 targets complex systems engineering and long-horizon agentic tasks, and it achieves top-tier performance on SWE-bench and Vending Bench 2.
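
For orientation, here is a minimal chat-completion sketch. It assumes GLM-5 is served behind an OpenAI-compatible endpoint under the model identifier glm-5; the base URL, environment variable, and model name are illustrative placeholders, not details confirmed on this page.

```python
# Minimal chat-completion sketch. The base URL, environment variable, and
# model id are placeholders; use whatever your GLM-5 provider documents.
import os

from openai import OpenAI  # pip install openai

client = OpenAI(
    api_key=os.environ["GLM_API_KEY"],      # assumed env var name
    base_url="https://api.example.com/v1",  # placeholder OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="glm-5",  # assumed model identifier
    messages=[
        {"role": "system", "content": "You are a precise engineering assistant."},
        {"role": "user", "content": "Outline a migration plan for a legacy build system."},
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
```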

About GLM-5

GLM-5 is the fifth generation of the General Language Model (GLM) family developed by Z.ai. It represents a significant step forward in multimodal foundation capability, combining advanced reasoning with long-horizon agentic behavior across diverse systems engineering tasks.

