Parameters
32B
Context Length
128K
Modality
Text
Architecture
Dense
License
Custom Commercial License with Restrictions
Release Date
15 Jan 2024
Knowledge Cutoff
Dec 2023
Attention Structure
Grouped-Query Attention
Hidden Dimension Size
6144
Number of Layers
61
Attention Heads
48
Key-Value Heads
2
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
Rotary Position Embedding (RoPE)
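The dimensions in the table above roughly account for the 32B parameter count. A minimal back-of-the-envelope sketch (the FFN inner dimension and vocabulary size are illustrative assumptions, not values from the table):

```python
# Rough parameter-count estimate from the spec table above.
d_model = 6144
n_layers = 61
n_heads = 48
n_kv_heads = 2
head_dim = d_model // n_heads          # 128
vocab_size = 151_552                   # assumption, not listed in the table
d_ff = 23_040                          # assumption, not listed in the table

# Attention: Q and output projections are full-width; K/V projections
# are shrunk by grouped-query attention (2 KV heads instead of 48).
attn = (d_model * d_model) * 2 + (n_kv_heads * head_dim * d_model) * 2
# SwiGLU FFN uses three weight matrices (gate, up, down).
ffn = 3 * d_model * d_ff
per_layer = attn + ffn
embeddings = vocab_size * d_model * 2  # input + output embedding matrices

total = n_layers * per_layer + embeddings
print(f"~{total / 1e9:.1f}B parameters")
```

With these assumed values the estimate lands close to the advertised 32 billion parameters; the exact figure depends on the true FFN width, vocabulary, and whether embeddings are tied.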
The GLM-4 32B model is a foundational large language model developed by Z.ai, representing a significant scaling of the General Language Model (GLM) architecture to 32 billion parameters. This model is engineered to balance high-order reasoning capabilities with computational efficiency, serving as a versatile core for advanced agentic applications, complex code generation, and intricate bilingual text processing. It occupies a strategic position within the GLM-4 family, providing the structural complexity necessary for sophisticated linguistic understanding while maintaining a footprint suitable for diverse deployment environments.
Technically, the model utilizes a dense transformer architecture optimized through extensive pre-training on a massive corpus of 15 trillion tokens. This training set includes a substantial proportion of synthetic reasoning data, specifically curated to enhance the model's logical inference and problem-solving skills. The architectural design integrates modern advancements such as Rotary Positional Embeddings (RoPE) and Grouped-Query Attention (GQA), which together facilitate stable performance and efficient inference over a context window of up to 128,000 tokens. To ensure high-quality output, the model undergoes a multi-stage post-training pipeline involving human preference alignment, rejection sampling, and reinforcement learning.
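The practical benefit of grouped-query attention at this scale is the size of the key-value cache that must be held in memory for long contexts: the 48 query heads share only 2 KV heads. A minimal illustration at the full 128K context (fp16 cache assumed):

```python
# KV-cache size at a 128K context, comparing the GQA layout above
# (2 KV heads) with a hypothetical full multi-head layout (48 KV heads).
n_layers = 61
head_dim = 6144 // 48          # 128
context = 128_000
bytes_per_value = 2            # fp16 assumed

def kv_cache_gb(n_kv_heads: int) -> float:
    # 2 tensors (keys and values) per layer, per KV head, per position
    return 2 * n_layers * n_kv_heads * head_dim * context * bytes_per_value / 1e9

print(f"GQA  (2 KV heads):  {kv_cache_gb(2):.1f} GB")
print(f"MHA (48 KV heads): {kv_cache_gb(48):.1f} GB")
```

Under these assumptions the GQA cache is roughly 8 GB versus nearly 200 GB for a full multi-head layout, a 24x reduction that is what makes 128K-token inference tractable on commodity accelerators.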
GLM-4 32B is specifically optimized for scenarios requiring structured outputs and autonomous tool interaction. Its performance characteristics make it particularly effective for engineering-grade code generation, precise search-based question answering, and the creation of detailed technical artifacts. The model's refined instruction-following and robust function-calling capabilities enable it to act as the primary engine for intelligent agents that need to plan and execute multi-step tasks across diverse software environments and knowledge domains.
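As an illustration of the function-calling workflow described above, a serving stack typically passes a tool schema to the model and parses the structured call the model emits. A minimal, framework-agnostic sketch (the schema and the model's reply below are hypothetical examples, not actual GLM-4 output):

```python
import json

# Hypothetical tool schema in the common JSON-schema style.
search_tool = {
    "name": "web_search",
    "description": "Search the web and return top results.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

# A model tuned for function calling is expected to emit a structured
# call like this (illustrative string, not real model output).
model_reply = '{"name": "web_search", "arguments": {"query": "GLM-4 context length"}}'

call = json.loads(model_reply)
assert call["name"] == search_tool["name"]
print("tool:", call["name"], "| args:", call["arguments"])
```

The agent loop then executes the named tool with the parsed arguments and feeds the result back to the model, which is what enables the multi-step planning behavior described above.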
General Language Models from Z.ai
No evaluation benchmarks are available for GLM-4 32B.