趋近智
Parameters
32B
Context Length
128K
Modality
Text
Architecture
Dense
License
Custom Commercial License with Restrictions
Release Date
15 Jan 2024
Training Data Cutoff
Dec 2023
Attention Structure
Grouped-Query Attention
Hidden Dimension Size
6144
Number of Layers
61
Attention Heads
48
Key-Value Heads
2
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
Rotary Position Embedding (RoPE)
GLM-4 32B is a foundation large language model developed by Z.ai that scales the General Language Model (GLM) architecture to 32 billion parameters. It is designed to balance reasoning capability against computational cost, and it targets agentic applications, code generation, and bilingual text processing. Within the GLM-4 family, it offers enough capacity for demanding language tasks while remaining small enough to deploy in a range of environments.
The model is a dense transformer pre-trained on a corpus of 15 trillion tokens. The training set includes a substantial share of synthetic reasoning data, curated to strengthen logical inference and problem solving. The architecture uses Rotary Positional Embeddings (RoPE) and Grouped-Query Attention (GQA), which support stable behavior and efficient inference over a context window of up to 128,000 tokens. Post-training is a multi-stage pipeline that combines human preference alignment, rejection sampling, and reinforcement learning.
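To make the efficiency benefit of GQA concrete, the following is a rough back-of-the-envelope sketch of key-value cache size at full context, using the specification values listed for this model (hidden size 6144, 61 layers, 48 attention heads, 2 key-value heads). It assumes the per-head dimension equals hidden size divided by attention heads and that cache entries are stored as 16-bit values; neither assumption is stated in the source, and real deployments may differ.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Estimate KV cache size for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes a 16-bit cache (an assumption, not from the source).
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Values taken from the specification list.
HIDDEN_SIZE = 6144
NUM_LAYERS = 61
NUM_HEADS = 48
NUM_KV_HEADS = 2
SEQ_LEN = 128_000  # full context window, in tokens

# Assumption: head dimension = hidden size / number of attention heads.
head_dim = HIDDEN_SIZE // NUM_HEADS

# Cache size with grouped-query attention (2 key-value heads).
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, head_dim, SEQ_LEN)

# Hypothetical comparison: every attention head keeps its own keys and values.
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_HEADS, head_dim, SEQ_LEN)

print(f"head_dim: {head_dim}")
print(f"GQA cache: {gqa_bytes / 2**30:.2f} GiB")
print(f"MHA cache: {mha_bytes / 2**30:.2f} GiB")
print(f"reduction factor: {mha_bytes // gqa_bytes}x")
```

Under these assumptions, sharing 2 key-value heads across 48 query heads shrinks the cache by a factor of 24 relative to giving each head its own keys and values, which is the main reason GQA helps at long context lengths.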
GLM-4 32B is tuned for scenarios that require structured outputs and autonomous tool use. It is positioned for code generation, search-based question answering, and the production of detailed technical artifacts. Its instruction following and function calling allow it to serve as the engine for agents that plan and execute multi-step tasks across software environments and knowledge domains.
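As an illustration of what function calling involves on the application side, here is a minimal sketch of dispatching a structured tool call to a local function. The JSON shape (`name` and `arguments` keys), the `get_weather` tool, and the `dispatch` helper are all hypothetical and chosen for illustration; they are not the model's documented output format or any official API.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool: a real agent would query an external service here."""
    return f"Sunny in {city}"


# Registry mapping tool names to Python callables (illustrative only).
TOOLS = {"get_weather": get_weather}


def dispatch(tool_call_json: str) -> str:
    """Parse a tool call emitted as JSON and run the matching function.

    Assumes the call looks like {"name": ..., "arguments": {...}};
    this shape is an assumption, not taken from the source.
    """
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Example: a tool call as the model might emit it, under the assumed shape.
result = dispatch('{"name": "get_weather", "arguments": {"city": "Beijing"}}')
print(result)
```

In an agent loop, the returned string would typically be appended to the conversation so the model can decide on its next step.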
General Language Models from Z.ai
No evaluation benchmarks are available for GLM-4.