GLM-4.6: Specifications and GPU VRAM Requirements

GLM-4.6

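Open source: Open weights
Total parameters: 357B
Context length: 200K tokens
Modality: Text
Architecture: Mixture of Experts (MoE)
License: MIT
Release date: 30 September 2025
Training data cutoff: not listed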

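Technical specifications

Active parameters: 32.0B
Number of experts: not listed
Active experts: not listed
Attention structure: Grouped-Query Attention (GQA)
Hidden dimension size: 5120
Number of layers: not listed
Attention heads: 96
Key-value heads: not listed
Activation function: not listed
Normalization: not listed
Position embedding: Partial Rotary Position Embedding (RoPE)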

GLM-4.6

GLM-4.6 is a large language model developed by Z.ai. It targets complex tasks such as coding, long-context processing, and agentic operation, and it supports both English and Chinese. It is intended as a foundation for building systems that need multi-step reasoning and autonomous interaction.

Architecturally, GLM-4.6 is a Mixture-of-Experts (MoE) model with 357 billion total parameters, of which about 32 billion are active in any given forward pass. Its context window is expanded to 200,000 tokens, which lets it process long inputs while staying coherent across them. The attention mechanism uses Grouped-Query Attention (GQA) with 96 attention heads and a partial Rotary Position Embedding (RoPE) for positional encoding. QK-Norm normalizes the query and key vectors, which stabilizes the attention logits. These choices aim to balance computational efficiency against performance on complex reasoning tasks.
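
To make the attention description above more concrete, here is a minimal pure-Python sketch of how QK-Norm and a partial RoPE can combine when computing a single attention logit. It is an illustration under stated assumptions, not GLM-4.6's actual implementation: the RMS-style normalization without a learned gain, the adjacent-pair rotation layout, the frequency base of 10000, and the function names `rms_norm`, `partial_rope`, and `attention_logit` are all choices made for this example.

```python
import math

def rms_norm(vec, eps=1e-6):
    """Scale vec so its root-mean-square is about 1 (no learned gain in this sketch)."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]

def partial_rope(vec, position, rotary_dim, base=10000.0):
    """Rotate the first rotary_dim entries in adjacent pairs; leave the rest unchanged.

    rotary_dim must be even and no larger than len(vec).
    """
    out = list(vec)
    for i in range(0, rotary_dim, 2):
        theta = position * base ** (-i / rotary_dim)  # per-pair rotation angle
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out[i] = x * c - y * s
        out[i + 1] = x * s + y * c
    return out

def attention_logit(q, k, q_pos, k_pos, rotary_dim):
    """Scaled dot product of a query and a key after QK-Norm and partial RoPE."""
    q = partial_rope(rms_norm(q), q_pos, rotary_dim)
    k = partial_rope(rms_norm(k), k_pos, rotary_dim)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))
```

Two properties are worth noticing in this sketch. Because both vectors are normalized first, the logit magnitude cannot exceed the square root of the head dimension, which is the stabilizing effect attributed to QK-Norm. Because only the first `rotary_dim` entries are rotated, the logit depends on the relative offset between the two positions through those entries alone, while the remaining entries carry position-independent content.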

GLM-4.6 is tuned for real-world development workflows. It is reported to improve on coding, producing more visually polished front-end output and better results in practical applications. Reasoning is also improved and can be combined with tool use during inference, which supports more capable agents for search-based tasks and role-playing scenarios. GLM-4.6 is also more token-efficient: it completes tasks with approximately 15% fewer tokens than its predecessor, GLM-4.5, which lowers inference cost.
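
As a rough illustration of what the approximately 15% token reduction means for cost, the sketch below scales a baseline token count and prices it. The baseline of one million tokens and the price of 2.00 USD per million tokens are hypothetical values chosen only for the arithmetic, and the function names are made up for this example.

```python
def tokens_after_reduction(baseline_tokens, reduction=0.15):
    """Tokens needed if a task uses `reduction` fewer tokens than the baseline."""
    return baseline_tokens * (1.0 - reduction)

def cost_usd(tokens, price_per_million_usd):
    """Cost of generating `tokens` tokens at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd

if __name__ == "__main__":
    baseline = 1_000_000   # hypothetical GLM-4.5 token usage for some workload
    price = 2.00           # hypothetical price in USD per million tokens
    reduced = tokens_after_reduction(baseline)
    print(f"baseline cost: {cost_usd(baseline, price):.2f} USD")
    print(f"reduced cost:  {cost_usd(reduced, price):.2f} USD")
```

At a flat per-token price the saving is proportional to the token reduction, so the same workload would cost about 15% less under these assumptions.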

About GLM-4

GLM-4 is a series of bilingual (English and Chinese) language models developed by Zhipu AI (Z.ai). The series emphasizes long context windows, coding, reasoning, and agent use. Within the series, GLM-4.6 improves on tool use and search-based agents.

Other GLM-4 models

GLM-4.7

Evaluation benchmarks

Rank: #36

Category	Benchmark	Score	Rank
Graduate-Level QA	GPQA	0.81	13
Mathematics	LiveBench Mathematics	0.81	14
Data Analysis	LiveBench Data Analysis	0.72	16
Reasoning	LiveBench Reasoning	0.62	20
Agentic Coding	LiveBench Agentic	0.35	24
Coding	LiveBench Coding	0.71	25

Coding rank: #52

Model transparency

Total score: 66 / 100
Upstream: 18.5 / 30
Model: 25.5 / 40
Downstream: 22.0 / 30

GPU requirements
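
A back-of-the-envelope estimate of the VRAM needed can be sketched as follows. The weight estimate uses the 357B total parameter count from this page, on the assumption that all experts are kept resident in GPU memory even though only about 32B parameters are active per forward pass. The KV cache formula is the standard one for grouped-query attention, but the layer count, key-value head count, and head dimension are not listed here, so they are left as function parameters. The function names and the 80 GB GPU size are assumptions for illustration, and activations, framework overhead, and fragmentation are ignored.

```python
import math

TOTAL_PARAMS = 357e9  # total parameter count quoted on this page

def weight_memory_gb(num_params, bits_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

def kv_cache_gb(context_tokens, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """KV cache for one sequence: one key and one value vector
    per layer, per KV head, per token."""
    values = 2 * num_layers * num_kv_heads * head_dim * context_tokens
    return values * bytes_per_value / 1e9

def min_gpus(required_gb, gpu_memory_gb):
    """Smallest number of equally sized GPUs whose combined memory covers required_gb."""
    return math.ceil(required_gb / gpu_memory_gb)

if __name__ == "__main__":
    # Bit widths for common weight formats; 80 GB per GPU is a hypothetical card size.
    for label, bits in [("FP16/BF16", 16), ("FP8/INT8", 8), ("INT4", 4)]:
        gb = weight_memory_gb(TOTAL_PARAMS, bits)
        print(f"{label}: {gb:.1f} GB for weights, at least {min_gpus(gb, 80)} x 80 GB GPUs")
```

These figures are lower bounds. The KV cache grows linearly with context length, so a full 200K-token context adds a further amount that depends on the architecture values not listed on this page.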

Resources

Official documentation, release notes, paper, model weights, source code
