
GLM-4.6

Total Parameters: 357B
Context Length: 200K
Modality: Text
Architecture: Mixture of Experts (MoE)
License: MIT
Release Date: 30 Sept 2025
Training Data Cutoff: -

Technical Specifications

Active Parameters: 32.0B
Number of Experts: -
Active Experts: -
Attention Structure: Grouped-Query Attention (GQA)
Hidden Dimension Size: 5120
Number of Layers: -
Attention Heads: 96
Key-Value Heads: -
Activation Function: -
Normalization: -
Position Embedding: Partial Rotary Position Embedding (RoPE)

System Requirements

VRAM requirements for different quantization methods and context sizes

GLM-4.6

GLM-4.6 is a large language model developed by Z.ai, designed to facilitate advanced applications in artificial intelligence. This model is engineered to operate efficiently across a spectrum of complex tasks, including sophisticated coding, extended context processing, and agentic operations. Its bilingual capabilities, supporting both English and Chinese, extend its applicability across diverse linguistic contexts. The model’s purpose is to serve as a robust foundation for building intelligent systems capable of nuanced reasoning and autonomous interaction.

Architecturally, GLM-4.6 implements a Mixture-of-Experts (MoE) configuration, incorporating 357 billion total parameters, with 32 billion parameters actively utilized during a given forward pass. The model's design features a context window expanded to 200,000 tokens, enabling it to process and maintain coherence over substantial input sequences. Innovations within its attention mechanism include Grouped-Query Attention (GQA) with 96 attention heads, and the integration of a partial Rotary Position Embedding (RoPE) for positional encoding. Normalization is managed through QK-Norm, contributing to stabilized attention logits. These architectural choices aim to balance computational efficiency with enhanced performance in complex cognitive operations.
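To make these attention mechanisms concrete, here is a minimal, self-contained sketch of grouped-query attention with partial RoPE and QK-Norm. It is illustrative only, not Z.ai's implementation: the 96 query heads come from the spec table above, while the KV-head count, head dimension, and the fraction of each head rotated by RoPE are assumptions, since those values are not published here.

```python
import torch
import torch.nn.functional as F

N_HEADS = 96         # attention (query) heads, from the spec table
N_KV_HEADS = 8       # assumption: each KV head serves 12 query heads
HEAD_DIM = 128       # assumption: not listed in the spec table
ROPE_FRAC = 0.5      # assumption: "partial RoPE" rotates half of each head

def partial_rope(x: torch.Tensor, pos: torch.Tensor) -> torch.Tensor:
    """Rotate the first ROPE_FRAC of each head dimension; pass the rest through."""
    rot = int(x.shape[-1] * ROPE_FRAC)
    x_rot, x_pass = x[..., :rot], x[..., rot:]
    half = rot // 2
    freqs = 1.0 / (10000.0 ** (torch.arange(half) / half))
    ang = pos[:, None] * freqs[None, :]                    # (seq, half)
    cos, sin = ang.cos()[:, None, :], ang.sin()[:, None, :]
    x1, x2 = x_rot[..., :half], x_rot[..., half:]
    rotated = torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
    return torch.cat([rotated, x_pass], dim=-1)

def gqa(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """q: (seq, N_HEADS, HEAD_DIM); k, v: (seq, N_KV_HEADS, HEAD_DIM)."""
    pos = torch.arange(q.shape[0], dtype=torch.float32)
    # QK-Norm: unit-normalize queries and keys so attention logits stay bounded.
    q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
    q, k = partial_rope(q, pos), partial_rope(k, pos)
    # GQA: replicate each KV head across its group of query heads.
    group = N_HEADS // N_KV_HEADS
    k, v = k.repeat_interleave(group, dim=1), v.repeat_interleave(group, dim=1)
    scores = torch.einsum("qhd,khd->hqk", q, k) / HEAD_DIM ** 0.5
    # Causal masking is omitted for brevity.
    return torch.einsum("hqk,khd->qhd", scores.softmax(dim=-1), v)

out = gqa(torch.randn(16, N_HEADS, HEAD_DIM),
          torch.randn(16, N_KV_HEADS, HEAD_DIM),
          torch.randn(16, N_KV_HEADS, HEAD_DIM))
print(out.shape)   # torch.Size([16, 96, 128])
```

The normalization step bounds the magnitude of the attention logits before the softmax, which is the stabilization effect QK-Norm is credited with above; sharing KV heads across query groups is what reduces KV-cache memory relative to full multi-head attention.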

The operational characteristics of GLM-4.6 are optimized for real-world development workflows. It demonstrates superior coding performance, leading to more visually polished front-end generation and improved real-world application results. The model exhibits enhanced reasoning capabilities, which are further augmented by its integrated tool-use functionality during inference. This facilitates the creation of more capable agents proficient in search-based tasks and role-playing scenarios. Furthermore, GLM-4.6 achieves improved token efficiency, completing tasks with approximately 15% fewer tokens compared to its predecessor, GLM-4.5, thereby offering a more cost-effective inference profile.
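The tool-use behavior described above maps onto the familiar chat-completions pattern. The sketch below is a minimal illustration assuming an OpenAI-compatible endpoint; the base URL, model identifier, and the web_search tool schema are all placeholders, so consult Z.ai's API documentation for the actual values.

```python
from openai import OpenAI

# Hypothetical endpoint and credentials; substitute the values from
# Z.ai's API documentation. Any OpenAI-compatible server works the same way.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_KEY")

# Declare a search tool the model may invoke while reasoning
# (the integrated tool-use during inference described above).
tools = [{
    "type": "function",
    "function": {
        "name": "web_search",          # hypothetical tool name
        "description": "Search the web and return the top results.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

resp = client.chat.completions.create(
    model="glm-4.6",                   # assumed model identifier
    messages=[{"role": "user", "content": "What changed in GLM-4.6 vs 4.5?"}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:                     # the model chose to call the search tool
    print("tool call:", msg.tool_calls[0].function.arguments)
else:
    print(msg.content)
```

In a full agent loop, the caller would execute the requested tool, append the result as a tool message, and re-invoke the model until it produces a final answer.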

About GLM-4

GLM-4 is a series of bilingual (English and Chinese) language models developed by Zhipu AI. The models feature extended context windows, superior coding performance, advanced reasoning capabilities, and strong agent functionalities. GLM-4.6 offers improvements in tool use and search-based agents.


Other GLM-4 Models
  • No related models

Evaluation Benchmarks

Rankings apply to local LLMs.

No evaluation benchmarks are available for GLM-4.6.

Rankings

Overall Rank: -
Coding Rank: -

GPU Requirements

The full calculator estimates required VRAM and recommends GPUs based on the selected weight quantization method and context size (1K to 195K tokens).
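As a stand-in for the interactive calculator, the sketch below makes the underlying arithmetic explicit: quantized weight memory plus KV-cache memory. With an MoE model, all 357B parameters must be resident even though only 32B are active per token. The layer count, KV-head count, and head dimension are assumptions (the spec table leaves them blank), and real deployments add activation and runtime overhead on top.

```python
# Rough VRAM estimate: quantized weights + KV cache. A simplification of
# what the interactive calculator computes; actual usage varies by runtime.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "q4": 0.5}  # common quant widths

def estimate_vram_gb(
    total_params_b: float = 357,   # total parameters in billions (spec table);
                                   # MoE keeps all experts resident in memory
    quant: str = "q4",
    context: int = 200_000,        # maximum context length (spec table)
    n_layers: int = 92,            # assumption: not listed in the spec table
    n_kv_heads: int = 8,           # assumption: not listed in the spec table
    head_dim: int = 128,           # assumption: not listed in the spec table
    kv_bytes: float = 2.0,         # fp16 KV cache
) -> float:
    weights = total_params_b * 1e9 * BYTES_PER_PARAM[quant]
    # KV cache: 2 (K and V) * layers * kv_heads * head_dim * context tokens
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * context * kv_bytes
    return (weights + kv_cache) / 1e9

for quant in ("fp16", "int8", "q4"):
    print(f"{quant}: ~{estimate_vram_gb(quant=quant):.0f} GB")
```

For example, at 4-bit quantization the weights alone come to roughly 357B × 0.5 bytes ≈ 178 GB before any KV cache, which is why multi-GPU setups are typically recommended for this model.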