趋近智
活跃参数
106B
上下文长度
128K
模态
Multimodal
架构
Mixture of Experts (MoE)
许可证
MIT License
发布日期
28 Jul 2025
知识截止
-
专家参数总数
12.0B
专家数量
-
活跃专家
-
注意力结构
Multi-Head Attention
隐藏维度大小
-
层数
-
注意力头
96
键值头
-
激活函数
-
归一化
-
位置嵌入
Absolute Position Embedding
不同量化方法和上下文大小的显存要求
The GLM-4.5-Air model, developed by Z.ai, is a member of the GLM-4.5 series, designed as a lightweight and efficient large language model. This variant is specifically optimized for on-device and smaller-scale cloud inference, aiming to deliver robust capabilities while minimizing hardware and computational requirements. It integrates core functionalities such as reasoning, coding, and agentic behaviors, making it suitable for a range of advanced AI applications.
Architecturally, GLM-4.5-Air leverages a Mixture-of-Experts (MoE) design. This allows the model to selectively activate a subset of its parameters during inference, enhancing computational efficiency compared to dense architectures. While the full GLM-4.5 model employs 355 billion total parameters with 32 billion active, GLM-4.5-Air features 106 billion total parameters with 12 billion active parameters. The model also incorporates a Multi-Token Prediction (MTP) layer to facilitate speculative decoding, which significantly boosts inference speed, potentially achieving generation rates of over 100 tokens per second.
GLM-4.5-Air supports a hybrid reasoning approach, offering both a 'thinking mode' for intricate, multi-step problem-solving and a 'non-thinking mode' for immediate, rapid responses. This dual-mode operation allows for dynamic adaptation to query complexity, optimizing resource utilization. The model is also engineered for advanced agentic applications, including native function calling, tool use, web browsing, and comprehensive software development tasks, such as full-stack web application creation.
General Language Models from Z.ai
排名适用于本地LLM。
排名
#7
基准 | 分数 | 排名 |
---|---|---|
Web Development WebDev Arena | 1353.76 | 🥉 3 |
Reasoning LiveBench Reasoning | 0.78 | ⭐ 4 |
Agentic Coding LiveBench Agentic | 0.15 | 5 |
Mathematics LiveBench Mathematics | 0.79 | 5 |
Data Analysis LiveBench Data Analysis | 0.66 | 7 |
Coding LiveBench Coding | 0.58 | 13 |