ChatGLM3-6B

Parameters: 6B
Context length: 8,192 tokens (8K)
Modality: Text
Architecture: Dense
License: Apache 2.0
Release date: 27 Oct 2023
Training data cutoff: Jul 2023

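Technical specifications

Attention structure: Grouped-Query Attention (32 query heads sharing 2 key-value heads)
Hidden dimension size: 4096
Layers: 28
Attention heads: 32
Key-value heads: 2
Activation function: SwiGLU
Normalization: RMS Normalization (RMSNorm)
Position embedding: Rotary Position Embedding (RoPE)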
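The gap between 32 attention heads and 2 key-value heads matters mostly for inference memory. The sketch below estimates the key-value cache size from the figures listed above. Two values are assumptions rather than figures from this page: the per-head dimension is taken as hidden size divided by attention heads (4096 / 32 = 128), and cache entries are assumed to be stored in 16-bit precision.

```python
def kv_cache_bytes(seq_len, n_layers=28, n_kv_heads=2,
                   head_dim=4096 // 32, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence.

    The leading factor 2 counts one key tensor and one value tensor per layer.
    head_dim = hidden size / attention heads is an assumption.
    bytes_per_value=2 assumes fp16 or bf16 cache entries.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * seq_len


per_token = kv_cache_bytes(1)
full_context = kv_cache_bytes(8192)
# Hypothetical comparison: the same model with one KV head per attention head.
full_context_32_kv = kv_cache_bytes(8192, n_kv_heads=32)

print(f"per token: {per_token} bytes")
print(f"8,192 tokens, 2 KV heads: {full_context / 2**20:.0f} MiB")
print(f"8,192 tokens, 32 KV heads: {full_context_32_kv / 2**20:.0f} MiB")
```

Under these assumptions the cache costs 28,672 bytes per token, or 224 MiB at the full 8,192-token context, versus 3,584 MiB if every attention head had its own key-value head.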

ChatGLM3-6B

ChatGLM3-6B is a bilingual (Chinese-English) large language model developed jointly by Zhipu AI and the Knowledge Engineering Group (KEG) at Tsinghua University. It is the third generation of the ChatGLM series and is built on the General Language Model (GLM) architecture, which was designed to combine autoencoding and autoregressive pre-training objectives. The model was pre-trained on a corpus of approximately one trillion tokens and tuned for coherent multi-turn conversation and instruction following in domains such as mathematics, programming, and logical reasoning.

The model is a dense Transformer that uses RoPE (Rotary Position Embeddings) to encode token positions. Its 32 attention heads share 2 key-value heads, which reduces the size of the key-value cache during inference. Compared with earlier generations, ChatGLM3 adds native support for agent workflows, including function calling and code execution through an integrated code interpreter. A redesigned prompt format supports these features by structuring tool interactions and multi-turn dialogue, which makes the model suitable for deployments that require autonomous task execution.
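To illustrate what rotary position embeddings do, here is a minimal pure-Python sketch of the generic RoPE formulation. It is not ChatGLM3's actual implementation: the base of 10000, the pairing of adjacent dimensions, and rotating every dimension are assumptions made for the example. Each pair of dimensions is rotated by an angle proportional to the token position, so the dot product between a rotated query and a rotated key depends only on their relative offset.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by the angle position * base**(-i/d)."""
    d = len(vec)
    assert d % 2 == 0, "RoPE needs an even number of dimensions"
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Toy query and key vectors (illustrative values only).
q = [0.5, -1.0, 2.0, 0.25]
k = [1.0, 0.5, -0.5, 1.5]

# Same relative offset (3 positions), very different absolute positions.
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 105), rope(k, 102))
print(abs(score_near - score_far) < 1e-9)
```

The two attention scores agree up to floating-point error, which is the property that lets attention reason about relative distance without a separate learned position table.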

ChatGLM3-6B is designed for local and edge deployment: its 6B parameters keep the computational footprint low while it improves on the performance of its predecessors. It uses SwiGLU activations and RMSNorm, which help stabilize training, and an expanded vocabulary for efficient tokenization of both Chinese and English. Downstream uses range from standard question answering to agentic tasks, all within an 8,192-token context window.
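For readers unfamiliar with the two building blocks named here, the following pure-Python sketch shows the standard definitions of RMSNorm and a SwiGLU feed-forward layer on plain lists. The matrix names (`w_gate`, `w_up`, `w_down`), the epsilon value, and the toy sizes are illustrative assumptions, not values taken from the ChatGLM3 code.

```python
import math


def rms_norm(x, weight, eps=1e-5):
    """Scale x by the reciprocal of its root mean square, then by a learned weight."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down( silu(gate(x)) * up(x) )."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


x = [3.0, 4.0]
normed = rms_norm(x, weight=[1.0, 1.0])

# Toy 2 -> 3 -> 2 feed-forward weights (illustrative values only).
w_gate = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
w_up = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w_down = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
y = swiglu_ffn(normed, w_gate, w_up, w_down)
print(normed, y)
```

Unlike LayerNorm, RMSNorm does not subtract the mean, so it needs one fewer reduction per token. The gating branch in SwiGLU lets the layer suppress individual hidden units: a gate pre-activation of zero yields a zero contribution from that unit.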

About ChatGLM

The ChatGLM series of models from Z.ai (Zhipu AI) is based on the GLM architecture.


Evaluation benchmarks

Rank: #102
Coding rank: #93

Category: Web Development
Benchmark: WebDev Arena
Score: 1056
Rank: 63

Model transparency

Overall score: B (64 / 100)

GPU requirements

Required VRAM depends on the quantization method selected for the model weights and on the context size (1k, 4k, or 8k tokens).
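As a rough guide to how quantization changes the VRAM needed, the sketch below estimates the memory taken by the weights alone. The three precision levels are common choices used here as assumptions, not the options offered by the calculator, and the parameter count is taken as exactly 6 billion. The estimate excludes the key-value cache, activations, and framework overhead, so real usage will be higher.

```python
def weight_memory_gib(n_params, bits_per_weight):
    """GiB occupied by the model weights at a given precision (weights only)."""
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 6e9  # the "6B" figure, treated as exactly 6 billion (assumption)

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

This gives roughly 11.2 GiB at FP16, 5.6 GiB at INT8, and 2.8 GiB at INT4. Halving the bits per weight halves the weight memory, which is why 4-bit quantization can bring a model of this size within reach of consumer GPUs.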

ChatGLM3-6B: Specifications and GPU VRAM Requirements