| Attribute | Value |
|---|---|
| Parameters | 6B |
| Context Length | 32,768 tokens (32K) |
| Modality | Text |
| Architecture | Dense |
| License | Custom License (ChatGLM2-6B License) |
| Release Date | 25 Jun 2023 |
| Training Data Cutoff | - |
| Attention Structure | Multi-Query Attention |
| Hidden Dimension | 4096 |
| Layers | 28 |
| Attention Heads | 32 |
| Key-Value Heads | 2 |
| Activation Function | SwiGLU |
| Normalization | RMS Normalization |
| Position Embedding | Rotary Position Embedding (RoPE) |
ChatGLM2-6B is a bilingual large language model designed to facilitate conversational interactions in both Chinese and English. As the second iteration in the ChatGLM series developed by THUDM, it is built upon the General Language Model (GLM) framework and serves as a versatile tool for dialogue generation and cross-lingual text processing. The model is optimized for execution on consumer-grade hardware through efficient architectural choices, enabling a high degree of accessibility for developers and researchers working within hardware-constrained environments.
The architecture uses a dense transformer structure that incorporates several technical advances over its predecessor. A key change is the adoption of Multi-Query Attention (MQA), which streamlines inference by sharing key and value heads across multiple query heads, significantly reducing the memory footprint of the KV cache. The model also integrates Rotary Position Embeddings (RoPE) to encode token positions and RMSNorm for improved training stability. The use of FlashAttention during pre-training allows the architecture to support a 32K-token context window, enabling the processing of extended dialogue histories.
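The KV-head sharing described above can be sketched numerically. The snippet below is an illustrative NumPy implementation (not the actual ChatGLM2 code) of attention with 2 key/value heads shared across 32 query heads, using the head counts from the spec table; the head dimension of 128 follows from hidden size 4096 / 32 heads.

```python
import numpy as np

# Illustrative sketch of key/value-head sharing as in Multi-Query /
# grouped attention. Dimensions follow the model card: 32 query heads,
# 2 KV heads, head_dim 128 (= 4096 / 32). seq_len is kept small here.
n_heads, n_kv_heads, head_dim, seq_len = 32, 2, 128, 16

rng = np.random.default_rng(0)
q = rng.standard_normal((n_heads, seq_len, head_dim))
# Only 2 key/value heads are stored in the KV cache; each is shared
# by 16 query heads.
k = rng.standard_normal((n_kv_heads, seq_len, head_dim))
v = rng.standard_normal((n_kv_heads, seq_len, head_dim))

group = n_heads // n_kv_heads  # 16 query heads per KV head
out = np.empty_like(q)
for h in range(n_heads):
    kv = h // group  # index of the shared KV head for query head h
    scores = q[h] @ k[kv].T / np.sqrt(head_dim)
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out[h] = weights @ v[kv]

# KV-cache size relative to full multi-head attention: 2/32 = 1/16.
cache_ratio = n_kv_heads / n_heads
print(out.shape, cache_ratio)
```

Storing only `n_kv_heads` K/V tensors instead of `n_heads` is exactly where the 16x KV-cache reduction comes from; the query side is unchanged.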
Operating with 6 billion parameters, ChatGLM2-6B provides a balanced profile of performance and efficiency. It was pre-trained on a diverse dataset comprising 1.4 trillion tokens and refined through human preference alignment to enhance its conversational quality. The model is particularly suited for applications such as intelligent virtual assistants and localized chatbots, where low-latency inference and bilingual proficiency are primary requirements. Its open-weights nature and support for INT4 quantization further expand its utility for local deployment and integration into specialized NLP pipelines.
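For local deployment, the INT4 and MQA claims translate into concrete memory numbers. The back-of-envelope estimate below is a sketch under stated assumptions (roughly 6.2B parameters, fp16 KV cache, and no allowance for activations or runtime overhead), not a measured footprint.

```python
# Back-of-envelope memory estimate for local deployment. Illustrative
# only: real usage adds activations, buffers, and framework overhead.
params = 6.2e9  # approximate parameter count (~6B)

weights_fp16_gib = params * 2 / 2**30    # 2 bytes per parameter
weights_int4_gib = params * 0.5 / 2**30  # 4 bits per parameter

# KV cache at the full 32K context with Multi-Query Attention:
# 2 tensors (K and V) * layers * kv_heads * head_dim * seq_len * 2 bytes
layers, kv_heads, head_dim, seq_len = 28, 2, 128, 32768
kv_cache_gib = 2 * layers * kv_heads * head_dim * seq_len * 2 / 2**30
# With 32 KV heads (standard multi-head attention) the cache
# would be 16x larger.

print(f"fp16 weights ~ {weights_fp16_gib:.1f} GiB")
print(f"int4 weights ~ {weights_int4_gib:.1f} GiB")
print(f"KV cache (32K ctx, fp16) ~ {kv_cache_gib:.3f} GiB")
```

Under these assumptions, INT4 weights fit in roughly 3 GiB and the full-context KV cache stays under 1 GiB, which is what makes consumer-grade GPUs viable for this model.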
ChatGLM series models from Z.ai, based on the GLM architecture.
Rank
#103
| Benchmark | Score | Rank |
|---|---|---|
| WebDev Arena (Web Development) | 1024 | 64 |