Parameters: 6B
Context Length: 2,048 tokens
Modality: Text
Architecture: Dense
License: Apache 2.0
Release Date: 14 Mar 2023
Knowledge Cutoff: -
Attention Structure: Multi-Head Attention
Hidden Dimension Size: 4096
Number of Layers: 28
Attention Heads: 32
Key-Value Heads: 32
Activation Function: GELU
Normalization: Layer Normalization
Position Embedding: Absolute Position Embedding
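The quoted 6B figure is consistent with these architecture numbers. As a quick sanity check, the back-of-envelope estimate below reconstructs the parameter count; the vocabulary size (~130,528) and feed-forward inner dimension (4 × hidden) are assumptions not listed in the table above, and biases, layer norms, and any untied output head are ignored.

```python
# Back-of-envelope parameter count from the architecture table above.
# Assumed, not listed in the table: vocab_size ~130,528 and ffn_dim = 4 * hidden.
hidden = 4096
layers = 28
vocab_size = 130_528        # assumption
ffn_dim = 4 * hidden        # assumption (16,384)

embeddings = vocab_size * hidden                            # token embeddings
attn_per_layer = hidden * (3 * hidden) + hidden * hidden    # QKV + output projection
ffn_per_layer = 2 * hidden * ffn_dim                        # up- and down-projections
total = embeddings + layers * (attn_per_layer + ffn_per_layer)

print(f"~{total / 1e9:.2f}B parameters")  # ~6.17B, in line with the quoted 6B/6.2B
```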
ChatGLM-6B is an open-source, bilingual (Chinese and English) dialogue language model developed by Tsinghua University's KEG Lab and Zhipu AI. It is built on the General Language Model (GLM) architecture and is optimized for conversational tasks, particularly Chinese question answering and dialogue. A key design goal was accessibility for local deployment on consumer-grade hardware: with INT4 quantization, the model can run in as little as 6 GB of GPU memory.
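As an illustration of that low-memory path, the snippet below follows the usage pattern documented in the THUDM/chatglm-6b repository: the model is loaded through Hugging Face transformers with trust_remote_code, and its weights are quantized to INT4 in place. The quantize() and chat() helpers come from the model's bundled remote code rather than the transformers library itself, so treat this as a sketch of the documented workflow rather than a stable API.

```python
from transformers import AutoTokenizer, AutoModel

# trust_remote_code pulls in ChatGLM's custom modeling code from the repository.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = (
    AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    .quantize(4)   # INT4 weight quantization, provided by the model's remote code
    .half()
    .cuda()
    .eval()
)

# chat() is a convenience method defined in the remote code.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```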
The model uses a Transformer-based architecture derived from the GLM framework and was pre-trained with a hybrid objective function on a corpus of roughly 1 trillion Chinese and English tokens. Pre-training was followed by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback (RLHF) to align the model's outputs with human preferences. The underlying GLM architecture uses a 2D positional encoding scheme.
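Concretely, the 2D scheme gives each token two position ids: the first indexes the corrupted text (Part A), with tokens of a generated span (Part B) reusing the position of their [MASK] placeholder, and the second is 0 throughout Part A and counts positions within each Part B span. The sketch below is an illustrative reading of the scheme as described in the GLM paper, not the model's actual implementation.

```python
# Illustrative sketch of GLM-style 2D positional encoding (not the official code).
def glm_2d_positions(part_a_len, mask_positions, span_lengths):
    """part_a_len:     tokens in the corrupted text (Part A), including [MASK]s.
       mask_positions: index of each [MASK] token within Part A.
       span_lengths:   length of each generated span (Part B), including its
                       leading [START] token."""
    pos_1 = list(range(part_a_len))            # Part A: plain absolute positions
    pos_2 = [0] * part_a_len                   # Part A: second dimension is 0
    for mask_pos, span_len in zip(mask_positions, span_lengths):
        pos_1 += [mask_pos] * span_len         # Part B: reuse the [MASK] position
        pos_2 += list(range(1, span_len + 1))  # Part B: intra-span positions
    return pos_1, pos_2

# Example: 6-token Part A with one [MASK] at index 2 whose span has 3 tokens.
print(glm_2d_positions(6, [2], [3]))
# -> ([0, 1, 2, 3, 4, 5, 2, 2, 2], [0, 0, 0, 0, 0, 0, 1, 2, 3])
```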
Despite its relatively compact size of 6.2 billion parameters, ChatGLM-6B generates coherent, contextually relevant responses. Its emphasis on computational efficiency allows deployment and inference on common GPU configurations, which broadens its applicability for researchers and developers. The model suits a range of natural language processing tasks, including machine translation, general question answering, and interactive chatbot applications, particularly in bilingual Chinese-English settings.
The ChatGLM series of models from Z.ai is based on the GLM architecture.
No evaluation benchmarks are available for ChatGLM-6B.
VRAM requirements depend on the quantization method chosen for the model weights and on the context size.
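A rough sense of those requirements can be sketched as weights plus KV cache. The estimate below ignores activations and framework overhead, assumes an FP16 KV cache, and uses the multi-head attention layout from the table above, so the real footprint will be somewhat higher.

```python
# Rough VRAM estimate for ChatGLM-6B: quantized weights + FP16 KV cache.
# Ignores activations and framework overhead, so actual usage is higher.
N_PARAMS = 6.2e9
HIDDEN, LAYERS = 4096, 28
BYTES_PER_WEIGHT = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def vram_gib(quant: str, context_tokens: int) -> float:
    weights = N_PARAMS * BYTES_PER_WEIGHT[quant]
    # KV cache: 2 tensors (K and V) * layers * hidden dim * context * 2 bytes (FP16)
    kv_cache = 2 * LAYERS * HIDDEN * context_tokens * 2
    return (weights + kv_cache) / 1024**3

for quant in ("fp16", "int8", "int4"):
    print(f"{quant:>4} @ 2,048-token context: ~{vram_gib(quant, 2048):.1f} GiB")
# INT4 lands under 4 GiB for weights + cache, leaving headroom within the
# ~6 GB figure quoted above once activations and overhead are included.
```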