Parameters
6B
Context Length
32,768 tokens
Modality
Text
Architecture
Dense
License
ChatGLM3-6B Model License
Release Date
27 Oct 2023
Knowledge Cutoff
-
Attention Structure
Multi-Query Attention
Hidden Dimension Size
4096
Number of Layers
28
Attention Heads
32
Key-Value Heads
2
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
Rotary Position Embedding (RoPE)
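As a quick sanity check, the dimensions listed above roughly reproduce the 6B parameter count. The sketch below assumes the published ChatGLM3 config values for the feed-forward width (13696) and vocabulary size (65024), neither of which appears in the table, and untied input/output embeddings:

```python
# Back-of-the-envelope parameter tally from the table above.
# ffn_hidden_size (13696) and vocab_size (65024) come from the published
# ChatGLM3 config and are assumptions here, not values from the table.
hidden, layers, heads, kv_heads = 4096, 28, 32, 2
head_dim = hidden // heads                           # 128
ffn, vocab = 13696, 65024

qkv = hidden * (hidden + 2 * kv_heads * head_dim)    # fused Q/K/V projection
attn = qkv + hidden * hidden                         # plus output projection
mlp = hidden * (2 * ffn) + ffn * hidden              # SwiGLU: gate + up, then down
per_layer = attn + mlp                               # roughly 204M per layer
total = layers * per_layer + 2 * vocab * hidden      # plus untied embeddings
print(f"~{total / 1e9:.2f}B parameters")             # ~6.24B
```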
ChatGLM3-6B-32K is an advanced large language model optimized for long-context understanding and generation. Developed through a collaboration between Zhipu AI and Tsinghua University's KEG Lab, this model is a specialized variant of the ChatGLM3-6B architecture, engineered to extend the effective context window to 32,768 tokens. This expansion allows it to process lengthy documents, long-form dialogues, and complex technical texts that exceed the context limits of standard transformer-based models.
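A minimal way to load the model for long-document inference, following the usage pattern published on the model's Hugging Face card (the `chat` helper is provided by the model's remote code; the prompt and document here are illustrative):

```python
from transformers import AutoTokenizer, AutoModel

# trust_remote_code pulls in the custom ChatGLM3 modeling code, which
# implements the extended 32K context handling and the chat() helper.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True).half().cuda()
model = model.eval()

long_document = "..."  # up to roughly 32K tokens of input
response, history = model.chat(
    tokenizer, f"Summarize the following report:\n{long_document}", history=[]
)
print(response)
```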
The model's architecture is built upon a 28-layer dense transformer framework. It incorporates several refinements to maintain stability and performance across the extended context, including RMSNorm for normalization and Multi-Query Attention (MQA) to shrink the key-value cache and speed up inference. A key change in this variant is the updated Rotary Position Embedding (RoPE) mechanism, which scales the base frequency via a rope_ratio factor so that positional resolution is preserved across the full 32K window. In addition, the model is further trained at the 32K context length during the conversation stage to strengthen long-text coherence.
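A minimal sketch of how scaling the RoPE base frequency stretches the usable position range. The `rope_ratio` name mirrors the config field mentioned above, but its concrete value for this checkpoint, and the exact way the remote code applies the scaling, are assumptions here:

```python
import torch

def rope_inv_freq(dim: int, base: float = 10000.0, rope_ratio: float = 1.0) -> torch.Tensor:
    """Inverse frequencies for rotary position embeddings.

    rope_ratio > 1 enlarges the effective base, lengthening the rotation
    wavelengths so that positions stay distinguishable over a longer window.
    """
    effective_base = base * rope_ratio
    return 1.0 / (effective_base ** (torch.arange(0, dim, 2).float() / dim))

def rope_angles(positions: torch.Tensor, inv_freq: torch.Tensor) -> torch.Tensor:
    # One rotation angle per (position, frequency) pair.
    return torch.outer(positions.float(), inv_freq)

# head_dim = 4096 / 32 = 128; compare the vanilla base with a scaled one.
pos = torch.arange(32768)
vanilla = rope_angles(pos, rope_inv_freq(128))                   # base 10000
scaled = rope_angles(pos, rope_inv_freq(128, rope_ratio=50.0))   # ratio is illustrative
```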
Designed for technical versatility, ChatGLM3-6B-32K natively supports tool invocation through function calling, code execution via an integrated code interpreter, and complex agent-based tasks. These features make it highly suitable for building sophisticated AI agents capable of deep text analysis and multi-step reasoning. The model's weights are open for academic research and available for free commercial use following a formal registration process, reflecting a commitment to accessible high-performance natural language processing.
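A hedged sketch of the tool-invocation flow, reusing the model and tokenizer loaded above and following the pattern in the official ChatGLM3 repository's tool demo: tools are declared in a system message, the model replies with a structured call, and the tool's result is fed back under the observation role. The `get_weather` tool and its schema are hypothetical, and the exact message format may differ between repository revisions:

```python
import json

# Hypothetical tool schema, declared to the model via the system message.
tools = [{
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}]
system_info = {
    "role": "system",
    "content": "Answer the following questions as best as you can. "
               "You have access to the following tools:",
    "tools": tools,
}

# First turn: with tools registered, the model may answer with a structured
# tool call (a dict carrying the tool name and parameters) instead of text.
response, history = model.chat(tokenizer, "What's the weather in Beijing?",
                               history=[system_info])

# Execute the call ourselves, then return the result as an observation.
if isinstance(response, dict):
    result = {"city": "Beijing", "temperature_c": 21}  # stand-in for a real API call
    response, history = model.chat(tokenizer, json.dumps(result),
                                   history=history, role="observation")
print(response)
```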
ChatGLM series models from Z.ai (Zhipu AI), based on the GLM architecture.
No evaluation benchmarks are currently available for ChatGLM3-6B-32K.
Overall Rank
-
Coding Rank
-