
ChatGLM3-6B-32K

Parameters

6B

Context Length

32,768 tokens

Modality

Text

Architecture

Dense

License

ChatGLM3-6B Model License

Release Date

27 Oct 2023

Knowledge Cutoff

-

Technical Specifications

Attention Structure

Multi-Query Attention

Hidden Dimension Size

4096

Number of Layers

28

Attention Heads

32

Key-Value Heads

2

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

Rotary Position Embedding (RoPE)

ChatGLM3-6B-32K

ChatGLM3-6B-32K is a large language model optimized for long-context understanding and generation. Developed through a collaboration between Zhipu AI and Tsinghua University's KEG Lab, it is a specialized variant of ChatGLM3-6B engineered to extend the effective context window to 32,768 tokens. This expansion allows it to process lengthy documents, long-form dialogues, and complex technical texts that exceed the limits of standard context windows.
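
For reference, the snippet below sketches how the published checkpoint (THUDM/chatglm3-6b-32k on Hugging Face) can be loaded through transformers and prompted with a long document. The file name long_report.txt is a placeholder, and the chat call relies on ChatGLM's bundled remote code.

```python
from transformers import AutoModel, AutoTokenizer

# Minimal usage sketch: load the 32K checkpoint and feed it a long document.
# trust_remote_code=True pulls in ChatGLM's custom modeling and chat code.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True)
model = AutoModel.from_pretrained(
    "THUDM/chatglm3-6b-32k", trust_remote_code=True
).half().cuda().eval()

# Placeholder input: any text far beyond a standard 2K-8K context window.
with open("long_report.txt") as f:
    document = f.read()

response, history = model.chat(
    tokenizer,
    f"Summarize the key findings of the following report:\n\n{document}",
    history=[],
)
print(response)
```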

The model is built on a 28-layer dense transformer. It incorporates several refinements to maintain stability and performance across the extended context, including RMSNorm for normalization and Multi-Query Attention (two key-value heads shared across 32 query heads) to shrink the KV cache at inference time. The key change in this variant is its updated Rotary Position Embedding (RoPE), which scales the base frequency by a rope_ratio factor so that positions remain well resolved out to 32K tokens. The model is also further trained on long texts during the dialogue stage to strengthen long-range coherence.
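
To make the positional-encoding change concrete, the sketch below implements RoPE with the base frequency multiplied by a rope_ratio factor, the mechanism described above. The rope_ratio=50.0 in the usage lines is illustrative only, not the model's actual configured value.

```python
import torch

def rope_cache(seq_len: int, head_dim: int, base: float = 10000.0,
               rope_ratio: float = 1.0):
    """Precompute cos/sin tables for rotary position embedding (RoPE).

    Multiplying the base by rope_ratio stretches the rotation periods,
    keeping positions distinguishable far beyond the original window.
    """
    inv_freq = 1.0 / ((base * rope_ratio)
                      ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)  # (seq_len, head_dim/2)
    return angles.cos(), angles.sin()

def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    """Rotate (even, odd) channel pairs of x by position-dependent angles.

    The rotated halves are concatenated rather than re-interleaved, which is
    fine for a self-contained demo of the rotation itself.
    """
    x1, x2 = x[..., 0::2], x[..., 1::2]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Illustrative values: head_dim 128 matches 4096 hidden / 32 heads.
cos, sin = rope_cache(seq_len=32768, head_dim=128, rope_ratio=50.0)
q = torch.randn(32768, 128)
q_rotated = apply_rope(q, cos, sin)
```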

Designed for technical versatility, ChatGLM3-6B-32K natively supports tool invocation through function calling, code execution via an integrated code interpreter, and complex agent-based tasks. These features make it highly suitable for building sophisticated AI agents capable of deep text analysis and multi-step reasoning. The model's weights are open for academic research and available for free commercial use following a formal registration process, reflecting a commitment to accessible high-performance natural language processing.
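
Tool invocation follows the pattern shown in the ChatGLM3 repository demos: tools are declared in a system turn, and the model replies with the name and arguments of the call it wants made. The sketch below reuses the model and tokenizer loaded earlier; the get_weather tool and its schema are invented for illustration, and the exact message fields should be checked against the official demo code.

```python
# Hypothetical tool declaration, following the schema used in ChatGLM3's demos.
tools = [{
    "name": "get_weather",  # invented example tool
    "description": "Query the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"description": "Name of the city"}},
        "required": ["city"],
    },
}]

system_info = {
    "role": "system",
    "content": "Answer the following questions as best as you can. "
               "You have access to the following tools:",
    "tools": tools,
}

# The model answers with the tool call it wants executed; the caller runs the
# tool and feeds the result back as an observation turn to get a final answer.
response, history = model.chat(tokenizer, "What's the weather in Beijing?",
                               history=[system_info])
print(response)
```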

About ChatGLM

The ChatGLM series of models from Z.ai, built on the GLM architecture.



Evaluation Benchmarks

No evaluation benchmarks are available for ChatGLM3-6B-32K.

Rankings

Overall Rank

-

Coding Rank

-

GPU Requirements

VRAM requirements depend on the quantization chosen for the model weights (FP16, 8-bit, or 4-bit) and on the active context size, from 1K up to the full 32K tokens; a rough estimate follows below.
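
As a rough stand-in for an interactive calculator, the sketch below estimates VRAM from the figures on this page (6B parameters, 28 layers, 2 KV heads, 128-dimensional heads). The 20% overhead allowance and the per-weight byte widths are assumptions, not measured values.

```python
def estimate_vram_gib(params_b: float = 6.0, bytes_per_weight: float = 2.0,
                      context: int = 32768, n_layers: int = 28,
                      n_kv_heads: int = 2, head_dim: int = 128,
                      kv_bytes: float = 2.0, overhead: float = 1.2) -> float:
    """Back-of-the-envelope VRAM estimate: weights + KV cache, plus overhead.

    Defaults mirror this spec sheet (6B params, 28 layers, 2 KV heads,
    4096 hidden / 32 heads = 128 head dim). bytes_per_weight is 2.0 for
    FP16, ~1.0 for 8-bit, ~0.5 for 4-bit quantization. The overhead factor
    is a rough allowance for activations and framework buffers.
    """
    weights = params_b * 1e9 * bytes_per_weight
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * context * kv_bytes  # K and V
    return (weights + kv_cache) * overhead / 1024**3

print(f"FP16  @ 32K context: ~{estimate_vram_gib():.1f} GiB")
print(f"4-bit @ 32K context: ~{estimate_vram_gib(bytes_per_weight=0.5):.1f} GiB")
```

Note that with only 2 key-value heads, the full 32K-token KV cache adds under 1 GiB at FP16, which is precisely the inference-efficiency benefit of multi-query attention described above.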
