ChatGLM3-6B

Parameters

6B

Context Length

8K (8,192 tokens)

Modality

Text

Architecture

Dense

License

Apache 2.0

Release Date

27 Oct 2023

Knowledge Cutoff

Jul 2023

Technical Specifications

Attention Structure

Grouped-Query Attention

Hidden Dimension Size

4096

Number of Layers

28

Attention Heads

32

Key-Value Heads

2

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

Rotary Position Embedding (RoPE)

ChatGLM3-6B

ChatGLM3-6B is an advanced bilingual (Chinese-English) large language model developed through a collaboration between Zhipu AI and the Knowledge Engineering Group at Tsinghua University. As the third generation in the ChatGLM series, this model implements a refined General Language Model architecture that bridges the functional divide between autoencoding and autoregressive objectives. The pre-training phase utilizes a diverse corpus comprising approximately one trillion tokens, optimized for conversational coherence and instruction following across multiple domains including mathematics, programming, and logical reasoning.

Technically, the model is built on a dense Transformer architecture featuring grouped-query attention (32 query heads sharing 2 key-value heads) and Rotary Positional Embeddings (RoPE) for efficient sequence handling. A significant advancement in the ChatGLM3 iteration is its native support for agent-centric workflows, including function calling and code execution via an integrated interpreter. This functionality is supported by a redesigned prompt format that facilitates structured interactions and multi-turn dialogue management, making the model suitable for deployment in scenarios requiring autonomous task execution.
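The redesigned prompt format mentioned above tags each turn with a role marker. The following sketch illustrates the idea with a hypothetical `build_prompt` helper; the role names (`system`, `user`, `assistant`, `observation`) follow the ChatGLM3 conventions, but the real tokenizer maps these markers to dedicated special tokens rather than plain strings.

```python
# Illustrative sketch of a role-tagged multi-turn prompt in the ChatGLM3 style.
# In practice the official tokenizer handles these markers as special tokens;
# this string-based version is only a conceptual approximation.
ROLES = {"system", "user", "assistant", "observation"}

def build_prompt(turns):
    """turns: list of (role, text) pairs; returns a flattened prompt string."""
    parts = []
    for role, text in turns:
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        parts.append(f"<|{role}|>\n{text}")
    parts.append("<|assistant|>")  # cue the model to generate its reply
    return "\n".join(parts)

prompt = build_prompt([
    ("system", "Answer concisely."),
    ("user", "What is 2 + 2?"),
])
print(prompt)
```

The `observation` role is what distinguishes agent workflows: tool results are fed back to the model under that marker rather than as user text, so the model can separate external data from human instructions.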

Designed for local and edge deployment, ChatGLM3-6B maintains a low computational footprint while delivering enhanced performance relative to its predecessors. It utilizes SwiGLU activation functions and RMSNorm for stable training, with a vocabulary expanded to support efficient bilingual tokenization. The model's versatility is demonstrated through its ability to handle a variety of downstream applications, from standard question-answering to sophisticated agentic behaviors, all while operating within a context window optimized for standard conversational tasks.
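Part of that low footprint comes from the grouped-query attention layout in the specifications above. A back-of-envelope calculation, assuming fp16 (2-byte) cache entries and a head dimension of 4096 / 32 = 128, shows how much KV-cache memory the 2 key-value heads save per token of context:

```python
# Back-of-envelope KV-cache size per token for ChatGLM3-6B, using the spec
# sheet above: 28 layers, 32 query heads, 2 key-value heads, hidden size 4096.
# Assumes fp16 (2 bytes per value); purely illustrative arithmetic.
layers, q_heads, kv_heads = 28, 32, 2
head_dim = 4096 // q_heads  # 128
bytes_fp16 = 2

def kv_bytes_per_token(n_kv_heads):
    # K and V each store n_kv_heads * head_dim values in every layer
    return 2 * layers * n_kv_heads * head_dim * bytes_fp16

gqa = kv_bytes_per_token(kv_heads)   # grouped-query: 28,672 bytes (~28 KiB)
mha = kv_bytes_per_token(q_heads)    # full multi-head: 458,752 bytes (~448 KiB)
print(gqa, mha, mha // gqa)          # 16x smaller cache per token
```

At the full 8,192-token context, that is roughly 224 MiB of KV cache instead of about 3.5 GiB, which is what makes consumer-GPU and edge deployment practical.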

About ChatGLM

The ChatGLM series of models from Z.ai, based on the GLM architecture.


Evaluation Benchmarks

Benchmark

WebDev Arena (Web Development)

Score

1056

Benchmark Rank

#63

Rankings

Overall Rank

#102

Coding Rank

#93
