
Typhoon-2-8B

Parameters

8B

Context Length

128K

Modality

Text

Architecture

Dense

License

Apache-2.0

Release Date

1 Jun 2024

Knowledge Cutoff

Mar 2023

Technical Specifications

Attention Structure

Multi-Head Attention

Hidden Dimension Size

4096

Number of Layers

32

Attention Heads

32

Key-Value Heads

8

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

Rotary Position Embedding (RoPE)

Typhoon-2-8B

Typhoon-2-8B is a large language model specifically engineered to address the linguistic requirements of the Thai language while maintaining the broad capabilities of the Llama 3 architecture. Developed by SCB 10X, the model undergoes a specialized training process that involves extending the base tokenizer with Thai-specific tokens and performing continual pre-training on a high-quality Thai corpus. This adaptation ensures that the model can process Thai text with higher efficiency and accuracy compared to general-purpose multilingual models, particularly in domains such as Thai law, local administration, and cultural contexts.
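The gain from extending the tokenizer can be seen with a toy example. The sketch below uses a greedy longest-match tokenizer as a stand-in (the actual Llama 3 tokenizer is BPE-based, and the Thai tokens shown are illustrative): without Thai-specific entries, Thai text falls back to character-level pieces, inflating sequence length.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization against a vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # fall back to a single character
            i += 1
    return tokens

text = "ภาษาไทย"  # "Thai language"
base_vocab = set(text)                          # character-level pieces only
extended_vocab = base_vocab | {"ภาษา", "ไทย"}   # add Thai word-level tokens

print(len(tokenize(text, base_vocab)))      # 7 character tokens
print(len(tokenize(text, extended_vocab)))  # 2 word-level tokens
```

Shorter token sequences mean more Thai text fits in the context window and fewer forward passes per generated word, which is the efficiency gain the continual pre-training then teaches the model to exploit.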

The technical architecture follows a dense transformer structure utilizing Grouped-Query Attention (GQA) to optimize inference speed and memory consumption. It incorporates Rotary Positional Embeddings (RoPE) and is configured with a context window of 128,000 tokens, enabling the processing of long-form documents and complex multi-turn conversations. The model utilizes the SwiGLU activation function and Root Mean Square Layer Normalization (RMSNorm) to stabilize training and improve representation learning across its 32 layers.
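The memory saving from GQA follows directly from the spec table above: with 8 key-value heads instead of 32, the KV cache shrinks fourfold. A quick back-of-the-envelope calculation (assuming fp16 cache entries):

```python
# Figures from the spec table: 32 layers, 32 attention heads,
# 8 key-value heads, hidden dimension 4096.
layers, heads, kv_heads, hidden = 32, 32, 8, 4096
head_dim = hidden // heads  # 128
bytes_per_value = 2         # fp16

def kv_cache_bytes_per_token(n_kv_heads):
    # one key vector and one value vector per KV head per layer
    return 2 * layers * n_kv_heads * head_dim * bytes_per_value

mha = kv_cache_bytes_per_token(heads)     # full multi-head attention
gqa = kv_cache_bytes_per_token(kv_heads)  # grouped-query attention
print(mha // 1024, "KiB vs", gqa // 1024, "KiB per token")  # 512 vs 128
print("reduction:", mha // gqa, "x")                        # 4 x
```

At the full 128K-token context this is the difference between roughly 64 GB and 16 GB of KV cache per sequence, which is why GQA matters for long-context inference on this model.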

Function calling capabilities are integrated into the model, allowing it to interact with external tools and APIs by generating structured data outputs. This functionality makes it suitable for agentic workflows, automated administrative tasks, and specialized information retrieval systems where precise Thai language understanding is required. The model is released under the Apache 2.0 license, facilitating both research and commercial applications in the Thai technology ecosystem.
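In practice, function calling means the application supplies tool schemas and the model emits a structured call rather than free text. The sketch below uses the common OpenAI-style JSON schema convention; the tool name, arguments, and exact prompt template Typhoon-2-8B expects are hypothetical and should be checked against the model's chat template.

```python
import json

# Hypothetical tool schema (OpenAI-style); only illustrative.
tool = {
    "name": "lookup_law",
    "description": "Retrieve a section of Thai law by code and section number.",
    "parameters": {
        "type": "object",
        "properties": {
            "code": {"type": "string"},
            "section": {"type": "integer"},
        },
        "required": ["code", "section"],
    },
}

# A function-calling model returns structured output like this,
# which the application parses and dispatches to the real tool.
model_output = '{"name": "lookup_law", "arguments": {"code": "civil", "section": 420}}'
call = json.loads(model_output)

assert call["name"] == tool["name"]
missing = [k for k in tool["parameters"]["required"]
           if k not in call["arguments"]]
print("dispatch:", call["name"], call["arguments"], "missing:", missing)
```

Validating the parsed arguments against the schema's `required` list before dispatching, as above, is the usual guard against malformed generations in agentic pipelines.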

About Typhoon

Typhoon is a Thai language model family developed by SCB 10X. It is specifically optimized for the Thai language, addressing complexities such as the lack of word delimiters and tonal nuances. The models are trained on Thai-centric datasets including legal, cultural, and historical documents to ensure localized context and knowledge.



Evaluation Benchmarks

No evaluation benchmarks for Typhoon-2-8B available.

Rankings

Overall Rank

-

Coding Rank

-
