Parameters
8B
Context Length
128K
Modality
Text
Architecture
Dense
License
Apache-2.0
Release Date
1 Jun 2024
Knowledge Cutoff
Mar 2023
Attention Structure
Grouped-Query Attention
Hidden Dimension Size
4096
Number of Layers
32
Attention Heads
32
Key-Value Heads
8
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
Rotary Position Embedding (RoPE)
Typhoon-2-8B is a large language model specifically engineered to address the linguistic requirements of the Thai language while maintaining the broad capabilities of the Llama 3 architecture. Developed by SCB 10X, the model undergoes a specialized training process that extends the base tokenizer with Thai-specific tokens and performs continual pre-training on a high-quality Thai corpus. This adaptation lets the model process Thai text more efficiently and accurately than general-purpose multilingual models, particularly in domains such as Thai law, local administration, and cultural contexts.
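The tokenizer-extension step described above can be sketched in miniature. The vocabulary, token IDs, and Thai tokens below are purely illustrative stand-ins; the real pipeline extends the Llama 3 BPE tokenizer with a curated Thai token list and then resizes the model's embedding matrix to match.

```python
# Toy sketch of extending a base vocabulary with Thai-specific tokens
# (hypothetical vocab and IDs, not the actual Llama 3 tokenizer).

def extend_vocab(vocab: dict, new_tokens: list) -> dict:
    """Append tokens not already present, assigning fresh consecutive IDs."""
    extended = dict(vocab)
    next_id = max(vocab.values()) + 1
    for tok in new_tokens:
        if tok not in extended:
            extended[tok] = next_id
            next_id += 1
    return extended

# Tiny base vocabulary standing in for the Llama 3 BPE vocab.
base_vocab = {"<s>": 0, "</s>": 1, "hello": 2}
# Hypothetical Thai additions ("hello", "thank you"); duplicates are skipped.
thai_tokens = ["สวัสดี", "ขอบคุณ", "hello"]

extended = extend_vocab(base_vocab, thai_tokens)
print(len(extended))  # 5: two new Thai tokens were added
```

After such an extension, the embedding table must grow by the same number of rows, and continual pre-training on the Thai corpus then trains those new embeddings.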
The technical architecture follows a dense transformer structure utilizing Grouped-Query Attention (GQA) to optimize inference speed and memory consumption. It incorporates Rotary Positional Embeddings (RoPE) and is configured with a context window of 128,000 tokens, enabling the processing of long-form documents and complex multi-turn conversations. The model utilizes the SwiGLU activation function and Root Mean Square Layer Normalization (RMSNorm) to stabilize training and improve representation learning across its 32 layers.
Function calling capabilities are integrated into the model, allowing it to interact with external tools and APIs by generating structured data outputs. This functionality makes it suitable for agentic workflows, automated administrative tasks, and specialized information retrieval systems where precise Thai language understanding is required. The model is released under the Apache 2.0 license, facilitating both research and commercial applications in the Thai technology ecosystem.
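As a sketch of how such structured outputs are consumed, the snippet below parses a tool call and dispatches it to a local function. The `"name"`/`"arguments"` JSON shape is a common convention, not Typhoon's documented wire format, and `get_weather` is a hypothetical stub; consult the model's documentation for the exact schema and chat template.

```python
# Hedged sketch: dispatching a function-calling response.
# The JSON shape and the tool itself are illustrative assumptions.
import json

def dispatch(tool_call_json: str, tools: dict):
    """Parse a tool-call JSON string and invoke the matching function."""
    call = json.loads(tool_call_json)
    fn = tools[call["name"]]
    return fn(**call["arguments"])

def get_weather(city: str) -> str:
    # Stub standing in for a real weather API.
    return f"Weather for {city}: 32°C, partly cloudy"

# Hypothetical structured output emitted by the model.
raw = '{"name": "get_weather", "arguments": {"city": "Bangkok"}}'
print(dispatch(raw, {"get_weather": get_weather}))
```

In an agentic loop, the function's return value would be fed back to the model as a tool message for the next generation step.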
Typhoon is a Thai language model family developed by SCB 10X. It is specifically optimized for the Thai language, addressing complexities such as the lack of word delimiters and tonal nuances. The models are trained on Thai-centric datasets including legal, cultural, and historical documents to ensure localized context and knowledge.
No evaluation benchmarks are available for Typhoon-2-8B.