趋近智
参数
8B
上下文长度
8.192K
模态
Text
架构
Dense
许可证
Llama-3.1-Community
发布日期
14 Nov 2024
训练数据截止日期
Mar 2023
注意力结构
Multi-Head Attention
隐藏维度大小
4096
层数
32
注意力头
32
键值头
8
激活函数
SwigLU
归一化
RMS Normalization
位置嵌入
Absolute Position Embedding
Sahabat-AI-Llama3-8B-Instruct is a specialized large language model developed through a collaboration between GoTo Group and Indosat Ooredoo Hutchison. This model is constructed using a continued pre-training (CPT) approach on the Meta Llama 3 architecture, specifically optimized to reflect the linguistic patterns and cultural context of Indonesia. By incorporating a significant corpus of Indonesian text and regional languages such as Javanese and Sundanese, the model provides localized language processing capabilities that account for regional idioms and social contexts.
The technical framework is a dense, decoder-only Transformer architecture comprising 32 layers and a hidden dimension of 4096. It employs Grouped Query Attention (GQA) with 32 query heads and 8 key-value heads to improve inference efficiency. The model utilizes Rotary Positional Embeddings (RoPE) for sequence modeling and SwiGLU activation functions within its feed-forward layers. Training was facilitated by the NVIDIA NeMo framework, allowing the weights to be refined on a dataset of approximately 50 billion tokens, followed by supervised fine-tuning on hundreds of thousands of instruction-completion pairs.
This instruction-tuned variant is designed for high-quality interactions in both formal and informal Indonesian. It addresses specific cultural sensitivities and linguistic variations that are often missing in general-purpose global models. Primary applications include automated customer support for the Indonesian market, localized content synthesis, and technical assistance within the regional digital ecosystem. The model is compatible with the Transformers library and optimized for deployment on standardized accelerated computing infrastructure.
Sahabat-AI is an Indonesian language model family co-initiated by GoTo and Indosat Ooredoo Hutchison. Developed with AI Singapore and NVIDIA, it is a collection of models (based on Gemma 2 and Llama 3) specifically optimized for Bahasa Indonesia and regional languages like Javanese and Sundanese.
没有可用的 Sahabat-AI-Llama3-8B-Instruct 评估基准。