趋近智
参数
9.2B
上下文长度
8.192K
模态
Text
架构
Dense
许可证
Gemma-Community
发布日期
14 Nov 2024
训练数据截止日期
Mar 2024
注意力结构
Multi-Head Attention
隐藏维度大小
3584
层数
42
注意力头
16
键值头
8
激活函数
Gated GELU
归一化
RMS Normalization
位置嵌入
Absolute Position Embedding
Sahabat-AI-Gemma2-9B-Instruct is a specialized large language model developed through a strategic collaboration between GoTo Group, Indosat Ooredoo Hutchison, and AI Singapore. Built upon the Google Gemma 2 architecture, this variant is the result of continued pre-training (CPT) and intensive instruction tuning specifically tailored for the Indonesian linguistic ecosystem. It is engineered to provide high-fidelity conversational capabilities not only in standard Bahasa Indonesia but also in major regional dialects, including Javanese and Sundanese, addressing the cultural and linguistic nuances inherent to the Indonesian archipelago.
The underlying architecture follows a decoder-only transformer design that incorporates several modern refinements for efficiency and stability. It utilizes Grouped-Query Attention (GQA) to optimize inference throughput and memory bandwidth, which is particularly effective for maintaining performance during long-context processing. For training stability and representational accuracy, the model employs RMSNorm for pre- and post-normalization across layers and integrates logit soft-capping to prevent divergence. The instruction-tuning phase involved a supervised fine-tuning process using a localized dataset of over 600,000 instruction-completion pairs, followed by on-policy alignment and model merging to refine its response quality and adherence to complex prompts.
Technically, the model is optimized for a wide array of natural language processing tasks, including sentiment analysis, toxicity detection, causal reasoning, and abstractive summarization within Southeast Asian contexts. By leveraging the base Gemma 2 9B weights, it inherits a robust world-knowledge foundation while specializing in regional idioms and cultural contexts that are often underrepresented in global models. This makes it a suitable candidate for developers building localized digital assistants, automated customer service interfaces, and educational tools designed for the Indonesian market.
Sahabat-AI is an Indonesian language model family co-initiated by GoTo and Indosat Ooredoo Hutchison. Developed with AI Singapore and NVIDIA, it is a collection of models (based on Gemma 2 and Llama 3) specifically optimized for Bahasa Indonesia and regional languages like Javanese and Sundanese.
没有可用的 Sahabat-AI-Gemma2-9B-Instruct 评估基准。