Sahabat-AI-Gemma2-9B

Parameters: 9.2B

Context length: 8,192 tokens

Modality: Text

Architecture: Dense

License: Gemma Community License

Release date: 14 Nov 2024

Training data cutoff: -

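Technical specifications

Attention structure: Grouped-Query Attention

Hidden dimension size: 3584

Layers: 42

Attention heads: 16

Key-value heads: 8

Activation function: Gated GELU (GeGLU)

Normalization: RMS Normalization

Position embedding: Rotary Position Embedding (RoPE)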

Sahabat-AI-Gemma2-9B

Sahabat-AI-Gemma2-9B is a large language model specialized for Bahasa Indonesia and for regional languages of the Indonesian archipelago such as Javanese and Sundanese. Developed through a collaboration between GoTo and Indosat Ooredoo Hutchison, with technical support from AI Singapore and NVIDIA, the model is built on the Gemma 2 9B architecture. It underwent continued pre-training (CPT) on approximately 50 billion tokens of Indonesian-centric data. This localized training is intended to capture cultural context and grammatical nuances that general-purpose multilingual models often miss.

The architecture follows the dense decoder-only transformer design of Gemma 2. It uses Grouped-Query Attention (GQA) with 16 query heads and 8 key-value heads, so each pair of query heads shares one key-value head. This halves the key-value cache relative to full multi-head attention and reduces memory bandwidth requirements during generation. The layers alternate between global attention and local sliding-window attention, which balances long-range dependency modeling against compute cost. The model uses the GeGLU activation function and applies RMSNorm both before and after each sublayer (pre-norm and post-norm) to keep activations stable across its 42 layers.
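The effect of sharing key-value heads can be made concrete with a rough cache-size estimate. The sketch below is illustrative only: the layer count and head counts come from the specifications on this page, while `head_dim=256` (the published Gemma 2 9B head size) and 2 bytes per value (fp16/bf16) are assumptions not stated here. It also treats every layer as caching the full context, so it is an upper bound; local sliding-window layers could cache less.

```python
def kv_cache_bytes(seq_len, n_layers=42, n_kv_heads=8,
                   head_dim=256, bytes_per_value=2):
    """Upper-bound KV-cache size in bytes for one sequence.

    head_dim and bytes_per_value are assumptions (see lead-in).
    The leading factor 2 covers one key tensor and one value tensor per layer.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


GIB = 1024 ** 3

# GQA as specified: 8 key-value heads shared by 16 query heads.
gqa_bytes = kv_cache_bytes(seq_len=8192)

# Hypothetical multi-head baseline: one key-value head per query head.
mha_bytes = kv_cache_bytes(seq_len=8192, n_kv_heads=16)

print(f"GQA cache at 8,192 tokens: {gqa_bytes / GIB:.3f} GiB")
print(f"MHA cache at 8,192 tokens: {mha_bytes / GIB:.3f} GiB")
```

Under these assumptions the cache at the full 8,192-token context is 2.625 GiB with 8 key-value heads, against 5.25 GiB for the hypothetical 16-head baseline.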

Sahabat-AI-Gemma2-9B targets tasks such as multilingual question answering, sentiment analysis, and translation. It uses Rotary Position Embeddings (RoPE) and logit soft-capping, which bounds logits to a fixed range to keep training stable and improve generation quality. As an open-weights release under the Gemma Community License, it gives developers a base for building localized AI services, from enterprise virtual assistants to educational tools for Indonesian users.
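Logit soft-capping is commonly written as `cap * tanh(logit / cap)`, which is close to the identity for small logits and saturates smoothly at `±cap`. The following is a minimal sketch of that formula in plain Python, not the model's actual implementation. The cap value of 30.0 is an assumption taken from the published Gemma 2 configuration for final output logits; this page does not state the values used.

```python
import math


def soft_cap(logit: float, cap: float) -> float:
    """Squash a logit smoothly into the interval [-cap, cap]."""
    return cap * math.tanh(logit / cap)


# Assumed cap for final output logits (not stated on this page).
FINAL_LOGIT_CAP = 30.0

for raw in (-500.0, -5.0, 0.0, 5.0, 500.0):
    capped = soft_cap(raw, FINAL_LOGIT_CAP)
    print(f"raw={raw:8.1f}  capped={capped:8.3f}")
```

Small logits pass through almost unchanged, while extreme values are pulled toward the cap instead of being clipped abruptly, so the function stays differentiable everywhere.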

About Sahabat-AI

Sahabat-AI is an Indonesian language model family co-initiated by GoTo and Indosat Ooredoo Hutchison. Developed with AI Singapore and NVIDIA, it is a collection of models (based on Gemma 2 and Llama 3) specifically optimized for Bahasa Indonesia and regional languages like Javanese and Sundanese.


Evaluation benchmarks

No evaluation benchmarks are available for Sahabat-AI-Gemma2-9B.

Rankings

Overall rank: -

Coding rank: -

Model transparency

Overall score: B+ (73 / 100)
