ApX 标志ApX 标志

趋近智

Sahabat-AI-Gemma2-9B-Instruct

参数

9.2B

上下文长度

8.192K

模态

Text

架构

Dense

许可证

Gemma-Community

发布日期

14 Nov 2024

训练数据截止日期

Mar 2024

技术规格

注意力结构

Multi-Head Attention

隐藏维度大小

3584

层数

42

注意力头

16

键值头

8

激活函数

Gated GELU

归一化

RMS Normalization

位置嵌入

Absolute Position Embedding

Sahabat-AI-Gemma2-9B-Instruct

Sahabat-AI-Gemma2-9B-Instruct is a specialized large language model developed through a strategic collaboration between GoTo Group, Indosat Ooredoo Hutchison, and AI Singapore. Built upon the Google Gemma 2 architecture, this variant is the result of continued pre-training (CPT) and intensive instruction tuning specifically tailored for the Indonesian linguistic ecosystem. It is engineered to provide high-fidelity conversational capabilities not only in standard Bahasa Indonesia but also in major regional dialects, including Javanese and Sundanese, addressing the cultural and linguistic nuances inherent to the Indonesian archipelago.

The underlying architecture follows a decoder-only transformer design that incorporates several modern refinements for efficiency and stability. It utilizes Grouped-Query Attention (GQA) to optimize inference throughput and memory bandwidth, which is particularly effective for maintaining performance during long-context processing. For training stability and representational accuracy, the model employs RMSNorm for pre- and post-normalization across layers and integrates logit soft-capping to prevent divergence. The instruction-tuning phase involved a supervised fine-tuning process using a localized dataset of over 600,000 instruction-completion pairs, followed by on-policy alignment and model merging to refine its response quality and adherence to complex prompts.

Technically, the model is optimized for a wide array of natural language processing tasks, including sentiment analysis, toxicity detection, causal reasoning, and abstractive summarization within Southeast Asian contexts. By leveraging the base Gemma 2 9B weights, it inherits a robust world-knowledge foundation while specializing in regional idioms and cultural contexts that are often underrepresented in global models. This makes it a suitable candidate for developers building localized digital assistants, automated customer service interfaces, and educational tools designed for the Indonesian market.

关于 Sahabat-AI

Sahabat-AI is an Indonesian language model family co-initiated by GoTo and Indosat Ooredoo Hutchison. Developed with AI Singapore and NVIDIA, it is a collection of models (based on Gemma 2 and Llama 3) specifically optimized for Bahasa Indonesia and regional languages like Javanese and Sundanese.


其他 Sahabat-AI 模型

评估基准

没有可用的 Sahabat-AI-Gemma2-9B-Instruct 评估基准。

排名

排名

-

编程排名

-

模型透明度

总分

B+

73 / 100

GPU 要求

完整计算器

选择模型权重的量化方法

上下文大小:1024 个令牌

1k
4k
8k

所需显存:

推荐 GPU

Sahabat-AI-Gemma2-9B-Instruct:规格和 GPU 显存要求