
Sahabat-AI-Llama3-8B-Instruct

Parameters

8B

Context Length

8,192 tokens

Modality

Text

Architecture

Dense

License

Llama-3.1-Community

Release Date

14 Nov 2024

Training Data Cutoff

Mar 2023

Technical Specifications

Attention Structure

Grouped Query Attention

Hidden Dimension Size

4096

Number of Layers

32

Attention Heads

32

Key-Value Heads

8

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

Rotary Position Embedding (RoPE)

Sahabat-AI-Llama3-8B-Instruct

Sahabat-AI-Llama3-8B-Instruct is a specialized large language model developed through a collaboration between GoTo Group and Indosat Ooredoo Hutchison. This model is constructed using a continued pre-training (CPT) approach on the Meta Llama 3 architecture, specifically optimized to reflect the linguistic patterns and cultural context of Indonesia. By incorporating a significant corpus of Indonesian text and regional languages such as Javanese and Sundanese, the model provides localized language processing capabilities that account for regional idioms and social contexts.

The technical framework is a dense, decoder-only Transformer architecture comprising 32 layers and a hidden dimension of 4096. It employs Grouped Query Attention (GQA) with 32 query heads and 8 key-value heads to improve inference efficiency. The model utilizes Rotary Positional Embeddings (RoPE) for sequence modeling and SwiGLU activation functions within its feed-forward layers. Training was facilitated by the NVIDIA NeMo framework, allowing the weights to be refined on a dataset of approximately 50 billion tokens, followed by supervised fine-tuning on hundreds of thousands of instruction-completion pairs.
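The GQA layout described above directly determines the KV-cache footprint at inference time: each group of four query heads shares one key-value head, so the cache stores 8 rather than 32 head pairs per layer. A minimal sketch using only the dimensions listed on this card (hidden size 4096, 32 layers, 32 query heads, 8 key-value heads); the numbers are derived estimates, not measurements:

```python
# Attention shapes implied by the card's specs. These are back-of-envelope
# figures derived from the listed configuration, not read from the checkpoint.

HIDDEN = 4096
N_LAYERS = 32
N_Q_HEADS = 32
N_KV_HEADS = 8
HEAD_DIM = HIDDEN // N_Q_HEADS          # 128

# With GQA, each group of query heads shares one K/V head.
GROUP_SIZE = N_Q_HEADS // N_KV_HEADS    # 4 query heads per KV head

def kv_cache_bytes(seq_len: int, batch: int = 1, bytes_per_elem: int = 2) -> int:
    """Approximate fp16 KV-cache size for a given sequence length."""
    # 2 tensors (K and V) x layers x kv_heads x head_dim, per token
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_elem
    return batch * seq_len * per_token

print(kv_cache_bytes(8192) / 2**20, "MiB at the full 8,192-token context")
# -> 1024.0 MiB: exactly 1 GiB, a quarter of what full MHA (32 KV heads) would need
```

With standard multi-head attention the same cache would be four times larger, which is the main inference-efficiency benefit GQA provides.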

This instruction-tuned variant is designed for high-quality interactions in both formal and informal Indonesian. It addresses cultural sensitivities and linguistic variations that general-purpose global models often miss. Primary applications include automated customer support for the Indonesian market, localized content generation, and technical assistance within the regional digital ecosystem. The model is compatible with the Transformers library and optimized for deployment on standard GPU-accelerated infrastructure.
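Because the checkpoint is Llama 3-based, instruct prompts follow Meta's Llama 3 chat-template format. The hand-rolled `build_prompt` helper below is hypothetical and for illustration only; in practice the tokenizer's `apply_chat_template` method assembles this for you. The example uses an Indonesian customer-support exchange:

```python
# Sketch: assembling a Llama 3-style instruct prompt by hand. The special
# tokens follow Meta's Llama 3 chat format, which this Llama 3-based
# checkpoint is expected to share.

def build_prompt(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt(
    # "You are a helpful customer service assistant."
    "Anda adalah asisten layanan pelanggan yang membantu.",
    # "How do I change my shipping address?"
    "Bagaimana cara mengubah alamat pengiriman saya?",
)
print(prompt)
```

The prompt ends with an open assistant header so the model's generation continues as the assistant turn.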

About Sahabat-AI

Sahabat-AI is an Indonesian language model family co-initiated by GoTo and Indosat Ooredoo Hutchison. Developed with AI Singapore and NVIDIA, it is a collection of models (based on Gemma 2 and Llama 3) specifically optimized for Bahasa Indonesia and regional languages like Javanese and Sundanese.



Evaluation Benchmarks

No evaluation benchmarks are available for Sahabat-AI-Llama3-8B-Instruct.

Rankings

Rank

-

Coding Rank

-

Model Transparency

Overall Score

B

67 / 100

GPU Requirements


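GPU memory needs scale with the quantization width chosen for the weights. A rough sketch for the 8B parameter count listed on this card; these are weights-only estimates (excluding KV cache and activation overhead), not measured requirements:

```python
# Back-of-envelope weight-memory estimate for an 8B-parameter model at
# common quantization widths. Weights only; no KV cache or runtime overhead.

N_PARAMS = 8e9  # parameter count from the card

def weight_gib(bits_per_param: float) -> float:
    """Approximate weight storage in GiB at the given precision."""
    return N_PARAMS * bits_per_param / 8 / 2**30

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{weight_gib(bits):.1f} GiB")
# fp16: ~14.9 GiB, int8: ~7.5 GiB, int4: ~3.7 GiB
```

In practice, add the KV cache and a safety margin on top of these figures when sizing a GPU.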