SEA-LION-7B: Specifications and GPU VRAM Requirements

SEA-LION-7B

Open source: Open weights
Parameters: 7.1B
Context length: 2,048 tokens
Modality: Text
Architecture: Dense

License: MIT

Release date: 1 Dec 2023
Training data cutoff: Sep 2023

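Technical Specifications

Attention structure: Multi-Head Attention
Hidden dimension size: 4096
Layers: 32
Attention heads: 32
Key-value heads: 32
Activation function: GELU
Normalization: Layer Normalization
Position embedding: Absolute Position Embedding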
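
As a quick sanity check on these figures, the sketch below collects them in a small Python dataclass and derives the per-head dimension. The class and field names are hypothetical and chosen only for illustration. The layer and head counts (32 each) come from the model description, and the key-value head count assumes standard multi-head attention, where every query head has its own key-value head.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Illustrative container for the published SEA-LION-7B figures (hypothetical names)."""

    hidden_size: int = 4096
    num_layers: int = 32
    num_heads: int = 32
    max_positions: int = 2048  # size of the learned absolute position table

    @property
    def num_kv_heads(self) -> int:
        # Assumption: plain multi-head attention, so KV heads == query heads.
        return self.num_heads

    @property
    def head_dim(self) -> int:
        # Each head receives an equal slice of the hidden dimension.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads


spec = ModelSpec()
print(spec.head_dim)      # 128
print(spec.num_kv_heads)  # 32
```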

SEA-LION-7B

SEA-LION-7B (Southeast Asian Languages In One Network) is a 7.1-billion-parameter decoder-only transformer developed by AI Singapore to address the linguistic and cultural specifics of Southeast Asia. It is built on the MosaicML Pretrained Transformer (MPT) architecture and was trained from scratch on a corpus of 980 billion tokens. The training data covers 11 languages: Indonesian, Malay, Thai, Vietnamese, Filipino, Tamil, Burmese, Khmer, and Lao, together with English and Chinese. Giving the regional languages substantial weight is intended to capture nuances that LLMs trained mostly on Western-language data tend to miss.

SEA-LION-7B departs from the standard MPT configuration by using learned absolute positional embeddings rather than ALiBi, with a context window of 2,048 tokens. The architecture consists of 32 transformer layers with a hidden dimension of 4096 and 32 attention heads. It uses Low-Precision LayerNorm for normalization and the GELU (Gaussian Error Linear Unit) activation function. The model also introduces SEABPETokenizer, a custom Byte-Pair Encoding tokenizer with a 256,000-token vocabulary, designed to lower the token-to-word ratio for Southeast Asian scripts and thereby improve inference efficiency and comprehension.
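
The token-to-word ratio mentioned above (sometimes called tokenizer fertility) can be measured with a few lines of Python. This is a minimal sketch: `tokens_per_word` and the two toy tokenizers are hypothetical stand-ins, not part of SEABPETokenizer. Words are split on whitespace, which is itself a simplification for scripts such as Thai, Khmer, Lao, and Burmese that do not separate words with spaces.

```python
from typing import Callable, Iterable, List


def tokens_per_word(texts: Iterable[str], tokenize: Callable[[str], List[str]]) -> float:
    """Average number of tokens produced per whitespace-separated word."""
    total_tokens = 0
    total_words = 0
    for text in texts:
        total_tokens += len(tokenize(text))
        total_words += len(text.split())
    if total_words == 0:
        raise ValueError("no words in input")
    return total_tokens / total_words


def char_tokenizer(text: str) -> List[str]:
    """Toy worst case: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


def word_tokenizer(text: str) -> List[str]:
    """Toy best case: one token per word."""
    return text.split()


sample = ["selamat pagi dunia", "terima kasih"]
print(tokens_per_word(sample, char_tokenizer))  # 27 tokens over 5 words
print(tokens_per_word(sample, word_tokenizer))  # 1.0
```

A real comparison would pass a tokenizer's own encode function as `tokenize`. A lower ratio means fewer tokens per word, so more text fits in the 2,048-token window and fewer decoding steps are needed for the same output.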

Intended for research and regional application deployment, SEA-LION-7B is a base model for downstream natural language understanding and generation tasks such as multilingual translation, sentiment analysis, and culturally aware text generation in the ASEAN context. Its open-weights release under the MIT license allows the community to fine-tune and adapt it for regional industry use cases, and keeps the model transparent and accessible to researchers and developers.

About SEA-LION

Southeast Asian Languages In One Network (SEA-LION) is a family of language models developed by AI Singapore for Southeast Asian languages. The models support English, Indonesian, Malay, Thai, Vietnamese, Tagalog, Burmese, Khmer, Lao, Tamil, and Chinese. The family focuses on regional linguistic patterns and is available in base and instruction-tuned variants.

Other SEA-LION Models

SEA-LION-7B-Instruct

Evaluation Benchmarks

No evaluation benchmarks are available for SEA-LION-7B.

Model Transparency

Total score: B+ (75 / 100)
Upstream: 24.5 / 30
Model: 30.0 / 40
Downstream: 20.5 / 30

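GPU Requirements

The VRAM required depends on the quantization method selected for the model weights and on the context size (1,024 tokens in the setting shown on this page).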
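
A rough estimate of the VRAM needed can be sketched from the published figures. The function below is an approximation under stated assumptions, not the calculator used by this page. It counts only the weights (parameter count times bytes per parameter) and a 16-bit key-value cache (two tensors per layer, each of size context × hidden dimension), and it ignores activations, framework overhead, and quantization metadata. The function name and the bits-per-weight options are illustrative.

```python
GIB = 1024 ** 3


def estimate_vram_gib(
    num_params: float = 7.1e9,   # parameter count from the spec
    bits_per_weight: int = 16,   # 16 = FP16/BF16, 8 = 8-bit, 4 = 4-bit quantization
    num_layers: int = 32,
    hidden_size: int = 4096,
    context_tokens: int = 1024,
    kv_bytes: int = 2,           # assumption: KV cache kept in 16-bit precision
) -> float:
    """Approximate VRAM in GiB for weights plus KV cache at batch size 1."""
    weight_bytes = num_params * bits_per_weight / 8
    # Multi-head attention: keys and values each store hidden_size values
    # per token per layer, hence the leading factor of 2.
    kv_cache_bytes = 2 * num_layers * context_tokens * hidden_size * kv_bytes
    return (weight_bytes + kv_cache_bytes) / GIB


for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{estimate_vram_gib(bits_per_weight=bits):.1f} GiB")
```

Under these assumptions the key-value cache adds 0.5 GiB at 1,024 tokens and 1 GiB at the full 2,048-token context, so the weights dominate the total at every quantization level listed.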

Resources

Official documentation, release notes, paper, model weights, source code
