Attention Structure: Multi-Head Attention
Hidden Size: -
Number of Layers: -
Attention Heads: -
Key-Value Heads: -
Activation Function: -
Normalization: -
Position Embedding: Absolute Position Embedding
VRAM Requirements for Different Quantization Methods and Context Sizes
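Since the card does not publish concrete memory figures, a rough back-of-envelope estimate can be derived from the parameter count, the bytes per weight under a given quantization, and the KV-cache size at a given context length. The sketch below uses hypothetical architecture numbers (layer count, KV heads, head dimension) that are NOT from this card, which lists those fields as unavailable.

```python
# Back-of-envelope VRAM estimator for a decoder-only LLM.
# All architecture numbers in the example call are illustrative
# assumptions -- the MaLLaM-3B card does not publish them.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def estimate_vram_gb(n_params, quant, n_layers, n_kv_heads,
                     head_dim, context_len, kv_dtype_bytes=2):
    """Rough VRAM = quantized weights + KV cache (runtime overhead ignored)."""
    weights = n_params * BYTES_PER_PARAM[quant]
    # KV cache: 2 tensors (K and V) per layer, one per KV head,
    # each of shape (context_len, head_dim).
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_dtype_bytes
    return (weights + kv_cache) / 1024**3

# Hypothetical 3B-class configuration (NOT from the card):
for quant in ("fp16", "int8", "int4"):
    gb = estimate_vram_gb(3e9, quant, n_layers=32, n_kv_heads=32,
                          head_dim=80, context_len=4096)
    print(f"{quant}: ~{gb:.1f} GB")
```

With these assumed numbers, weights dominate at fp16 while the KV cache becomes the larger share as quantization gets more aggressive; real deployments also need headroom for activations and framework overhead.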
MaLLaM-3B is a 3 billion parameter Malaysian language model designed for edge deployment. It is bilingual in Bahasa Malaysia and English, trained on Malaysian digital content and literature. The model supports local idioms and cultural references. Released under the Apache 2.0 license.
Malaysian Large Language Model (MaLLaM) is an open-source language model family developed to support Bahasa Malaysia and English. The models are trained on Malaysian text data, including local news, literature, and digital content, and are designed to capture Malaysian linguistic nuances and cultural context. The family is available in multiple parameter sizes for different hardware deployments.
No evaluation benchmarks are available for MaLLaM-3B.