趋近智
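Parameters: 1.3B
Context length: 2,048 tokens
Modality: Text
Architecture: Dense
License: Apache-2.0
Release date: 29 Feb 2024
Training data cutoff: -
Attention structure: Multi-Head Attention
Hidden dimension size: -
Number of layers: -
Attention heads: -
Key-value heads: -
Activation function: -
Normalization: -
Position embedding: Absolute Position Embedding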
VRAM requirements for different quantization methods and context sizes
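No figures are listed here, so as a rough orientation only, the memory needed to hold the weights can be approximated as parameter count times bits per weight. The sketch below is not taken from this page: the format labels and bit widths are illustrative assumptions, the function name `weight_memory_gib` is hypothetical, and the estimate ignores the KV cache, activations, and runtime overhead, so real usage will be higher and will grow with context size.

```python
# Back-of-the-envelope, weight-only memory estimate.
# Assumption: memory = parameters * bits_per_weight / 8 bytes.
# KV cache, activations, and framework overhead are NOT included.

PARAMS = 1.3e9  # parameter count stated on this page

# Assumed bits per weight for some common formats (illustrative, not from the source).
BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_memory_gib(params: float, bits: int) -> float:
    """Return the memory needed to store the weights alone, in GiB."""
    return params * bits / 8 / 2**30


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: {weight_memory_gib(PARAMS, bits):.2f} GiB")
```

Under these assumptions, halving the bits per weight halves the weight footprint; context-dependent memory has to be added on top.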
CroissantLLM Base is a 1.3 billion parameter bilingual French-English model. It was trained on 3 trillion tokens with a 1:1 ratio of French and English data. The model architecture is based on Llama and is released under the Apache 2.0 license.
CroissantLLM is a bilingual French-English language model developed by French research institutions. It was trained on a curated mix of French and English data to provide language understanding in both languages while preserving French linguistic heritage. It is designed for low-resource inference on consumer-grade hardware.
No evaluation benchmarks are available for CroissantLLM Base.