Parameters
1.3B
Context Length
2,048 tokens
Modality
Text
Architecture
Dense
License
Apache-2.0
Release Date
29 Feb 2024
Knowledge Cutoff
-
Attention Structure
Multi-Head Attention
Hidden Dimension Size
-
Number of Layers
-
Attention Heads
-
Key-Value Heads
-
Activation Function
-
Normalization
-
Position Embedding
Rotary Position Embedding (RoPE), as used in the Llama architecture
CroissantLLM Base is a 1.3-billion-parameter bilingual French-English model, trained on 3 trillion tokens with a 1:1 ratio of French and English data. The architecture is based on Llama, and the model is released under the Apache 2.0 license.
CroissantLLM is a bilingual French-English language model developed by French research institutions. It is trained on a curated mix of French and English data, with the explicit goal of giving French equal footing with English rather than treating it as a secondary language. The model is designed for low-resource inference on consumer-grade hardware.
No evaluation benchmarks are available for CroissantLLM Base.
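The VRAM needed just to hold the model's weights can be approximated with a back-of-the-envelope formula: parameter count times bytes per parameter for the chosen quantization. The sketch below applies this to the 1.3B parameter count from the spec above; the quantization names and byte sizes are common conventions, not values taken from this page, and real inference adds KV-cache, activation, and framework overhead on top.

```python
# Rough weight-memory estimate per quantization level.
# A sketch only: ignores KV cache, activations, and runtime overhead,
# all of which add to the actual VRAM footprint.
QUANT_BYTES = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}  # bytes per parameter

def weight_vram_gib(n_params: float, quant: str) -> float:
    """Approximate memory for model weights, in GiB."""
    return n_params * QUANT_BYTES[quant] / 1024**3

n_params = 1.3e9  # CroissantLLM Base: 1.3B parameters
for quant in QUANT_BYTES:
    print(f"{quant}: ~{weight_vram_gib(n_params, quant):.2f} GiB")
```

At fp16 this works out to roughly 2.4 GiB of weights, which is why a 1.3B model is a plausible fit for consumer-grade GPUs, and why int8 or int4 quantization halves or quarters that figure.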