Parameters: 3B
Context Length: 256K
Modality: Multimodal
Architecture: Dense
License: Apache 2.0
Release Date: 2 Dec 2025
Knowledge Cutoff: -
Attention Structure: Grouped Query Attention (GQA)
Hidden Dimension Size: -
Number of Layers: -
Attention Heads: -
Key-Value Heads: 8
Activation Function: -
Normalization: Layer Normalization
Position Embedding: Absolute Position Embedding
VRAM requirements depend on the quantization method chosen for the weights and on the context size reserved for the key-value cache.
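For a rough sense of scale, the sketch below estimates weights-plus-KV-cache memory in Python. The bytes-per-weight figures are typical of common GGUF quantizations, and the attention geometry (head dimension, layer count) is an assumption, since the card leaves those fields blank; treat the output as an order-of-magnitude guide, not a measurement.

```python
# Back-of-envelope VRAM estimate: weights + fp16 KV cache.
# head_dim=128 and n_layers=28 are placeholder assumptions.
BYTES_PER_WEIGHT = {"fp16": 2.0, "q8_0": 1.06, "q4_k_m": 0.60}

def vram_gib(params_b=3.8, quant="q4_k_m", context=1024,
             n_kv_heads=8, head_dim=128, n_layers=28):
    weights = params_b * 1e9 * BYTES_PER_WEIGHT[quant]
    # KV cache: 2 tensors (K and V) x 2 bytes (fp16) per layer per token.
    kv = 2 * 2 * n_layers * n_kv_heads * head_dim * context
    return (weights + kv) / 2**30

for q in BYTES_PER_WEIGHT:
    for ctx in (1_024, 32_768, 262_144):
        print(f"{q:7s} ctx={ctx:>7,}: ~{vram_gib(quant=q, context=ctx):.1f} GiB")
```

Note how the fp16 KV cache, not the quantized weights, dominates at the full 256K context; long-context deployments typically quantize or offload the cache as well.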
Ministral 3 3B is a compact multimodal language model from Mistral AI, engineered for efficient deployment in resource-constrained environments such as edge devices. It pairs a 3.4-billion-parameter language model with a 0.4-billion-parameter vision encoder, for 3.8 billion parameters in total, allowing it to process and interpret both text and visual inputs. Designed for a minimal memory footprint, Ministral 3 3B can run locally on devices with limited VRAM, which makes it well suited to on-device inference and privacy-sensitive applications.
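A text-only local-inference sketch with Hugging Face transformers is shown below. The repository id is a guess (take the actual id from the release notes), and image inputs would additionally require the model's processor rather than the plain tokenizer.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical Hub repo id; substitute the official one.
MODEL_ID = "mistralai/Ministral-3-3B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # half precision: ~7.6 GB for 3.8B weights
    device_map="auto",           # spill to CPU if the GPU is too small
)

messages = [{"role": "user", "content": "Summarize this in one sentence: ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```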
Architecturally, Ministral 3 3B is a dense Transformer that uses Grouped Query Attention (GQA) to improve inference speed and memory utilization. Its query heads are shared across 8 key-value heads, which shrinks the key-value cache and the memory bandwidth needed at inference time while preserving most of the quality of full multi-head attention. This efficiency is what lets the model handle long inputs, up to its 256,000-token context length, within the practical constraints of edge hardware.
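To make the mechanism concrete, here is a minimal PyTorch sketch of grouped query attention. The 8 key-value heads match the spec above; the hidden size and query-head count are placeholder assumptions, since the card lists those fields as unknown.

```python
import torch
import torch.nn.functional as F
from torch import nn

class GroupedQueryAttention(nn.Module):
    """Minimal GQA block. n_kv_heads=8 matches the spec; d_model=2048
    and n_q_heads=16 are assumed placeholders."""

    def __init__(self, d_model=2048, n_q_heads=16, n_kv_heads=8):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.head_dim = d_model // n_q_heads
        self.n_q_heads, self.n_kv_heads = n_q_heads, n_kv_heads
        self.q_proj = nn.Linear(d_model, n_q_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_q_heads * self.head_dim, d_model, bias=False)

    def forward(self, x):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_q_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Each group of query heads attends to one shared KV head, so only
        # n_kv_heads K/V tensors are cached instead of n_q_heads.
        group = self.n_q_heads // self.n_kv_heads
        k = k.repeat_interleave(group, dim=1)
        v = v.repeat_interleave(group, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))
```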
The Ministral 3 3B model is suitable for a range of lightweight, real-time applications, including image captioning, text classification, real-time translation, content generation, and data extraction on edge devices. Its inherent multimodal and multilingual capabilities, supporting dozens of languages, further broaden its applicability across diverse use cases requiring local intelligence. The model also offers robust support for agentic workflows, featuring native function calling and structured JSON output, making it effective for orchestrating multi-step tasks and specialized applications.
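As an illustration of the agentic side, the sketch below declares a tool in the common OpenAI-style JSON schema and parses a structured JSON tool call from the model's output. The schema format and the get_weather helper are hypothetical; consult the model's chat template for the exact convention it was trained on.

```python
import json

# Hypothetical tool declaration in the widely used function-calling schema.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def get_weather(city: str) -> str:
    return f"(stub) weather for {city}"

def dispatch(raw_model_output: str):
    """Parse the model's structured JSON tool call and route it."""
    call = json.loads(raw_model_output)
    if call.get("name") == "get_weather":
        return get_weather(**call["arguments"])
    raise ValueError(f"unknown tool: {call.get('name')}")

print(dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}'))
```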
Ministral 3 is a family of efficient edge models with vision capabilities, available in 3B, 8B, and 14B parameter sizes. The family is designed for edge deployment with multimodal and multilingual support, offering best-in-class performance for resource-constrained environments.
No evaluation benchmarks are currently available for Ministral 3 3B, so it has no overall or coding rank on the Local LLMs leaderboard.