
Ministral 3 3B

Parameters: 3B
Context Length: 256K
Modality: Multimodal
Architecture: Dense
License: Apache 2.0
Release Date: 2 Dec 2025
Knowledge Cutoff: -

Technical Specifications

Attention Structure: Grouped Query Attention
Hidden Dimension Size: -
Number of Layers: -
Attention Heads: -
Key-Value Heads: 8
Activation Function: -
Normalization: Layer Normalization
Position Embedding: Absolute Position Embedding


Ministral 3 3B

The Ministral 3 3B model is a compact, multimodal language model developed by Mistral AI, engineered for efficient deployment in resource-constrained environments such as edge devices. It pairs a 3.4 billion parameter language model with a 0.4 billion parameter vision encoder, for a total of 3.8 billion parameters, and can therefore process and interpret both text and visual inputs. Designed for a minimal memory footprint, Ministral 3 3B can run locally on devices with limited VRAM, making it well suited to on-device inference and privacy-sensitive applications.
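As a sketch of what on-device inference might look like, the snippet below loads the model in 4-bit with Hugging Face transformers and bitsandbytes. Note the assumptions: the repository id mistralai/Ministral-3-3B is illustrative (check the actual hub name), and this text-only usage ignores the vision encoder, which requires the model's own processor class.

```python
# Minimal sketch: 4-bit, text-only local inference via Hugging Face transformers.
# ASSUMPTION: the hub repository id below is illustrative, not confirmed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Ministral-3-3B"  # hypothetical repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4-bit
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 for speed
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on whatever GPU/CPU is available
)

messages = [{"role": "user", "content": "Summarize the benefits of edge inference."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```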

Architecturally, Ministral 3 3B is a dense Transformer that uses Grouped Query Attention (GQA) to improve inference speed and memory utilization. GQA shares each of the model's 8 key-value heads across several query heads, shrinking the key-value cache relative to full multi-head attention; this is what makes the 256,000-token context length practical on constrained hardware. These choices balance model quality against the memory and compute budgets of edge deployment.
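To make the GQA memory argument concrete, the back-of-the-envelope calculation below estimates KV-cache size at full context. Only the 8 key-value heads and the 256K context come from the spec card above; the layer count, head dimension, and the 32-head MHA baseline are assumptions for illustration, since the card does not publish those figures.

```python
# Back-of-the-envelope KV-cache sizing for GQA vs. full multi-head attention.
# ASSUMPTIONS: num_layers, head_dim, and the 32-head MHA baseline are
# illustrative; only the 8 KV heads and 256K context come from the spec card.

def kv_cache_gib(num_layers: int, num_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_elem: int = 2) -> float:
    """KV-cache size in GiB (leading factor 2 covers keys and values; fp16)."""
    total = 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_elem
    return total / 1024**3

CONTEXT = 256_000            # from the spec card
KV_HEADS = 8                 # from the spec card
LAYERS, HEAD_DIM = 26, 128   # assumed, for illustration only

gqa = kv_cache_gib(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT)
mha = kv_cache_gib(LAYERS, 32, HEAD_DIM, CONTEXT)  # hypothetical MHA baseline

print(f"GQA (8 KV heads):  {gqa:6.1f} GiB")   # ~25.4 GiB under these assumptions
print(f"MHA (32 KV heads): {mha:6.1f} GiB")   # 4x larger cache at the same context
```

Even under these assumed dimensions, cutting the cache from 32 key-value heads to 8 shrinks it fourfold at a given context length, which is the core of the GQA trade-off for edge hardware.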

The Ministral 3 3B model is suitable for a range of lightweight, real-time applications, including image captioning, text classification, real-time translation, content generation, and data extraction on edge devices. Its inherent multimodal and multilingual capabilities, supporting dozens of languages, further broaden its applicability across diverse use cases requiring local intelligence. The model also offers robust support for agentic workflows, featuring native function calling and structured JSON output, making it effective for orchestrating multi-step tasks and specialized applications.
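The agentic features translate into a tools-style request. The sketch below uses the widely adopted OpenAI-compatible tools schema, which Mistral's API and most local serving stacks accept; the served model name and the get_weather function are illustrative assumptions, not part of the model card.

```python
# Sketch of a function-calling request in the common OpenAI-compatible tools
# schema. ASSUMPTIONS: the model name and tool definition are illustrative;
# adapt to your serving stack (vLLM, llama.cpp server, Mistral API, etc.).
import json

payload = {
    "model": "ministral-3-3b",  # hypothetical served-model name
    "messages": [
        {"role": "user", "content": "What's the weather in Paris right now?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Fetch current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"}
                    },
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

# POST this JSON to an OpenAI-compatible /v1/chat/completions endpoint; the
# model replies with a structured tool_calls entry instead of free text.
print(json.dumps(payload, indent=2))
```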

About Ministral 3

Ministral 3 is a family of efficient edge models with vision capabilities, available in 3B, 8B, and 14B parameter sizes. The family is designed for edge deployment with multimodal and multilingual support, offering best-in-class performance for resource-constrained environments.



Evaluation Benchmarks

No evaluation benchmarks are available for Ministral 3 3B.

Rankings

Rankings are computed over local LLMs.

Overall Rank: -
Coding Rank: -
