Ministral 3 8B

Parameters

8B

Context Length

256K

Modality

Multimodal

Architecture

Dense

License

Apache 2.0

Release Date

2 Dec 2025

Knowledge Cutoff

-

Technical Specifications

Attention

Attention Structure

Grouped-Query Attention

Attention Heads

32

Key-Value Heads

8

Attention Head Dimension

128

Position Embedding

Rotary Position Embedding (RoPE)

RoPE Theta

1,000,000

Sliding Window Attention

No

Sliding Window Size

-

Normalization

RMS Normalization

Activation Function

SwiGLU (SiLU)

Dimensions

Hidden Dimension Size

4,096

Number of Layers

32

FFN Intermediate Size (Dense)

14,336

Multi-Token Prediction Heads

-

Tokenizer

Vocabulary Size

131,072

Architecture Diagram

[Diagram: Input Tokens → Token Embedding (hidden: 4.1k, context: 256K, vocab: 131.1k) → 32 layers of (Pre-Attention RMSNorm → Attention, 32 Q / 8 KV heads, head dim 128 → residual add → Pre-FFN RMSNorm → Feed-Forward Network, SwiGLU, intermediate: 14.3k → residual add) → Final RMSNorm → Output Logits]

Ministral 3 8B

Ministral 3 8B is a member of the Ministral 3 family developed by Mistral AI. It provides multimodal and multilingual capabilities for edge and resource-constrained environments. The model combines an 8.4 billion parameter language model with a 0.4 billion parameter vision encoder, for 8.8 billion parameters in total. Intended applications range from real-time chat interfaces to agentic workflows running on local hardware.

Architecturally, Ministral 3 8B is a dense transformer with 32 hidden layers and a hidden dimension of 4,096. Its attention uses 32 query heads that share 8 key-value heads, which is Grouped Query Attention (GQA): each key-value head serves 4 query heads, shrinking the key-value cache relative to standard multi-head attention. The model uses Rotary Position Embeddings (RoPE) to encode token positions, a SwiGLU feed-forward block with SiLU activation, and RMS Normalization for stable training and inference. The architecture targets deployments with limited computational resources and supports a context length of 256,000 tokens.
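The effect of GQA on memory can be made concrete with a back-of-the-envelope key-value cache estimate. The sketch below uses only the figures listed on this page (32 layers, 8 key-value heads, head dimension 128). It assumes 16-bit cache values, no cache quantization or paging overhead, and ignores the vision encoder; `kv_cache_bytes` is a hypothetical helper, not part of any library.

```python
def kv_cache_bytes(tokens, layers=32, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Approximate bytes needed to cache keys and values for `tokens` positions."""
    # Factor of 2: one key vector and one value vector per KV head, per layer.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens

GIB = 1024 ** 3

per_token = kv_cache_bytes(1)                    # cache cost of a single token
gqa_full = kv_cache_bytes(256_000)               # full context with 8 KV heads (GQA)
mha_full = kv_cache_bytes(256_000, kv_heads=32)  # hypothetical: one KV head per query head

print(f"per token:      {per_token / 1024:.0f} KiB")
print(f"GQA, 256k ctx:  {gqa_full / GIB:.2f} GiB")
print(f"MHA, 256k ctx:  {mha_full / GIB:.2f} GiB")
```

Under these assumptions the cache costs 128 KiB per token, so a full 256,000-token context needs roughly 31 GiB with 8 key-value heads, a quarter of what 32 key-value heads would require. Real runtimes may differ depending on cache precision and allocation strategy.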

Ministral 3 8B has native multimodal understanding, so it can process and interpret both text and visual inputs. It supports multiple languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, and Korean. The model also provides native function calling and JSON output, which eases integration into agentic systems and automated workflows. These characteristics make it suitable for applications such as image and document description, local AI assistants, and specialized problem-solving in embedded systems.

About Ministral 3

Ministral 3 is a family of efficient edge models with vision capabilities, available in 3B, 8B, and 14B parameter sizes. The family is designed for edge deployment, with multimodal and multilingual support, and is positioned as offering best-in-class performance for resource-constrained environments.


Evaluation Benchmarks

Rank

#92

Benchmark | Category | Score | Rank
MMLU | General Knowledge | 0.761 | 25

Rankings

Overall Rank

#92

Coding Rank

-

Model Integrity

Total Score

B+

71 / 100

GPU Requirements

VRAM required depends on the quantization method chosen for the model weights and on the context size (1k to 250k tokens).
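As a rough illustration of how the weight quantization choice affects VRAM, the sketch below estimates the memory taken by the weights alone. It uses the 8.8 billion total parameter count from the description above. The bit widths are illustrative assumptions, not the calculator's presets, and the estimate excludes the key-value cache, activations, and runtime overhead. `weight_memory_gb` is a hypothetical helper.

```python
TOTAL_PARAMS = 8.8e9  # 8.4B language model + 0.4B vision encoder

# Assumed bits per weight for a few common precision levels (illustrative only).
BITS_PER_WEIGHT = {"FP16/BF16": 16, "INT8": 8, "4-bit": 4}

def weight_memory_gb(params, bits_per_weight):
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

estimates = {name: weight_memory_gb(TOTAL_PARAMS, bits)
             for name, bits in BITS_PER_WEIGHT.items()}

for name, gb in estimates.items():
    print(f"{name:>9}: ~{gb:.1f} GB for weights")
```

With these assumptions the weights take about 17.6 GB at 16 bits, 8.8 GB at 8 bits, and 4.4 GB at 4 bits. Actual requirements are higher once context-dependent memory is added, and real quantization formats often store extra scale metadata.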

Ministral 3 8B: Specifications and GPU VRAM Requirements