Parameters
3B
Context Length
256K
Modality
Multimodal
Architecture
Dense
License
Apache 2.0
Release Date
2 Dec 2025
Training Data Cutoff
-
Attention Structure
Grouped Query Attention
Hidden Dimension
3072
Number of Layers
26
Attention Heads
32
Key-Value Heads
8
Activation Function
SwiGLU
Normalization
RMSNorm
Position Embedding
Rotary Position Embedding (RoPE)
Ministral 3 3B is a compact, multimodal language model engineered by Mistral AI for efficient execution in edge computing environments and resource-constrained scenarios. The model architecture integrates a 3.4 billion parameter language decoder with a 410 million parameter Vision Transformer (ViT) encoder, yielding a combined capacity of approximately 3.8 billion parameters. This hybrid design enables the simultaneous processing of text and visual inputs, facilitating advanced tasks such as image captioning, visual question answering, and multimodal data extraction while maintaining a low computational overhead.
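The parameter budget above is simple arithmetic; a quick check that the decoder and vision encoder figures add up to the stated total:

```python
# Parameter budget of Ministral 3 3B as described above
decoder_params = 3.4e9   # language decoder
vit_params = 0.41e9      # Vision Transformer (ViT) encoder
total = decoder_params + vit_params
print(f"{total / 1e9:.2f}B")  # 3.81B, i.e. "approximately 3.8 billion"
```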
Technically, Ministral 3 3B follows a dense Transformer-based decoder-only architecture that leverages Grouped Query Attention (GQA) with 32 query heads and 8 key-value heads to optimize memory bandwidth and inference speed. It employs Rotary Positional Embeddings (RoPE) enhanced with YaRN (Yet another RoPE extensioN) and position-based softmax temperature scaling to support an extensive context window of up to 256,000 tokens. To further enhance efficiency at this scale, the 3B variant utilizes tied input-output embeddings, preventing vocabulary parameters from disproportionately increasing the total model size. The vision component utilizes a frozen ViT encoder derived from the Mistral Small 3.1 architecture, coupled with a newly trained multimodal projection layer.
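The GQA pattern described above can be sketched in a few lines of NumPy: each key-value head serves a group of query heads, so the KV projections (and the KV cache at inference time) are a quarter of the size they would be under full multi-head attention. This is a minimal illustration with toy dimensions, not Mistral's implementation; Ministral 3 3B itself uses a hidden size of 3072 with 32 query heads and 8 KV heads.

```python
import numpy as np

def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Grouped Query Attention: n_q_heads query heads share n_kv_heads
    key/value heads (each KV head serves n_q_heads // n_kv_heads queries)."""
    seq, d_model = x.shape
    head_dim = wq.shape[1] // n_q_heads
    group = n_q_heads // n_kv_heads

    # Project and split into heads: (heads, seq, head_dim)
    q = (x @ wq).reshape(seq, n_q_heads, head_dim).transpose(1, 0, 2)
    k = (x @ wk).reshape(seq, n_kv_heads, head_dim).transpose(1, 0, 2)
    v = (x @ wv).reshape(seq, n_kv_heads, head_dim).transpose(1, 0, 2)

    # Repeat each KV head across its group of query heads
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)

    # Scaled dot-product attention per head
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    out = weights @ v                          # (n_q_heads, seq, head_dim)
    return out.transpose(1, 0, 2).reshape(seq, n_q_heads * head_dim)

# Toy dimensions (the real model uses d_model=3072, 32 query / 8 KV heads)
rng = np.random.default_rng(0)
d, seq = 64, 5
x = rng.standard_normal((seq, d))
wq = rng.standard_normal((d, d))
wk = rng.standard_normal((d, d // 4))   # KV projections are 4x smaller
wv = rng.standard_normal((d, d // 4))
y = grouped_query_attention(x, wq, wk, wv, n_q_heads=8, n_kv_heads=2)
print(y.shape)  # (5, 64)
```

The `np.repeat` step makes the sharing explicit for clarity; optimized kernels instead broadcast over the group dimension so the repeated KV tensors are never materialized.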
The model is optimized for high-performance on-device applications, offering native support for function calling and structured JSON output to enable complex agentic workflows. It incorporates architectural refinements such as SwiGLU activation and RMSNorm to ensure stability and efficiency during local inference. By supporting dozens of languages and featuring a high-context capacity, Ministral 3 3B is positioned as a versatile solution for real-time translation, local content generation, and privacy-focused intelligent assistants operating directly on user hardware.
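The two architectural refinements named above are compact enough to write out directly. The following NumPy sketch (illustrative only, with toy shapes and hypothetical weight names) shows RMSNorm, which rescales by the root mean square without the mean-centering step of LayerNorm, and a SwiGLU feed-forward block, which gates one linear projection with the SiLU of another:

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: divide by the root mean square of the features;
    # unlike LayerNorm there is no mean subtraction and no bias
    return x / np.sqrt(np.mean(x**2, axis=-1, keepdims=True) + eps) * gain

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward: SiLU(x @ w_gate) gates x @ w_up elementwise
    silu = lambda z: z / (1.0 + np.exp(-z))
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

# Toy demonstration with hypothetical dimensions
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))
normed = rms_norm(x, gain=np.ones(16))
h = swiglu_ffn(normed,
               rng.standard_normal((16, 32)),
               rng.standard_normal((16, 32)),
               rng.standard_normal((32, 16)))
print(h.shape)  # (4, 16)
```

After normalization each row has unit RMS (up to `eps`), which is the stability property RMSNorm contributes during inference.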
Ministral 3 is a family of efficient edge models with vision capabilities, available in 3B, 8B, and 14B parameter sizes. The family is designed for edge deployment with multimodal and multilingual support, offering best-in-class performance in resource-constrained environments.
No evaluation benchmarks are available for Ministral 3 3B.