Ministral 3 14B: Specifications and GPU VRAM Requirements

Ministral 3 14B

Open source

Open weights

Parameters

14B

Context length

256K

Modality

Multimodal

Architecture

Dense

License

Apache 2.0

Release date

2 Dec 2025

Training data cutoff

Jun 2025

Technical specifications

Attention structure

Grouped Query Attention

Hidden dimension size

5120

Layers

40

Attention heads

32

Key-value heads

8

Activation function

SwiGLU

Normalization

RMS Normalization

Position embedding

Rotary Position Embedding (RoPE) with YaRN scaling

Ministral 3 14B

Ministral 3 14B is a dense, multimodal transformer model from Mistral AI, aimed at bringing strong general capability to edge-class hardware. It is the largest member of the Ministral 3 family and is trained with a Cascade Distillation strategy, in which knowledge is transferred progressively from larger parent models, such as Mistral Small 3.1, into a 14-billion-parameter footprint. The architecture pairs a 13.5-billion-parameter decoder-only language model with a frozen 410-million-parameter Vision Transformer (ViT) encoder, which lets the model process interleaved image and text inputs.

The model has 40 transformer layers and a hidden dimension of 5120. It uses Grouped Query Attention (GQA) with 32 query heads and 8 key-value heads, so the key-value cache is a quarter of the size that full multi-head attention would need, which reduces memory traffic during inference. It also uses RMSNorm for normalization, SwiGLU activation functions, and Rotary Positional Embeddings (RoPE) extended with YaRN scaling. Together these components support a context window of 256,000 tokens, enough for large document sets or long multi-turn agentic workflows.
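
The memory effect of GQA can be sketched with a back-of-the-envelope KV-cache estimate. The layer and head counts below come from the description above. The head dimension of 128 and the 16-bit cache values are assumptions, since this page does not state them (dividing 5120 by 32 would give 160 instead), so treat the numbers as illustrative rather than official.

```python
def kv_cache_bytes(tokens, layers=40, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Estimate the bytes needed to cache keys and values for `tokens` positions.

    layers and kv_heads follow the page; head_dim and bytes_per_value are assumed.
    """
    # Factor 2: each layer stores one key tensor and one value tensor.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


GIB = 1024 ** 3
full_context = 256_000  # context window quoted in the description

gqa = kv_cache_bytes(full_context)               # 8 key-value heads (GQA)
mha = kv_cache_bytes(full_context, kv_heads=32)  # hypothetical: one KV head per query head

print(f"per token: {kv_cache_bytes(1) / 1024:.0f} KiB")
print(f"GQA at full context: {gqa / GIB:.1f} GiB")
print(f"MHA at full context: {mha / GIB:.1f} GiB")
```

Under these assumptions the cache costs 160 KiB per token, or roughly 39 GiB at the full 256,000-token context, compared with roughly 156 GiB if every query head kept its own key-value head.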

Designed for automation and private AI deployments, Ministral 3 14B supports agentic tasks through native function calling and structured JSON output. It is multilingual, covering more than 40 languages, and targets reasoning-heavy domains such as mathematics and coding. Because it is a dense model that is compatible with quantization, it can be deployed on local workstations and enterprise edge hardware as an alternative to much larger cloud-hosted systems.
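
To illustrate what function calling with structured JSON output involves on the application side, here is a minimal sketch of parsing and dispatching a tool call. Everything in it is hypothetical: the `get_weather` tool, the `name`/`arguments` JSON shape, and the `dispatch` helper are assumptions made for the example and are not taken from Mistral's documentation.

```python
import json


def get_weather(city: str) -> str:
    # Stand-in implementation; a real tool would query a weather service.
    return f"(stub) weather for {city}"


# Hypothetical tool registry: the callable plus the arguments it requires.
TOOLS = {
    "get_weather": {
        "function": get_weather,
        "schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def dispatch(raw_call: str) -> str:
    """Parse a JSON tool call emitted by a model and run the matching function."""
    call = json.loads(raw_call)
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        raise ValueError(f"unknown tool: {call.get('name')!r}")
    args = call.get("arguments", {})
    # Check required arguments before calling, so bad output fails loudly.
    missing = [key for key in tool["schema"]["required"] if key not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return tool["function"](**args)


# Assumed example of what a structured model response might look like.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
print(dispatch(model_output))
```

Validating the arguments before calling the function keeps malformed model output from reaching application code, which matters most in unattended agentic loops.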

About Ministral 3

Ministral 3 is a family of efficient edge models with vision capabilities, available in 3B, 8B, and 14B parameter sizes. The models are designed for edge deployment, with multimodal and multilingual support, and aim for strong performance in resource-constrained environments.

Evaluation benchmarks

No evaluation benchmarks are available for Ministral 3 14B.

Model transparency

Overall score

B+

73 / 100

Upstream

21.5 / 30

Model

27.5 / 40

Downstream

23.5 / 30

GPU requirements

Required VRAM depends on the quantization method chosen for the model weights and on the context size (from 1,024 tokens up to 250k).
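
As a rough guide to how quantization changes the weight footprint, the sketch below multiplies the parameter counts given in the model description (13.5B language model plus 410M vision encoder) by the bits stored per weight. It assumes a flat bit width with no quantization metadata, and it leaves out the KV cache, activations, and runtime overhead, so actual VRAM use will be higher.

```python
LM_PARAMS = 13_500_000_000    # language model size, from the description
VISION_PARAMS = 410_000_000   # vision encoder size, from the description


def weight_bytes(bits_per_weight, params=LM_PARAMS + VISION_PARAMS):
    """Bytes needed to hold the weights alone at a flat bit width (assumption)."""
    return params * bits_per_weight / 8


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_bytes(bits) / 1e9:.1f} GB")
```

On these assumptions the weights alone need about 27.8 GB at 16 bits, 13.9 GB at 8 bits, and 7.0 GB at 4 bits. Long contexts add a KV cache on top of that.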

Resources

Official documentation, release notes, paper, model weights, source code
