Magistral Small: Specifications and GPU VRAM Requirements

Magistral Small

Open source, open weights

Parameters: 24B
Context length: 128K
Modality: Text
Architecture: Dense
License: Apache 2.0
Release date: 10 Jun 2025
Training data cutoff: Oct 2023

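Technical Specifications

Attention structure: Grouped Query Attention
Hidden dimension size: 14336
Layers: 32
Attention heads: 32
Key-value heads: 8
Activation function: SwiGLU
Normalization: RMS Normalization
Position embedding: Rotary Positional Embeddings (RoPE)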
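
The SwiGLU activation and RMS Normalization listed in the specifications can be illustrated with a minimal pure-Python sketch. It is not Mistral AI's implementation: the function names, the toy dimensions, and the weight values are assumptions chosen for readability, and real models run these operations on tensors rather than Python lists.

```python
import math

def rms_norm(x, gain, eps=1e-6):
    """Divide x by its root mean square, then apply a per-element gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * g for v, g in zip(x, gain)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

def swiglu_ffn(x, w_gate, w_up, w_down):
    """Feedforward block: down(silu(gate(x)) * up(x))."""
    gate = [silu(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)

# Toy example: model width 2, feedforward width 3 (hypothetical weights).
x = rms_norm([3.0, 4.0], gain=[1.0, 1.0])
w_gate = [[0.5, -0.5], [1.0, 0.0], [0.0, 1.0]]
w_up = [[1.0, 1.0], [0.5, 0.5], [-1.0, 1.0]]
w_down = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
y = swiglu_ffn(x, w_gate, w_up, w_down)
```

The gate projection passes through SiLU and multiplies the up projection element by element, which is what distinguishes SwiGLU from a plain two-layer feedforward block.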

Magistral Small

Magistral Small is an open-source reasoning model from Mistral AI with 24 billion parameters. It is built on Mistral Small 3.1 and is designed for transparent, multi-step reasoning. The model exposes a traceable thought process in the user's language, which makes its output easier to interpret and audit on complex tasks. It supports multilingual reasoning in more than 24 languages, including English, French, German, Japanese, Korean, Chinese, Arabic, and Farsi.

Magistral Small uses a decoder-only transformer architecture with 32 layers and a hidden dimension size of 14,336. It uses Grouped Query Attention (GQA) with 32 attention heads and 8 key-value heads, which shrinks the key-value cache and speeds up inference compared with standard Multi-Head Attention. Positional information is encoded with Rotary Positional Embeddings (RoPE). The feedforward blocks use SwiGLU activations, and RMS Normalization stabilizes training. The model also uses FlashAttention for faster attention computation. Although the context window is nominally 128,000 tokens, performance is typically best with contexts of up to 40,000 tokens.
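
The memory effect of GQA can be made concrete with a rough key-value cache estimate. The sketch below uses the layer and head counts quoted above. The head dimension of 128 and the 2-byte (FP16/BF16) cache values are assumptions, not figures from this page, and the function name is illustrative.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=8,
                   head_dim=128, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence.

    head_dim=128 and bytes_per_value=2 are assumed, not taken from the page.
    """
    # Factor 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * seq_len

gqa = kv_cache_bytes(40_000)                 # 8 key-value heads (GQA)
mha = kv_cache_bytes(40_000, n_kv_heads=32)  # one KV head per query head (MHA)
print(f"GQA: {gqa / 2**30:.2f} GiB, MHA: {mha / 2**30:.2f} GiB")
# prints "GQA: 4.88 GiB, MHA: 19.53 GiB"
```

Under these assumptions, sharing each key-value head among four query heads cuts the cache to a quarter of its Multi-Head Attention size at the same context length.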

Magistral Small is proficient in multimodal comprehension, enabling it to process and reason over both textual and visual inputs. It is particularly suited for applications requiring structured calculations, programmatic logic, decision trees, and rule-based systems. The model's design facilitates its use in various scenarios, including fast-response conversational agents, systems for long document understanding, visual understanding applications, and specialized domain-specific fine-tuning. Its capabilities extend to supporting agentic AI workflows through native function calling and structured output generation.

About Magistral

Magistral is Mistral AI's first reasoning model series, purpose-built for transparent, step-by-step reasoning with native multilingual capabilities. It features chain-of-thought reasoning in the user's language with traceable thought processes. It excels at domain-specific problems that require multi-step logic, from legal research and financial forecasting to software development and creative storytelling. It supports reasoning across numerous languages, including English, French, Spanish, German, Italian, Arabic, Russian, and Chinese.

Other Magistral Models

No related models.

Evaluation Benchmarks

No evaluation benchmarks are available for Magistral Small.

Model Transparency

Overall score: B+ (75 / 100)
Upstream: 21.5 / 30
Model: 28.5 / 40
Downstream: 24.5 / 30

GPU Requirements

Required VRAM depends on the quantization method selected for the model weights and on the context size in tokens.
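
As a rough guide, the VRAM taken by the weights alone can be estimated from the 24B parameter count and the bytes stored per parameter. The sketch below is a back-of-the-envelope calculation, not the page's calculator. The quantization names and byte widths are common conventions assumed here, and the result excludes the key-value cache, activations, and runtime overhead.

```python
# Assumed storage cost per parameter for a few common weight formats.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_vram_gib(n_params=24e9, quant="fp16"):
    """Approximate GiB needed to hold the model weights only."""
    return n_params * BYTES_PER_PARAM[quant] / 2**30

for quant in BYTES_PER_PARAM:
    print(f"{quant}: {weight_vram_gib(quant=quant):.1f} GiB")
```

A real deployment needs headroom beyond these figures, and the extra memory grows with context size because the key-value cache scales linearly with the number of tokens.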

Resources

Official documentation, release notes, paper, model weights, source code
