| Attribute | Value |
|---|---|
| Attention structure | Grouped-Query Attention |
| Hidden dimension size | 3072 |
| Number of layers | 32 |
| Attention heads | 32 |
| Key-value heads | 8 |
| Activation function | - |
| Normalization | - |
| Positional embedding | RoPE |
VRAM requirements by quantization method and context size
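These requirements can be approximated from the architecture values above: quantized weight storage plus the KV cache, which GQA shrinks to a quarter of what standard multi-head attention would need (8 KV heads instead of 32). Below is a rough back-of-the-envelope sketch in Python; the ~3.8B parameter count is an assumption based on the published model size, and real usage is higher once activations and framework overhead are included:

```python
# Rough VRAM estimate for Phi-3-mini. N_PARAMS is an assumption (~3.8B as
# published); layer/head/dim values come from the specification table above.
N_PARAMS = 3.8e9
N_LAYERS = 32
N_KV_HEADS = 8
HEAD_DIM = 3072 // 32   # hidden size / attention heads = 96

def weights_gb(bits_per_weight: float) -> float:
    """Memory for the quantized weights alone."""
    return N_PARAMS * bits_per_weight / 8 / 1024**3

def kv_cache_gb(context_len: int, bytes_per_elem: int = 2) -> float:
    """K and V caches at fp16: 2 * layers * kv_heads * head_dim * tokens * bytes."""
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * context_len * bytes_per_elem / 1024**3

for bits, label in [(16, "fp16"), (8, "int8"), (4, "int4")]:
    for ctx in (4096, 131072):   # the 4K and 128K context variants
        total = weights_gb(bits) + kv_cache_gb(ctx)
        print(f"{label:>4} weights + {ctx:>6}-token KV cache ~= {total:.1f} GB")
```

For example, fp16 weights alone come to about 7.6 GB, and a full 128K-token fp16 KV cache adds roughly 12 GB, which is why the long-context variant is much more memory-hungry than the default 4K version.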
Microsoft's Phi-3-mini is a lightweight, state-of-the-art small language model (SLM) designed to deliver high performance within resource-constrained environments, including mobile and edge devices. It is a foundational component of the Phi-3 model family, aiming to offer compelling capabilities at a significantly smaller scale compared to larger models. The model serves as a practical solution for scenarios where computational efficiency and reduced operational costs are paramount, thereby broadening the accessibility of advanced AI.
Architecturally, Phi-3-mini is a dense decoder-only Transformer model. Its training methodology is a key innovation, utilizing a meticulously curated dataset that is a scaled-up version of the one employed for Phi-2. This dataset comprises heavily filtered publicly available web data and synthetic "textbook-quality" data, intentionally designed to foster strong reasoning and knowledge acquisition. The model undergoes a rigorous post-training process, incorporating both Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to enhance instruction adherence, robustness, and safety alignment. It features a hidden dimension size of 3072, 32 layers, 32 attention heads, and leverages grouped-query attention (GQA) with 8 key-value heads.
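To make the attention layout concrete, here is a minimal sketch of grouped-query attention using Phi-3-mini's published shapes (32 query heads sharing 8 key/value heads, head dimension 3072/32 = 96). It deliberately omits the causal mask, RoPE, and the learned projections, so it illustrates only the head-grouping mechanics rather than the model's actual implementation:

```python
import torch

# Shape parameters from the specification table above
HIDDEN = 3072
N_HEADS = 32                      # query heads
N_KV_HEADS = 8                    # key/value heads (GQA)
HEAD_DIM = HIDDEN // N_HEADS      # 96

def gqa(q, k, v):
    """Each of the 8 KV heads serves a group of 32/8 = 4 query heads."""
    b, s, _ = q.shape
    q = q.view(b, s, N_HEADS, HEAD_DIM).transpose(1, 2)       # (b, 32, s, 96)
    k = k.view(b, s, N_KV_HEADS, HEAD_DIM).transpose(1, 2)    # (b, 8,  s, 96)
    v = v.view(b, s, N_KV_HEADS, HEAD_DIM).transpose(1, 2)
    group = N_HEADS // N_KV_HEADS                             # 4
    k = k.repeat_interleave(group, dim=1)                     # broadcast to 32 heads
    v = v.repeat_interleave(group, dim=1)
    attn = torch.softmax(q @ k.transpose(-2, -1) / HEAD_DIM ** 0.5, dim=-1)
    return (attn @ v).transpose(1, 2).reshape(b, s, HIDDEN)

q = torch.randn(1, 16, HIDDEN)
k = torch.randn(1, 16, N_KV_HEADS * HEAD_DIM)   # KV projections are 8 * 96 = 768 dims
v = torch.randn(1, 16, N_KV_HEADS * HEAD_DIM)
print(gqa(q, k, v).shape)                        # torch.Size([1, 16, 3072])
```

The design choice to project K and V to only 768 dimensions (versus 3072 for Q) is what shrinks the KV cache by 4x, directly enabling the memory savings discussed under the VRAM section above.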
Phi-3-mini is primarily intended for broad commercial and research applications that require strong reasoning abilities, particularly in areas such as mathematics and logic. Its compact size facilitates deployment in latency-bound scenarios and on hardware with limited memory and compute capabilities, such as mobile phones and IoT devices. The model is available in two context length variants: a default 4K token version and a 128K token version (Phi-3-mini-128K), which utilizes LongRoPE for extended context handling. These characteristics make it suitable for diverse use cases ranging from general-purpose AI systems to specialized applications where efficient local inference is a requirement.
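For local experimentation, both variants are published on the Hugging Face Hub. A minimal sketch using the transformers library follows; the model IDs are those released by Microsoft, and depending on your transformers version, `trust_remote_code=True` may be required when loading:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pick the 4K or the 128K (LongRoPE) context variant.
model_id = "microsoft/Phi-3-mini-4k-instruct"  # or "microsoft/Phi-3-mini-128k-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load fp16/bf16 weights where available
    device_map="auto",    # place layers on available GPU(s)/CPU
)

messages = [
    {"role": "user",
     "content": "A train travels 60 km in 45 minutes. What is its average speed in km/h?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```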
Microsoft's Phi-3 models are small language models designed for efficient operation on resource-constrained devices. They utilize a Transformer decoder architecture and are trained on extensively filtered, high-quality data, including synthetic data. This approach yields a compact yet capable model family.
Rankings apply to local LLMs.
Rank: #28
| Benchmark | Score | Rank |
|---|---|---|
| General Knowledge (MMLU) | 0.52 | 20 |