Phi-3-small: Specifications and GPU VRAM Requirements

Phi-3-small

Open source

Open weights

Parameters

Context length

8,192 tokens

Modality

Text

Architecture

Dense

License

MIT License

Release date

22 Apr 2024

Training data cutoff

Oct 2023

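Technical specifications

Attention structure

Grouped-Query Attention

Hidden dimension size

4096

Number of layers

Attention heads

Key-value heads

Activation function

Normalization

Position embedding

RoPE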

System requirements

VRAM requirements for different quantization methods and context sizes

Phi-3-small

Microsoft's Phi-3-small is a member of the Phi family of small language models (SLMs), built to deliver strong performance within a compact computational footprint. With 7 billion parameters, it targets commercial and research applications where resource efficiency and responsiveness are critical. Typical scenarios call for language understanding, logical reasoning, and efficient processing on constrained hardware, including on-device deployments.

Phi-3-small is a dense, decoder-only Transformer. Several design choices target performance and memory efficiency. It uses Grouped-Query Attention (GQA) in which four query heads share a single key-value head, which reduces the KV cache footprint. It also alternates dense and blocksparse attention layers, which further improves memory efficiency while preserving long-context retrieval. Post-training includes Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to align the model with human preferences and safety guidelines.
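To make the GQA saving concrete, here is a minimal sketch that compares the KV cache size of standard multi-head attention with the 4-to-1 grouping described above. The hidden size (4096), grouping ratio, and 8,192-token context come from this page. The layer count (32) and query-head count (32) are assumptions, since this page does not list them, and the function name `kv_cache_bytes` is illustrative. The estimate treats every layer as dense attention, so it ignores any extra saving from the blocksparse layers.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence (fp16 by default)."""
    # Factor 2: each layer stores one key tensor and one value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * context_tokens * bytes_per_value


HIDDEN_SIZE = 4096         # from this page
QUERIES_PER_KV_HEAD = 4    # from this page: four query heads share one KV head
CONTEXT_TOKENS = 8192      # from this page: default context length
NUM_LAYERS = 32            # assumption: not listed on this page
NUM_QUERY_HEADS = 32       # assumption: not listed on this page

head_dim = HIDDEN_SIZE // NUM_QUERY_HEADS
num_kv_heads = NUM_QUERY_HEADS // QUERIES_PER_KV_HEAD

# Multi-head attention keeps one KV head per query head; GQA keeps one per group.
mha = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, head_dim, CONTEXT_TOKENS)
gqa = kv_cache_bytes(NUM_LAYERS, num_kv_heads, head_dim, CONTEXT_TOKENS)

print(f"MHA: {mha / 2**30:.2f} GiB, GQA: {gqa / 2**30:.2f} GiB")
```

Under these assumptions the cache shrinks by exactly the grouping ratio, a factor of four, because the cache size is linear in the number of key-value heads.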

Phi-3-small has a default context length of 8,192 tokens (8K); an extended variant supports up to 128K tokens using LongRoPE. The model was trained on 4.8 trillion tokens drawn from rigorously filtered public documents, high-quality educational content, and synthetic data, with an emphasis on data quality and reasoning density. It is intended for tasks such as language understanding, mathematical problem-solving, and code generation, and for deployment across hardware ranging from cloud-based inference to edge and mobile devices.

About Phi-3

Microsoft's Phi-3 models are small language models designed to run efficiently on resource-constrained devices. They use a Transformer decoder architecture and are trained on heavily filtered, high-quality data, including synthetic data. This approach yields a compact yet capable model family.

Evaluation benchmarks

Rankings apply to local LLMs.

Rank

#28

Benchmark	Score	Rank
General Knowledge MMLU	0.56	20

Coding rank

GPU requirements
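As a rough guide to how the quantization method drives VRAM, the sketch below estimates the memory needed for the model weights at a few common bit widths. The 7 billion parameter count comes from this page. The bit widths are nominal, and the 10% runtime overhead and the function names are hypothetical values chosen for illustration, not measured figures. Real quantized formats also store per-block scales, so actual usage is somewhat higher.

```python
# Nominal storage cost per weight, in bits. Real quantized formats add
# per-block metadata, so treat the results as lower bounds.
BITS_PER_WEIGHT = {"fp16": 16, "int8": 8, "int4": 4}

NUM_PARAMS = 7e9  # from this page: 7 billion parameters


def weight_vram_gib(num_params, bits_per_weight):
    """Memory for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


def total_vram_gib(num_params, bits_per_weight, kv_cache_gib=0.0, overhead_fraction=0.1):
    """Weights plus KV cache, scaled by a hypothetical runtime overhead."""
    base = weight_vram_gib(num_params, bits_per_weight) + kv_cache_gib
    return base * (1 + overhead_fraction)


for name, bits in BITS_PER_WEIGHT.items():
    weights = weight_vram_gib(NUM_PARAMS, bits)
    total = total_vram_gib(NUM_PARAMS, bits)
    print(f"{name}: weights ~{weights:.1f} GiB, with overhead ~{total:.1f} GiB")
```

The `kv_cache_gib` argument lets a context-dependent cache estimate be added on top of the weights, since longer contexts raise the requirement even when the quantization method stays the same.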
