Phi-3-medium: Specifications and GPU VRAM Requirements

Phi-3-medium

Open source

Open weights

Parameters: 14B

Context length: 128K

Modality: Text

Architecture: Dense

License: MIT

Release date: 22 Apr 2024

Knowledge cutoff: Oct 2023

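Technical Specifications

Attention structure: Grouped-Query Attention

Hidden dimension size: 5120

Number of layers:

Attention heads:

Key-value heads:

Activation function:

Normalization: RMS Normalization

Position embedding: RoPE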

System Requirements

VRAM requirements for different quantization methods and context sizes

Phi-3-medium

Phi-3-medium is a compact, high-performance large language model developed by Microsoft as part of the Phi-3 family. With 14 billion parameters, it targets commercial and research applications, particularly memory- or compute-constrained environments and latency-sensitive scenarios. The model is designed for strong reasoning, notably in mathematics, logic, and code generation, which makes it a suitable building block for generative AI features.

Phi-3-medium is trained on a high-quality, reasoning-dense dataset, a refined and scaled-up version of the data used for its predecessor, Phi-2. The dataset combines heavily filtered publicly available web content with synthetically generated data. Post-training includes supervised fine-tuning (SFT) and direct preference optimization (DPO) to improve instruction following and reinforce safety measures.

The model uses a dense decoder-only Transformer architecture, a common structure for autoregressive language modeling. It includes Grouped-Query Attention (GQA) to reduce key-value cache memory, Root Mean Square (RMS) normalization for stable training, and Rotary Position Embeddings (RoPE) to encode token positions. A RoPE variant, LongRoPE, extends the supported context length up to 128,000 tokens. Phi-3-medium is optimized for deployment on graphics processing units (GPUs), central processing units (CPUs), and mobile devices, often through ONNX Runtime and DirectML for cross-platform inference.
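To make the memory effect of GQA concrete, the following is a minimal sketch that estimates key-value cache size for one sequence. The hidden size of 5120 comes from the specification list on this page; the layer count, attention head count, and key-value head count are assumptions chosen for illustration, since this page does not list them, and the function name `kv_cache_bytes` is hypothetical.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens,
                   bytes_per_value=2):
    """Estimate bytes needed to cache keys and values for one sequence."""
    # Factor 2: each layer stores one key tensor and one value tensor.
    return (2 * num_layers * num_kv_heads * head_dim
            * context_tokens * bytes_per_value)


HIDDEN_SIZE = 5120    # from the specification list on this page
NUM_LAYERS = 40       # assumption, not listed on this page
NUM_HEADS = 40        # assumption, not listed on this page
NUM_KV_HEADS = 10     # assumption, not listed on this page
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS
CONTEXT_TOKENS = 128_000

# Standard multi-head attention caches one K/V pair per query head.
mha = kv_cache_bytes(NUM_LAYERS, NUM_HEADS, HEAD_DIM, CONTEXT_TOKENS)
# GQA shares each K/V head across a group of query heads.
gqa = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, CONTEXT_TOKENS)

print(f"MHA cache: {mha / 2**30:.1f} GiB")
print(f"GQA cache: {gqa / 2**30:.1f} GiB")
```

Under these assumed values, the cache shrinks by the ratio of attention heads to key-value heads. The estimate uses 2 bytes per value (16-bit precision) and ignores framework overhead.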

About Phi-3

Microsoft's Phi-3 models are small language models designed to run efficiently on resource-constrained devices. They use a Transformer decoder architecture and are trained on extensively filtered, high-quality data, including synthetic data. This approach yields a compact yet capable model family.

Other Phi-3 Models

Evaluation Benchmarks

Rankings apply to local LLMs.

Rank

#21

Benchmark	Score	Rank
General Knowledge (MMLU)	0.66	14

Coding rank

GPU Requirements

Quantization

Choose the quantization method for the model weights.

Context size: 1,024 tokens (slider marks at 63k and 125k)

VRAM required:
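As a rough guide to how the quantization choice affects memory, the sketch below estimates the VRAM needed for the model weights alone. It assumes the 14B parameter count stated on this page; the format names and bit widths are illustrative assumptions, and the helper `weight_memory_gib` is hypothetical. Real quantization formats add overhead for scales and zero points, and the key-value cache and activations need additional memory that grows with context size.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Approximate memory for model weights in GiB, excluding overhead."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


NUM_PARAMS = 14e9  # 14B parameters, as stated on this page

# Illustrative bit widths; actual formats vary by runtime and method.
FORMATS = {"FP16": 16, "INT8": 8, "INT4": 4}

estimates = {name: weight_memory_gib(NUM_PARAMS, bits)
             for name, bits in FORMATS.items()}

for name, gib in estimates.items():
    print(f"{name}: about {gib:.1f} GiB for weights")
```

Halving the bits per weight halves the weight footprint, which is why lower-bit quantization is the usual first step when a model does not fit on a given GPU.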

Resources

Official documentation, release notes, paper, model weights, source code
