
Phi-4

Parameters: 14B
Context Length: 16K
Modality: Text
Architecture: Dense
License: MIT License
Release Date: 13 Dec 2024
Knowledge Cutoff: Nov 2024

Technical Specifications

Attention Structure: Grouped-Query Attention
Hidden Dimension Size: 3072
Number of Layers: 40
Attention Heads: 24
Key-Value Heads: 8
Activation Function: -
Normalization: -
Position Embedding: RoPE
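To make the attention layout above concrete, the following is a minimal grouped-query attention sketch using the listed dimensions (hidden size 3072, 24 query heads, 8 key-value heads, hence a head dimension of 128). It is written in PyTorch and deliberately simplified: it omits RoPE, masking, and KV caching, so it illustrates the head-sharing pattern rather than Phi-4's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedQueryAttention(nn.Module):
    """Simplified GQA block using the dimensions listed above (no RoPE, mask, or KV cache)."""

    def __init__(self, hidden_size=3072, n_heads=24, n_kv_heads=8):
        super().__init__()
        self.n_heads = n_heads
        self.n_kv_heads = n_kv_heads
        self.head_dim = hidden_size // n_heads          # 3072 / 24 = 128
        self.q_proj = nn.Linear(hidden_size, n_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(hidden_size, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(hidden_size, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * self.head_dim, hidden_size, bias=False)

    def forward(self, x):                               # x: (batch, seq, hidden)
        b, s, _ = x.shape
        q = self.q_proj(x).view(b, s, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, s, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, s, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Each key-value head serves n_heads / n_kv_heads = 3 query heads.
        k = k.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        v = v.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        out = F.scaled_dot_product_attention(q, k, v)
        return self.o_proj(out.transpose(1, 2).reshape(b, s, -1))

attn = GroupedQueryAttention()
print(attn(torch.randn(1, 16, 3072)).shape)  # torch.Size([1, 16, 3072])
```

With 8 key-value heads serving 24 query heads, the key-value cache is one third the size it would be under standard multi-head attention with the same head count.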

System Requirements

VRAM requirements for different quantization methods and context sizes

Phi-4

Microsoft Phi-4 is a 14 billion parameter decoder-only Transformer model, developed as the latest iteration in Microsoft's series of small language models (SLMs). The model's primary objective is to deliver advanced reasoning capabilities efficiently, enabling deployment in environments with limited compute and memory, and for latency-sensitive applications. Phi-4 is designed to handle complex logical and mathematical tasks, along with general language processing, by focusing on the quality of its training data rather than solely on model scale.

A key innovation in Phi-4's architecture and training methodology lies in its strategic use of high-quality synthetic data, which constitutes a significant portion of its training corpus. This synthetic data, generated using techniques such as multi-agent prompting, instruction reversal, and self-revision workflows, is complemented by meticulously curated organic data from web content, academic books, and code repositories. This approach enables Phi-4 to acquire strong reasoning and problem-solving abilities, often surpassing models with larger parameter counts. The model's architecture retains a similar structure to its predecessor, Phi-3, but includes enhancements such as an extended context length.

Phi-4 supports a 16K-token context length, allowing it to process and generate long-form content. Its design prioritizes efficiency and robust performance in tasks requiring logical deduction, code generation, and scientific understanding. The model is intended for research and development, serving as a foundational component for generative AI features, particularly in applications that demand strong reasoning under resource or latency constraints.
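As an illustration of the kind of local deployment described above, here is a minimal inference sketch using the Hugging Face transformers library. The checkpoint id `microsoft/phi-4`, the bfloat16 precision, and the chat-style prompt are assumptions made for this example; check the official model card for the exact usage.

```python
# Minimal local-inference sketch with Hugging Face transformers.
# Assumptions: the "microsoft/phi-4" checkpoint id and chat-template support.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half precision to reduce VRAM
    device_map="auto",            # place layers on available GPU(s)
)

messages = [{"role": "user", "content": "Prove that the square root of 2 is irrational."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```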

About Phi-4

The Microsoft Phi-4 family comprises small language models that prioritize efficient, high-capability reasoning. Their development emphasizes rigorous data curation and the integration of high-quality synthetic data, an approach that enables strong performance and on-device deployment.



Evaluation Benchmarks

Rankings apply to local LLMs.

Benchmark Score Rankings
Professional Knowledge (MMLU Pro): 0.70, rank 10
Graduate-Level QA (GPQA): 0.56, rank 10
-: 0.39, rank 17
General Knowledge (MMLU): 0.56, rank 18
-: 0.43, rank 21
-: 0.29, rank 24
-: 0.45, rank 26

Rankings

Overall Ranking: #36
Coding Ranking: #33

GPU Requirements

Required VRAM and recommended GPUs depend on the chosen weight quantization method and context size (1k, 8k, or 16k tokens); see the full calculator for exact figures.
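As a rough back-of-the-envelope alternative to the calculator, the sketch below estimates VRAM from the figures listed on this page (14B parameters, 40 layers, 8 key-value heads, and a head dimension of 128 from the 3072/24 split). The per-parameter byte costs, FP16 KV cache, and 10% runtime overhead are assumptions, so treat the output as an approximation rather than the calculator's exact result.

```python
def estimate_vram_gib(
    n_params: float = 14e9,        # parameter count listed above (14B)
    bytes_per_param: float = 2.0,  # ~2.0 for FP16/BF16 weights, ~0.5 for 4-bit quantization (assumption)
    n_layers: int = 40,            # layer count listed above
    n_kv_heads: int = 8,           # key-value heads listed above (GQA)
    head_dim: int = 128,           # hidden size 3072 / 24 attention heads
    context_len: int = 16_384,     # context window in tokens
    kv_bytes: float = 2.0,         # FP16 KV cache (assumption)
    overhead: float = 1.1,         # ~10% for activations and runtime buffers (assumption)
) -> float:
    """Back-of-the-envelope VRAM estimate: weights + KV cache + overhead, in GiB."""
    weight_bytes = n_params * bytes_per_param
    # KV cache: 2 tensors (K and V) per layer, one per KV head, per token.
    kv_cache_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bytes
    return (weight_bytes + kv_cache_bytes) * overhead / 1024**3

# Example: FP16 weights at the full 16K context vs. 4-bit weights at a 1K context.
print(f"FP16, 16K ctx: ~{estimate_vram_gib():.1f} GiB")
print(f"4-bit, 1K ctx: ~{estimate_vram_gib(bytes_per_param=0.5, context_len=1024):.1f} GiB")
```

Because grouped-query attention keeps only 8 key-value heads, the KV-cache term stays small relative to the weights even at the full 16K context, which is why weight quantization dominates the overall VRAM requirement.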