Phi-4: Specifications and GPU VRAM Requirements

Phi-4

Open Source

Open Weights

Parameters: 14B
Context Length: 16K
Modality: Text
Architecture: Dense
License: MIT License
Release Date: 13 Dec 2024
Knowledge Cutoff: Nov 2024

Technical Specifications

Attention Structure: Grouped-Query Attention
Hidden Dimension Size: 5120
Number of Layers:
Attention Heads:
Key-Value Heads:
Activation Function:
Normalization:
Position Embedding: RoPE

System Requirements

VRAM requirements for different quantization methods and context sizes
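
As a rough guide, the memory needed for the weights alone scales with the parameter count times the bits stored per weight. The sketch below applies that arithmetic to the 14B figure from the spec card. The format names and nominal bit-widths (`FP16`, `INT8`, `INT4`) are illustrative assumptions rather than this page's calculator options, and real quantized files carry extra scale and zero-point data, so actual usage will be somewhat higher.

```python
# Back-of-the-envelope estimate of weight memory per quantization level.
# Assumption: nominal bits per weight; real formats add per-block metadata.

N_PARAMS = 14e9  # "14B" from the spec card; the exact count differs slightly

# Hypothetical labels and nominal bit-widths, chosen for illustration only.
QUANT_BITS = {"FP16": 16, "INT8": 8, "INT4": 4}


def weight_vram_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory for the model weights alone, in GiB (2**30 bytes)."""
    return n_params * bits_per_weight / 8 / 2**30


def weight_table(n_params: float = N_PARAMS, quant_bits: dict = QUANT_BITS) -> dict:
    """Map each quantization label to its estimated weight memory in GiB."""
    return {
        name: round(weight_vram_gib(n_params, bits), 1)
        for name, bits in quant_bits.items()
    }


if __name__ == "__main__":
    for name, gib in weight_table().items():
        print(f"{name}: ~{gib} GiB for weights")
```

These figures cover weights only. Activations, the KV cache, and framework overhead come on top, so a GPU needs headroom beyond the number printed here.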

Phi-4

Microsoft Phi-4 is a 14-billion-parameter, decoder-only Transformer and the latest iteration in Microsoft's series of small language models (SLMs). It aims to deliver strong reasoning efficiently, so that it can be deployed where compute and memory are limited or where latency matters. Phi-4 targets complex logical and mathematical tasks as well as general language processing, relying on the quality of its training data rather than on model scale alone.

Phi-4's training methodology relies heavily on high-quality synthetic data, which makes up a significant portion of its training corpus. This data is generated with techniques such as multi-agent prompting, instruction reversal, and self-revision workflows, and is complemented by carefully curated organic data from web content, academic books, and code repositories. This approach gives Phi-4 strong reasoning and problem-solving abilities, and it often surpasses models with larger parameter counts. Its architecture is similar to that of its predecessor, Phi-3, with enhancements such as an extended context length.

Phi-4 supports a 16K-token context length, which lets it process and generate long-form content. Its design prioritizes efficiency and robust performance in tasks requiring logical deduction, code generation, and scientific understanding. The model is intended for research and development and as a building block for generative AI features, particularly in applications that demand strong reasoning under resource or latency constraints.
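
Context length also has a memory cost: the key-value (KV) cache grows linearly with the number of tokens, and grouped-query attention reduces it by sharing each key/value head across several query heads. The sketch below estimates that cache size. The layer count, KV-head count, and head dimension are not listed on this page; the values in `ASSUMED_CONFIG` are assumptions for illustration and should be replaced with the numbers from the model's published configuration.

```python
# Rough KV-cache size estimate for a decoder-only Transformer.
# All architecture numbers below are ASSUMPTIONS, not taken from this page.

ASSUMED_CONFIG = {
    "n_layers": 40,     # assumed number of Transformer layers
    "n_kv_heads": 10,   # assumed key-value heads (grouped-query attention)
    "head_dim": 128,    # assumed dimension per attention head
}


def kv_cache_gib(
    context_tokens: int,
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    bytes_per_value: int = 2,  # 2 bytes for FP16/BF16 cache entries
    batch_size: int = 1,
) -> float:
    """KV-cache size in GiB: keys and values for every layer and token."""
    values_per_token = 2 * n_layers * n_kv_heads * head_dim  # 2 = keys + values
    total_bytes = values_per_token * context_tokens * bytes_per_value * batch_size
    return total_bytes / 2**30


if __name__ == "__main__":
    for tokens in (1024, 4096, 16384):
        gib = kv_cache_gib(tokens, **ASSUMED_CONFIG)
        print(f"{tokens:>6} tokens: ~{gib:.2f} GiB KV cache")

    # For comparison: the same model with one KV head per query head
    # (assuming 40 query heads), i.e. without grouped-query attention.
    full = kv_cache_gib(16384, n_layers=40, n_kv_heads=40, head_dim=128)
    print(f"Without GQA at 16384 tokens: ~{full:.2f} GiB")
```

Under these assumptions the cache stays small next to the weights even at the full context window, which is the practical benefit of using fewer key-value heads.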

About Phi-4

The Microsoft Phi-4 model family comprises small language models prioritizing efficient, high-capability reasoning. Its development emphasizes robust data quality and sophisticated synthetic data integration. This approach enables enhanced performance and on-device deployment capabilities.

Evaluation Benchmarks

Ranking is for Local LLMs.

Rank: #42

Category	Benchmark	Score	Rank
Graduate-Level QA	GPQA	0.56	10
Reasoning	LiveBench Reasoning	0.39	19
Professional Knowledge	MMLU Pro	0.74	19
General Knowledge	MMLU	0.56	20
Mathematics	LiveBench Mathematics	0.43	23
Coding	LiveBench Coding	0.29	25
Data Analysis	LiveBench Data Analysis	0.45	28

Rankings

Overall Rank: #42
Coding Rank: #38
