
Phi-4 Reasoning Plus

Parameters

14B

Context Length

32,768 tokens

Modality

Text

Architecture

Dense

License

MIT

Release Date

30 Apr 2025

Knowledge Cutoff

Mar 2025

Technical Specifications

Attention Structure

Grouped-Query Attention

Hidden Dimension Size

5120

Number of Layers

40

Attention Heads

40

Key-Value Heads

10

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

Rotary Position Embedding (RoPE)
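
The figures above are enough to derive two quantities that matter for deployment: the per-head dimension and the size of the key-value cache. The following is a rough sketch, not an official calculator. It assumes the cache stores one key and one value vector per layer per key-value head, and that cached values are 16-bit (2 bytes each); the function names are illustrative.

```python
# Values copied from the specification list above.
HIDDEN_SIZE = 5120
NUM_LAYERS = 40
NUM_ATTENTION_HEADS = 40
NUM_KV_HEADS = 10


def head_dim(hidden_size=HIDDEN_SIZE, num_heads=NUM_ATTENTION_HEADS):
    """Dimension of a single attention head."""
    return hidden_size // num_heads


def kv_cache_bytes(num_tokens, bytes_per_value=2):
    """Estimated KV-cache size for one sequence.

    Assumption: 2 tensors (key and value) per layer, each of shape
    [num_kv_heads, head_dim] per token, stored at bytes_per_value.
    """
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * head_dim() * bytes_per_value
    return num_tokens * per_token


if __name__ == "__main__":
    print("head dimension:", head_dim())
    for tokens in (1024, 16384, 32768):
        gib = kv_cache_bytes(tokens) / 2**30
        print(f"{tokens:>6} tokens: {gib:.2f} GiB of KV cache")
```

Under these assumptions the cache costs 204,800 bytes per token, or 6.25 GiB at the full 32,768-token context. With 40 key-value heads instead of 10, the same estimate would be four times larger.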

Phi-4 Reasoning Plus

Phi-4 Reasoning Plus is a 14-billion-parameter language model from Microsoft built for chain-of-thought reasoning and multi-step logical inference. An enhanced variant in the Phi-4 family, it targets problem-solving in mathematics, science, and complex code generation. Its outputs are structured as an explicit reasoning trace followed by a final solution, which makes its decision process easier to inspect. The design favors output quality and depth for tasks where thoroughness matters more than response speed.
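
Because the output has two parts, downstream code usually needs to separate the reasoning trace from the final solution. The sketch below shows one way to do that. It assumes the trace is wrapped in `<think>` and `</think>` tags, which this page does not state; check the delimiter against the model's own chat template before relying on it. The function name `split_reasoning` is illustrative.

```python
import re

# Assumption: the reasoning trace is delimited by <think>...</think>.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(output: str):
    """Return (reasoning_trace, final_answer) from a raw model output.

    If no delimited trace is found, the trace is empty and the whole
    output is treated as the final answer.
    """
    match = THINK_PATTERN.search(output)
    if match is None:
        return "", output.strip()
    trace = match.group(1).strip()
    answer = output[match.end():].strip()
    return trace, answer


if __name__ == "__main__":
    raw = "<think>17 is odd and has no small divisors.</think>\n17 is prime."
    trace, answer = split_reasoning(raw)
    print("trace: ", trace)
    print("answer:", answer)
```

Keeping the trace separate lets an application log it for inspection while showing only the final answer to the user.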

The model uses a dense, decoder-only Transformer architecture with grouped-query attention (40 query heads sharing 10 key-value heads). It uses Rotary Position Embeddings (RoPE) and an expanded context window of 32,768 tokens, which lets it stay coherent over the long sequences that multi-step reasoning often requires. Training is data-centric: supervised fine-tuning (SFT) on over 1.4 million chain-of-thought traces, followed by reinforcement learning with the Group Relative Policy Optimization (GRPO) algorithm. The RL phase targets verifiable mathematical and logical problems, refining the model's ability to self-correct and explore alternative solutions.
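
The core idea of GRPO is to score each sampled answer relative to the other answers drawn for the same prompt, rather than against a learned value function. The following is a minimal sketch of that advantage computation only. It is not Microsoft's training code; the reward values, the use of the population standard deviation, and the `eps` term are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so
    answers better than the group average get a positive advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical verifiable rewards: 1.0 if the final answer checked
    # out as correct, 0.0 otherwise, for four samples of one prompt.
    rewards = [1.0, 0.0, 0.0, 1.0]
    for r, a in zip(rewards, group_relative_advantages(rewards)):
        print(f"reward={r:.1f} advantage={a:+.3f}")
```

When every sample in a group receives the same reward, all advantages are zero and the group contributes no learning signal, which is one reason verifiable problems of suitable difficulty matter for this kind of training.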

Phi-4 Reasoning Plus generates noticeably longer outputs than its sibling models: the 'plus' variant typically produces about 50% more tokens than Phi-4 Reasoning in order to give more exhaustive explanations. This raises latency, but it lets the model rival much larger systems on specialized benchmarks. The model is released under the MIT license with open weights, so it can be deployed locally and on consumer-grade hardware where compute is constrained but high-fidelity reasoning is still required.
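
To judge whether the weights fit on a given GPU, a back-of-the-envelope estimate is the parameter count times the bytes stored per parameter. The sketch below assumes exactly 14 billion parameters and idealized storage sizes per precision; real quantized formats add overhead for scales and metadata, and the KV cache and activations need memory on top of this.

```python
PARAMS = 14e9  # approximate parameter count from the spec list

# Assumption: idealized bytes per parameter, ignoring format overhead.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(precision, params=PARAMS):
    """Estimated memory for the model weights alone, in GB (10^9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: ~{weight_memory_gb(precision):.0f} GB for weights")
```

By this estimate the weights need roughly 28 GB at 16-bit precision, 14 GB at 8-bit, and 7 GB at 4-bit, which is why quantization is the usual route to running the model on a single consumer GPU.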

About Phi-4

The Microsoft Phi-4 model family comprises small language models prioritizing efficient, high-capability reasoning. Its development emphasizes robust data quality and sophisticated synthetic data integration. This approach enables enhanced performance and on-device deployment capabilities.


Evaluation Benchmarks

Rank

#81

Category                Benchmark   Score   Rank
Professional Knowledge  MMLU Pro    0.76    16

Rankings

Overall Rank

#81

Coding Rank

-
