Parameters
14B
Context Length
32,768 tokens
Modality
Text
Architecture
Dense
License
MIT
Release Date
30 Apr 2025
Knowledge Cutoff
Mar 2025
Attention Structure
Grouped-Query Attention
Hidden Dimension Size
5120
Number of Layers
40
Attention Heads
40
Key-Value Heads
10
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
Rotary Position Embedding (RoPE)
Phi-4 Reasoning Plus is a 14-billion-parameter language model from Microsoft, built for chain-of-thought reasoning and logical inference. As an enhanced variant in the Phi-4 family, it targets multi-step problem-solving in mathematics, science, and code generation. Its output is structured as an explicit reasoning trace followed by a final solution, which makes the path to each answer inspectable. The design favors output quality and depth for tasks where thoroughness matters more than response speed.
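Because the output has two parts, downstream code usually needs to separate the trace from the answer. The sketch below shows one way to do that. It assumes the trace is wrapped in `<think>...</think>` tags, which the text above does not state, and `split_response` and `ParsedResponse` are hypothetical names chosen for illustration.

```python
import re
from typing import NamedTuple


class ParsedResponse(NamedTuple):
    reasoning: str  # text of the reasoning trace ("" if none was found)
    answer: str     # everything that follows the trace


# Assumption: the trace is delimited by <think>...</think>.
# Adjust the pattern if your deployment uses different markers.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_response(text: str) -> ParsedResponse:
    """Separate the reasoning trace from the final solution."""
    match = _THINK_BLOCK.search(text)
    if match is None:
        # No trace found: treat the whole output as the answer.
        return ParsedResponse("", text.strip())
    return ParsedResponse(match.group(1).strip(), text[match.end():].strip())


sample = "<think>2 + 2 is 4, and 4 * 3 is 12.</think>\nThe result is 12."
parsed = split_response(sample)
print(parsed.answer)
```

Keeping the trace as a separate field lets an application log or display it for inspection while passing only the final answer to later steps.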
The model uses a dense, decoder-only Transformer architecture. Its 40 query heads share 10 key-value heads, which is a grouped-query attention (GQA) layout rather than plain multi-head attention. It incorporates Rotary Position Embeddings (RoPE) and an expanded context window of 32,768 tokens, allowing it to stay coherent over the long sequences that multi-step reasoning often requires. Training is data-centric: supervised fine-tuning (SFT) on over 1.4 million chain-of-thought traces, followed by reinforcement learning with the Group Relative Policy Optimization (GRPO) algorithm. The RL phase targets verifiable mathematical and logical problems and refines the model's ability to self-correct and explore alternative solutions.
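To make the GRPO step more concrete, the sketch below shows the group-relative advantage computation that gives the algorithm its name: several answers are sampled for one prompt, each is scored, and each score is normalized against the group's mean and standard deviation. This is a generic illustration, not Microsoft's training code. The function name, the use of the population standard deviation, and the `eps` term are assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against the other samples for the same prompt.

    Assumption: population standard deviation plus a small eps to avoid
    division by zero when every sample receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one verifiable math problem: 1.0 = correct, 0.0 = wrong.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)  # correct answers get positive values, wrong ones negative
```

Because the baseline comes from the group itself, this formulation needs no separate learned value model. When every sample in a group gets the same reward, all advantages are zero and that prompt contributes no learning signal.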
Phi-4 Reasoning Plus generates noticeably more tokens than Phi-4 Reasoning: the "plus" variant typically produces about 50% more tokens in order to give more exhaustive explanations. This raises latency, but it lets the model rival much larger systems on specialized benchmarks. The model is released under the MIT license with open weights, which makes it suitable for consumer-grade hardware and local deployments where compute is constrained but high-fidelity reasoning is required.
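Whether the model fits on a given machine depends mostly on weight precision and on the key-value cache, which grows with context length. The sketch below gives a rough estimate from the specifications listed above. It assumes the nominal 14B parameter count, a head dimension equal to the hidden size divided by the number of attention heads, and a 16-bit cache. It ignores activations, runtime overhead, and quantization metadata, so treat the results as lower bounds.

```python
# Values taken from the specification list; PARAMS is the nominal count.
PARAMS = 14e9
HIDDEN = 5120
LAYERS = 40
HEADS = 40
KV_HEADS = 10
CONTEXT = 32_768
GIB = 2**30


def weight_bytes(bits_per_weight: float) -> float:
    """Approximate size of the weights at a given precision."""
    return PARAMS * bits_per_weight / 8


def kv_cache_bytes(tokens: int, bytes_per_value: int = 2) -> int:
    """Approximate key-value cache size for one sequence."""
    head_dim = HIDDEN // HEADS  # assumption: 5120 / 40 = 128
    # Factor 2 covers keys and values, stored for every layer and KV head.
    return 2 * LAYERS * KV_HEADS * head_dim * bytes_per_value * tokens


for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_bytes(bits) / GIB:.1f} GiB")

print(f"KV cache at full context: {kv_cache_bytes(CONTEXT) / GIB:.2f} GiB")  # 6.25 GiB
```

With 10 key-value heads instead of 40, the cache is a quarter of what a full multi-head layout with the same dimensions would need, which matters for the long outputs this model tends to produce.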
The Microsoft Phi-4 model family comprises small language models prioritizing efficient, high-capability reasoning. Its development emphasizes robust data quality and sophisticated synthetic data integration. This approach enables enhanced performance and on-device deployment capabilities.
| Benchmark | Score | Rank |
|---|---|---|
| Professional Knowledge: MMLU Pro | 0.76 | 16 |
Overall Rank
#81
Coding Rank
-