GPT-5.4 mini

Parameters

-

Context Length

400K

Modality

Multimodal

Architecture

Dense

License

Proprietary

Release Date

17 Mar 2026

Knowledge Cutoff

31 Aug 2025

Technical Specifications

Attention

Attention Structure

Multi-Head Attention

Attention Heads

-

Key-Value Heads

-

Attention Head Dimension

-

Position Embedding

Absolute Position Embedding

RoPE Theta

-

Sliding Window Attention

-

Sliding Window Size

-

Normalization

-

Activation Function

-

Dimensions

Hidden Dimension Size

-

Number of Layers

-

FFN Intermediate Size (Dense)

-

Multi-Token Prediction Heads

-

Tokenizer

Vocabulary Size

~200K (o200k_base)

GPT-5.4 mini

GPT-5.4 mini is OpenAI's fast, efficient small model, bringing many GPT-5.4 capabilities to high-volume workloads. It improves significantly on GPT-5 mini in coding, reasoning, multimodal understanding, and tool use while running more than 2x faster. It approaches GPT-5.4 performance on SWE-Bench Pro (54.4%) and OSWorld-Verified (72.1%). The model supports text and image inputs, tool use, function calling, web search, file search, computer use, and skills, with a 400K-token context window. Pricing is $0.75 per million input tokens and $4.50 per million output tokens. It is available via the API as gpt-5.4-mini and was released on March 17, 2026.
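The per-token prices quoted above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the function name and the example token counts are assumptions, and it ignores any caching or batch discounts, which this card does not describe.

```python
# Prices quoted in this card, in USD per one million tokens.
INPUT_USD_PER_M = 0.75
OUTPUT_USD_PER_M = 4.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the list-price cost of one request in USD.

    Assumes flat per-token pricing with no discounts (an assumption,
    not something this card confirms).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 20,000 prompt tokens and 2,000 completion tokens.
example_cost = estimate_cost_usd(20_000, 2_000)
print(f"${example_cost:.4f}")
```

At these rates, output tokens cost six times as much as input tokens, so long completions tend to dominate the bill.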

About GPT-5.4

GPT-5.4 is OpenAI's most capable and efficient frontier model for professional work, combining the industry-leading coding capabilities of GPT-5.3-Codex with major advances in reasoning, computer use, and agentic workflows. It introduces native computer-use capabilities, tool search for large tool ecosystems, and substantially improved knowledge work (spreadsheets, presentations, documents), and OpenAI describes it as its most factual and token-efficient reasoning model. It supports up to 1M context tokens in Codex and was released on March 5, 2026.


Evaluation Benchmarks

Rank

#41

Benchmark   Score   Rank

General Text

Text Arena

1457

19

0.75

22

0.72

25

0.79

25

Web Development

WebDev Arena

1399

36

Rankings

Overall Rank

#41

Coding Rank

#46

Model Integrity

Total Score

34 / 100 (F)

GPT-5.4 mini Model Integrity Report

Audit Note

GPT-5.4 mini is highly opaque: OpenAI discloses nothing about its internal architecture, parameter count, or training compute. API versioning is clear and self-identification is consistent, but the performance and efficiency claims rest almost entirely on vendor-reported figures that cannot be independently verified. With no technical paper and no data provenance details, third parties cannot audit the model's safety, bias, or technical integrity.
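The category and total scores in this report are consistent with a plain sum of the ten subscores. The sketch below checks that arithmetic; the assumption that scores aggregate by simple addition is inferred from the numbers, not stated by the report, and the names are illustrative.

```python
# Subscores as listed in this report, each out of 10.
SUBSCORES = {
    "Upstream": {
        "Architectural Provenance": 3.0,
        "Dataset Composition": 2.0,
        "Tokenizer Integrity": 5.0,
    },
    "Model": {
        "Parameter Density": 1.0,
        "Training Compute": 0.0,
        "Benchmark Reproducibility": 3.0,
        "Identity Consistency": 9.0,
    },
    "Downstream": {
        "License Clarity": 4.0,
        "Hardware Footprint": 2.0,
        "Versioning Drift": 5.0,
    },
}


def category_totals(subscores: dict) -> dict:
    """Sum the subscores within each category (assumed aggregation rule)."""
    return {name: sum(items.values()) for name, items in subscores.items()}


totals = category_totals(SUBSCORES)
overall = sum(totals.values())
print(totals, overall)
```

Under that assumption, the totals come out to 10, 13, and 11, which matches the reported 34 / 100.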

Upstream

10.0 / 30

Architectural Provenance

3.0 / 10

OpenAI identifies GPT-5.4 mini as a 'distilled' variant of the flagship GPT-5.4 model but provides no technical documentation of the distillation process, architectural modifications, or transformer configuration. The model is described as a 'unified architecture' that merges reasoning and coding capabilities, yet nothing is public about its layers, attention mechanisms, or pre-training methodology beyond vague marketing descriptions of efficiency.

Dataset Composition

2.0 / 10

OpenAI provides no specific breakdown of the training data for GPT-5.4 mini. The documentation mentions only a knowledge cutoff of August 31, 2025, and claims the model is 'factually grounded', with 33% fewer false claims than GPT-5.2. It gives no ratio of web data, code, or books, and no details on filtering, cleaning, or the use of synthetic data during distillation.

Tokenizer Integrity

5.0 / 10

The model uses the 'o200k_base' tokenizer, with a vocabulary of approximately 200,000 tokens. The tokenizer's behavior can be observed via the API and third-party tools, but OpenAI has not released a formal technical paper documenting the specific BPE configuration or training alignment for the 5.4 series. Without official documentation for this version, its multilingual efficiency cannot be fully verified.
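Because the tokenizer can be observed from outside, multilingual efficiency can at least be measured empirically. The sketch below computes tokens per character for labeled text samples using any encoder callable. The function names and sample strings are assumptions, and the byte-level encoder is only a stand-in so the example runs without extra packages; it is not how o200k_base tokenizes.

```python
from typing import Callable, Dict, Sequence


def tokens_per_character(
    encode: Callable[[str], Sequence[int]],
    samples: Dict[str, str],
) -> Dict[str, float]:
    """Return tokens per character for each labeled sample (lower is more efficient)."""
    ratios = {}
    for label, text in samples.items():
        if not text:
            raise ValueError(f"sample {label!r} is empty")
        ratios[label] = len(encode(text)) / len(text)
    return ratios


def byte_encode(text: str) -> list:
    """Stand-in encoder: one token per UTF-8 byte (NOT the real tokenizer)."""
    return list(text.encode("utf-8"))


# Hypothetical samples. With the tiktoken package installed, one could pass
# tiktoken.get_encoding("o200k_base").encode instead of byte_encode.
ratios = tokens_per_character(byte_encode, {"en": "hello", "de": "grün"})
print(ratios)
```

Comparing ratios across languages on parallel text gives a rough, reproducible view of efficiency, though it says nothing about how the vocabulary was trained.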

Model

13.0 / 40

Parameter Density

1.0 / 10

The parameter count for GPT-5.4 mini is completely undisclosed. Although it is marketed as a 'small' and 'efficient' model that runs more than 2x faster than its predecessor, OpenAI provides no data on total or active parameters. It does not clarify whether the architecture is dense or uses sparse Mixture-of-Experts (MoE) techniques, so the model's parameter density cannot be verified.

Training Compute

0.0 / 10

OpenAI provides no information regarding the compute resources used to train or distill GPT-5.4 mini. There are no disclosures on GPU/TPU hours, hardware specifications, training duration, or carbon footprint. The company explicitly omits these details in official announcements, citing competitive reasons.

Benchmark Reproducibility

3.0 / 10

While OpenAI provides specific scores for benchmarks like SWE-Bench Pro (54.4%) and OSWorld-Verified (72.1%), the evaluation code, exact prompts, and few-shot examples are not public. Third-party analysis notes that comparisons are often made against older models (GPT-5.2) rather than the most recent predecessors, and the lack of a technical paper prevents independent verification of the claimed results.

Identity Consistency

9.0 / 10

The model consistently identifies itself as GPT-5.4 mini across API calls and system prompts. It maintains clear versioning (gpt-5.4-mini) and correctly identifies its capabilities, such as its 400K context window and multimodal support. There are no documented instances of the model claiming to be a competitor's product or misrepresenting its identity.

Downstream

11.0 / 30

License Clarity

4.0 / 10

The model is governed by a restrictive proprietary license. While the terms for API usage and commercial pricing ($0.75/M input, $4.50/M output) are clearly stated, the underlying weights and code are not accessible. The license is subject to OpenAI's standard Terms of Service, which can change without notice, and provides no transparency for derivative works or offline use.

Hardware Footprint

2.0 / 10

As a closed-source API model, there is no documentation on the VRAM or hardware requirements for local deployment. OpenAI provides latency and throughput metrics (e.g., ~180-190 tokens/sec), but these are service-side performance figures rather than hardware footprint data. There is no guidance on quantization tradeoffs or memory scaling for the 400K context window.
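The quoted throughput range can still be used for rough capacity planning. The sketch below converts an output length into an approximate generation time at the ~180-190 tokens/sec figures cited above. The function names are illustrative, and the estimate deliberately ignores time to first token, network latency, and queueing, none of which this card quantifies.

```python
def generation_seconds(output_tokens: int, tokens_per_sec: float) -> float:
    """Time to stream `output_tokens` at a steady decode rate."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return output_tokens / tokens_per_sec


def generation_window(output_tokens: int,
                      low_tps: float = 180.0,
                      high_tps: float = 190.0) -> tuple:
    """Return (fastest, slowest) seconds across the quoted throughput range."""
    fastest = generation_seconds(output_tokens, high_tps)
    slowest = generation_seconds(output_tokens, low_tps)
    return fastest, slowest


# Hypothetical completion of 1,900 output tokens.
fastest, slowest = generation_window(1_900)
print(f"{fastest:.1f}s to {slowest:.1f}s")
```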

Versioning Drift

5.0 / 10

OpenAI uses semantic-like versioning and provides 'snapshots' to allow developers to lock in specific model versions for consistency. However, the company has a history of 'silent updates' and performance drift that is not always captured in public changelogs. While version history is accessible via the API, the lack of detailed technical changelogs for architectural or weight updates limits transparency.
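One way to limit exposure to silent updates is to refuse unpinned model identifiers in production configuration. The sketch below assumes that dated snapshots carry a `-YYYY-MM-DD` suffix, as earlier OpenAI snapshot names have; this card does not list the snapshot identifiers for GPT-5.4 mini, so the dated name used in the example is hypothetical.

```python
import re

# Assumption: a pinned snapshot ends in "-YYYY-MM-DD"; a bare alias does not.
_SNAPSHOT_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def is_pinned(model_id: str) -> bool:
    """Return True if the identifier looks like a dated snapshot."""
    return bool(_SNAPSHOT_SUFFIX.search(model_id))


def require_pinned(model_id: str) -> str:
    """Return the identifier unchanged, or raise if it is a floating alias."""
    if not is_pinned(model_id):
        raise ValueError(f"{model_id!r} is not pinned to a dated snapshot")
    return model_id


print(is_pinned("gpt-5.4-mini"))             # floating alias
print(is_pinned("gpt-5.4-mini-2026-03-17"))  # hypothetical snapshot name
```

Pinning does not rule out drift on the provider side, but it makes any change in behavior attributable to an explicit identifier change on the caller's side.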