| Specification | Value |
|---|---|
| Parameters | - |
| Context Length | 128K |
| Modality | Multimodal |
| Architecture | Dense |
| License | Proprietary |
| Release Date | 13 May 2024 |
| Knowledge Cutoff | - |
| Attention Structure | Multi-Head Attention |
| Hidden Dimension Size | - |
| Number of Layers | - |
| Attention Heads | - |
| Key-Value Heads | - |
| Activation Function | - |
| Normalization | - |
| Position Embedding | Absolute Position Embedding |

GPT-4o ("o" for "omni") is OpenAI's flagship multimodal model, combining text, vision, and audio understanding in a unified architecture. It offers real-time responsiveness and strong performance across diverse tasks, including reasoning, coding, multilingual understanding, and creative writing. It provides a 128K-token context window with efficient token usage. As an evolution of the GPT-4 series, it improves efficiency and broadens modality support, integrating the modalities for more natural human-computer interaction.
| Category | Benchmark | Score | Rank |
|---|---|---|---|
| Refactoring | Aider Refactoring | 0.63 | 🥇 1 |
| General Knowledge | MMLU | 0.89 | 4 |
| QA Assistant | ProLLM QA Assistant | 0.96 | 5 |
| StackEval | ProLLM Stack Eval | 0.96 | 7 |
| Graduate-Level QA | GPQA | 0.84 | 14 |
| Summarization | ProLLM Summarization | 0.75 | 15 |
| Coding | Aider Coding | 0.45 | 23 |
| Professional Knowledge | MMLU Pro | 0.74 | 42 |

| Ranking | Position |
|---|---|
| Overall Rank | #38 |
| Coding Rank | #70 |