Parameters: -
Context Length: -
Modality: Multimodal
Architecture: Dense
License: Proprietary
Release Date: 19 May 2026
Knowledge Cutoff: -

Attention
Attention Structure: Multi-Head Attention
Attention Heads: -
Key-Value Heads: -
Attention Head Dimension: -
Position Embedding: Absolute Position Embedding
RoPE Theta: -
Sliding Window Attention: -
Sliding Window Size: -
Normalization: -
Activation Function: -

Dimensions
Hidden Dimension Size: -
Number of Layers: -
FFN Intermediate Size (Dense): -
Multi-Token Prediction Heads: -

Tokenizer
Vocabulary Size: -
Gemini Omni Flash is the first model in Google's new Omni family, released at Google I/O on May 19, 2026. It is a native video-generation model that accepts any combination of text, images, audio, and video as input and produces high-quality video output grounded in Gemini's real-world knowledge. It enables conversational video editing across multiple turns while maintaining character consistency, physics, and scene continuity, and supports Avatars for personalized video creation. It has been rolled out to Google AI Plus, Pro, and Ultra subscribers globally through the Gemini app and Google Flow.
The Gemini Omni family is Google's first generation of native video-generation models, combining Gemini's multimodal reasoning with the ability to create from any input. Announced at Google I/O 2026, Omni models accept combinations of text, images, audio, and video, allowing users to generate and conversationally edit high-quality videos grounded in Gemini's real-world knowledge.
No evaluation benchmarks are available for Gemini Omni Flash.
Overall Rank: -
Coding Rank: -