趋近智
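Parameters
-
Context Length
1,048,576 tokens
Modality
Multimodal
Architecture
Dense
License
Proprietary
Release Date
25 Sept 2025
Training Data Cutoff
Jan 2025
Attention Structure
Multi-Head Attention
Hidden Dimension Size
-
Number of Layers
-
Attention Heads
-
Key-Value Heads
-
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
Absolute Position Embedding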
Gemini 2.5 Flash Lite Max Thinking is a high-throughput multimodal reasoning model from Google DeepMind, designed to provide reasoning capability at a reduced computational cost. A variant in the Gemini 2.5 family, it includes a 'thinking' mode in which the model reasons and plans internally before generating its final response. This lets it handle multi-step tasks such as mathematical problem solving and code generation while keeping the low latency characteristic of the Flash Lite series.
The model is built on a sparse Mixture-of-Experts (MoE) architecture, which routes each token through a small subset of expert sub-networks instead of activating the full parameter set for every request. This is paired with a context window of roughly 1 million tokens (1,048,576), large enough to ingest extensive datasets, complete codebases, or long-form video, often without chunking or retrieval-augmented generation (RAG). The model natively supports text, image, audio, and video inputs and processes them within a single transformer framework.
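The routing idea behind a sparse MoE layer can be illustrated with a minimal top-k gating sketch. It is a toy, not Google's implementation: the internals of Gemini's router are not public, and the function names (`route_top_k`, `moe_forward`), the scalar "experts", and the choice of `k=2` are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of router scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(
    x: float,
    router_logits: Sequence[float],
    experts: Sequence[Callable[[float], float]],
    k: int = 2,
) -> float:
    """Evaluate only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


if __name__ == "__main__":
    # Four toy experts; a real layer would use feed-forward networks on vectors.
    toy_experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
    logits = [0.0, 2.0, 1.0, -1.0]  # hypothetical router scores for one token
    print(route_top_k(logits, k=2))
    print(moe_forward(3.0, logits, toy_experts, k=2))
```

Only the `k` selected experts are ever called, which is where the compute saving comes from: the cost per token scales with `k` rather than with the total number of experts.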
For deployment, the model exposes a configurable 'thinking budget' parameter that lets developers scale reasoning effort to each application's requirements. This suits high-volume production environments that must balance reasoning transparency against cost. Primary use cases include large-scale automated classification, real-time multilingual translation, and agentic workflows that require consistent instruction following and concise, accurate outputs.
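As a rough sketch of how a thinking budget might be set per request, the snippet below builds a JSON-style request body with a `thinkingBudget` field. The nesting under `generationConfig.thinkingConfig` reflects the Gemini API request shape as commonly documented, but treat it as an assumption and check the current API reference. The budget limits and the task-to-budget mapping are hypothetical values chosen only for illustration.

```python
from typing import Any, Dict

# Assumed limits for illustration only; consult the current API documentation.
ASSUMED_MIN_BUDGET = 512
ASSUMED_MAX_BUDGET = 24_576

# Hypothetical mapping from the use cases named above to reasoning effort.
TASK_BUDGETS = {
    "classification": 0,                  # no thinking: cheapest, lowest latency
    "translation": ASSUMED_MIN_BUDGET,    # a small amount of reasoning
    "agentic": ASSUMED_MAX_BUDGET,        # maximum reasoning effort
}


def clamp_budget(budget: int) -> int:
    """Treat zero or negative as 'thinking off' in this sketch; otherwise clamp."""
    if budget <= 0:
        return 0
    return max(ASSUMED_MIN_BUDGET, min(budget, ASSUMED_MAX_BUDGET))


def build_request(prompt: str, task: str) -> Dict[str, Any]:
    """Build a request body whose thinking budget depends on the task type."""
    budget = clamp_budget(TASK_BUDGETS.get(task, ASSUMED_MIN_BUDGET))
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"thinkingConfig": {"thinkingBudget": budget}},
    }


if __name__ == "__main__":
    print(build_request("Label this ticket as bug or feature.", "classification"))
    print(build_request("Plan and apply a multi-file refactor.", "agentic"))
```

Keeping the budget choice in one small table makes the cost and latency trade-off explicit and easy to tune per workload.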
The Gemini 2.5 family comprises Google's multimodal models with native understanding of text, images, audio, and video. The models offer context windows of up to 2.1M tokens, max thinking modes for complex reasoning, and variants optimized for different performance/cost tradeoffs. The family includes Pro, Flash, and Flash Lite variants, each with configurable thinking capabilities for transparent reasoning.
Rank
#89
| Category | Benchmark | Score | Rank |
|---|---|---|---|
| Data Analysis | LiveBench Data Analysis | 0.68 | 28 |
| Mathematics | LiveBench Mathematics | 0.65 | 34 |
| Coding | LiveBench Coding | 0.65 | 38 |
| Reasoning | LiveBench Reasoning | 0.36 | 39 |
| Agentic Coding | LiveBench Agentic | 0.02 | 42 |