趋近智
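| Specification | Value |
|---|---|
| Parameters | - |
| Context length | 1,048,576 tokens |
| Modality | Multimodal |
| Architecture | Dense |
| License | Proprietary |
| Release date | 5 Jun 2025 |
| Training data cutoff | Jan 2025 |
| Attention structure | Multi-Head Attention |
| Hidden dimension size | - |
| Number of layers | - |
| Attention heads | - |
| Key-value heads | - |
| Activation function | SwiGLU |
| Normalization | RMS Normalization |
| Position embedding | Absolute Position Embedding |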
Gemini 2.5 Flash Max Thinking is a high-efficiency reasoning model developed by Google, designed to bridge the gap between low-latency inference and complex logical deduction. Built upon a sparse mixture-of-experts (MoE) architecture, this model variant utilizes a dynamic routing mechanism that activates only a subset of its total parameters for each input token. This architectural choice allows the model to maintain the rapid response times characteristic of the Flash family while supporting a maximum thinking budget that facilitates extended chains of reasoning for difficult mathematical and coding tasks.
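The per-token routing described above can be illustrated with a generic top-k gating sketch. Google has not published the routing details of this model, so everything here is an assumption for illustration: the function names (`route_token`, `moe_layer`), the choice of `k = 2`, the softmax over the selected logits, and the toy experts are all hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_logits, k=2):
    """Pick the k highest-scoring experts and normalize their weights.

    Generic top-k gating, not the model's actual (undisclosed) router.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_layer(token, router_logits, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    out = [0.0] * len(token)
    for idx, weight in route_token(router_logits, k):
        expert_out = experts[idx](token)  # unselected experts are never evaluated
        out = [o + weight * v for o, v in zip(out, expert_out)]
    return out

# Four toy experts; each simply scales its input by a different factor.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]

selected = route_token([0.1, 2.0, -1.0, 2.0], k=2)
mixed = moe_layer([1.0, 2.0], [0.1, 2.0, -1.0, 2.0], experts, k=2)
```

With four experts and `k = 2`, only half of the expert parameters take part in computing each token, which is the source of the latency savings that sparse designs aim for.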
Before producing its final output, the model runs a dedicated "thinking" phase in which it generates internal reasoning tokens. A controllable thinking budget parameter governs this phase, and developers can tune it to trade computational cost against output quality. The model is natively multimodal and can process interleaved sequences of text, images, audio, and video within its 1,048,576-token context window. Its transformer blocks are also described as incorporating training-stability and signal-propagation optimizations intended to keep performance consistent across input modalities and long-context dependencies.
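As a rough sketch of how a thinking budget might be set, the snippet below builds a JSON request body without sending it. It assumes the field names of the public Gemini REST API (`generationConfig.thinkingConfig.thinkingBudget` and `includeThoughts`). The helper `build_request` and the cap `ASSUMED_MAX_THINKING_BUDGET` are hypothetical and should be checked against the current API documentation.

```python
import json

# Assumed upper bound on the budget; verify against the current API docs.
ASSUMED_MAX_THINKING_BUDGET = 24576

def build_request(prompt, thinking_budget=ASSUMED_MAX_THINKING_BUDGET,
                  include_thoughts=False):
    """Build a generateContent-style request body with a clamped budget.

    Hypothetical helper: it only assembles a dict and performs no network call.
    """
    budget = max(0, min(thinking_budget, ASSUMED_MAX_THINKING_BUDGET))
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {
                "thinkingBudget": budget,            # reasoning-token allowance
                "includeThoughts": include_thoughts,  # request thought summaries
            }
        },
    }

payload = build_request("Prove that the sum of two odd integers is even.",
                        thinking_budget=8192)
body = json.dumps(payload)  # serialized body, ready to be sent by an HTTP client
```

A smaller budget lowers cost and latency, while a budget near the cap corresponds to the "max thinking" behavior this variant is named for.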
The Max Thinking variant is suited to agentic workflows in which intermediate reasoning steps must be transparent, or in which task complexity exceeds what standard fast-inference models can handle. Letting the model spend more reasoning tokens on a problem scales its reasoning capability at runtime. Use cases include codebase analysis, data extraction from long-form documents, and multi-step scientific problem solving, at lower cost than the larger Pro-tier models in the Gemini 2.5 family.
The Gemini 2.5 family comprises Google's advanced multimodal models with native understanding of text, images, audio, and video. It features context windows of up to 2.1M tokens, max thinking modes for complex reasoning, and variants optimized for different performance/cost tradeoffs. The family includes Pro, Flash, and Flash Lite variants with configurable thinking capabilities for transparent reasoning.
Rank: #76
| Category | Benchmark | Score | Rank |
|---|---|---|---|
| Reasoning | LiveBench Reasoning | 0.45 | 27 |
| Mathematics | LiveBench Mathematics | 0.69 | 27 |
| Agentic Coding | LiveBench Agentic | 0.17 | 34 |
| Coding | LiveBench Coding | 0.66 | 36 |