趋近智
参数
-
上下文长度
200K
模态
Text
架构
Dense
许可证
Proprietary
发布日期
29 Sept 2025
训练数据截止日期
Jul 2025
注意力结构
Multi-Head Attention
隐藏维度大小
-
层数
-
注意力头
-
键值头
-
激活函数
-
归一化
-
位置嵌入
Absolute Position Embedding
Claude Sonnet 4.5 Thinking is a frontier-class hybrid reasoning model developed by Anthropic, engineered to provide a sophisticated balance between low-latency execution and high-fidelity cognitive processing. The model architecture introduces a dual-mode inference framework, allowing users to select between a standard response path and an extended thinking mode. In the latter, the model utilizes an internal scratchpad to perform multi-step planning, reflection, and self-correction before generating a final output. This transparent reasoning process is exposed to the user as a visible thought block, facilitating a more explainable and verifiable interaction for complex technical tasks.
Technically, the model is built upon an advanced transformer-based architecture optimized for agentic autonomy and long-horizon execution. It supports a standardized 200,000-token context window, with beta support for up to 1 million tokens, specifically designed to handle massive codebases and extensive document sets. Innovations in parallel tool execution and an improved attention mechanism enable the model to manage complex computer-use tasks, such as navigating file systems, executing shell commands, and coordinating multi-part software projects autonomously for periods exceeding 30 hours.
The system is primarily utilized in high-stakes environments where precision and sustained focus are mandatory. Its design excels in production-level software engineering, rigorous financial analysis, and the orchestration of autonomous agents. By integrating advanced memory management and checkpointing capabilities, the model allows for iterative development workflows where progress can be saved and referenced across long-duration sessions. This makes it a primary choice for developers building persistent AI agents that require both deep technical knowledge and the ability to reason through ambiguous, multi-step instructions.
Enhanced Claude models with further improvements in reasoning, coding, and agentic capabilities. Features advanced thinking modes with adjustable effort levels (high, medium, standard) for optimal performance-latency tradeoffs. Excels at complex analysis, software development, web development, and long-context understanding. Includes thinking variants that expose reasoning process for improved transparency.
排名
#14
| 基准 | 分数 | 排名 |
|---|---|---|
Coding LiveBench Coding | 0.80 | 🥉 3 |
Agentic Coding LiveBench Agentic | 0.53 | 5 |
StackEval ProLLM Stack Eval | 0.97 | 5 |
Web Development WebDev Arena | 1450 | ⭐ 7 |
Reasoning LiveBench Reasoning | 0.78 | 8 |
Data Analysis LiveBench Data Analysis | 0.72 | 15 |
Mathematics LiveBench Mathematics | 0.79 | 17 |