趋近智
参数
-
上下文长度
200K
模态
Text
架构
Dense
许可证
Proprietary
发布日期
15 Jan 2025
训练数据截止日期
Jan 2025
注意力结构
Multi-Head Attention
隐藏维度大小
-
层数
-
注意力头
-
键值头
-
激活函数
-
归一化
-
位置嵌入
Absolute Position Embedding
Claude 4 Sonnet is a production-oriented large language model that implements a hybrid reasoning framework, designed to optimize the trade-off between execution speed and logical depth. The model's architecture facilitates two distinct processing states: a standard mode for near-instantaneous response generation and an extended thinking mode that utilizes a configurable token budget for internal, step-by-step chain-of-thought processing. This dual-state capability allows for more sophisticated problem-solving in complex domains like software engineering and mathematics, where the model can systematically verify its logic before committing to a final output.
Technically, the model integrates advanced attention mechanisms and rotary positional encodings to support an expansive context window, enabling the processing of high-density information such as entire software repositories or legal corpora. The architecture is built on a dense transformer foundation, utilizing multi-head attention (MHA) and absolute position embeddings to maintain high precision across its operational range. Developers can programmatically control the model's reasoning intensity through specialized API parameters, effectively tuning the latent computational effort allocated to specific requests.
Optimized for reliability in agentic workflows, Claude 4 Sonnet features enhanced instruction-following and improved memory persistence, which reduces context degradation during long-horizon tasks. Its multimodal capabilities allow for the simultaneous processing of text and image inputs, supporting use cases from automated visual inspection to complex document analysis. The model is deployed as a proprietary foundation model, ensuring consistent performance and security standards suitable for enterprise-grade applications and high-throughput production environments.
Anthropic's fourth generation Claude models with advanced reasoning, extended context windows up to 200K tokens, and configurable thinking effort levels. Features improved safety alignment, nuanced understanding, and sophisticated task completion. Includes Opus (most capable), Sonnet (balanced), and Haiku (fast) variants, with thinking modes that enable transparent chain-of-thought reasoning for complex problems.
排名
#72
| 基准 | 分数 | 排名 |
|---|---|---|
StackEval ProLLM Stack Eval | 0.98 | 🥈 2 |
QA Assistant ProLLM QA Assistant | 0.96 | 🥉 3 |
Graduate-Level QA GPQA | 0.8 | 17 |
Agentic Coding LiveBench Agentic | 0.38 | 22 |
Reasoning LiveBench Reasoning | 0.40 | 36 |
Data Analysis LiveBench Data Analysis | 0.65 | 38 |
Mathematics LiveBench Mathematics | 0.60 | 39 |