Parameters
-
Context Length
200K
Modality
Multimodal
Architecture
Dense
License
Proprietary
Release Date
16 Apr 2026
Knowledge Cutoff
-
Attention
Attention Structure
Multi-Head Attention
Attention Heads
-
Key-Value Heads
-
Attention Head Dimension
-
Position Embedding
Absolute Position Embedding
RoPE Theta
-
Sliding Window Attention
-
Sliding Window Size
-
Normalization
-
Activation Function
-
Dimensions
Hidden Dimension Size
-
Number of Layers
-
FFN Intermediate Size (Dense)
-
Multi-Token Prediction Heads
-
Tokenizer
Vocabulary Size
-
Anthropic's most capable Claude 4.7 model, offering a notable improvement on Opus 4.6 in advanced software engineering and agentic tasks. Handles complex, long-running tasks with rigor and consistency, pays precise attention to instructions, and devises ways to verify its own outputs. Features substantially better vision (up to 2,576 pixels on the long edge — 3×+ the prior limit), improved multimodal understanding, and enhanced instruction following. Supports a new xhigh effort level between high and max for finer reasoning control. State-of-the-art on SWE-bench Verified and top performance across coding, finance, and document reasoning benchmarks. Available via API as claude-opus-4-7. Pricing: $5/M input tokens, $25/M output tokens.
Claude 4.7 is Anthropic's latest generation of flagship models, offering notable improvements in advanced software engineering, agentic tasks, and instruction following. The family features substantially better vision (up to 2,576 pixels on the long edge), improved multimodal understanding, and a new xhigh effort level for finer reasoning control.
Rank
#3
| Benchmark | Score | Rank |
|---|---|---|
Coding LiveBench Coding | 0.83 | 🥈 2 |
Web Development WebDev Arena | 1566 | 🥈 2 |
Reasoning LiveBench Reasoning | 0.88 | 🥉 3 |
Mathematics LiveBench Mathematics | 0.93 | 🥉 3 |
Data Analysis LiveBench Data Analysis | 0.78 | 🥉 3 |
Agentic Coding LiveBench Agentic | 0.62 | 8 |
Overall Rank
#3 🥉
Coding Rank
#1 🥇
Total Score
39
/ 100
Claude 4.7 Opus maintains a high degree of transparency regarding its identity and commercial licensing but remains almost entirely opaque concerning its internal architecture, training data, and compute resources. While the model provides detailed API documentation and performance benchmarks, the lack of reproducible evaluation harnesses and the shift toward non-standard sampling parameters limit independent verification of its capabilities.
Architectural Provenance
Anthropic identifies Claude 4.7 Opus as a direct upgrade to the 4.6 version, but provides no technical documentation regarding its underlying architecture beyond its classification as a 'dense' model in marketing materials. There is no disclosure of layer counts, attention mechanisms, or specific architectural modifications that enable its 'adaptive thinking' or improved vision. While the model is described as 'highly autonomous' with 'self-verification' capabilities, the technical implementation of these features remains proprietary and undocumented.
Dataset Composition
Information regarding the training data for Claude 4.7 Opus is extremely limited. Official documentation mentions a knowledge cutoff of January 2026 and states it was trained on a 'diverse' set of data including web, code, and documents. However, there is no public breakdown of dataset proportions, specific sources, or detailed filtering and cleaning methodologies. The use of 'Constitutional AI' is cited for alignment, but the specific datasets used for this process are not disclosed.
Tokenizer Integrity
Anthropic explicitly documents that Claude 4.7 Opus uses an updated tokenizer that is 1.0x to 1.35x less efficient than previous versions (mapping the same text to more tokens). While the tokenizer's behavior is documented via API behavior and token counting tools (e.g., /v1/messages/count_tokens), the full vocabulary and the training methodology for the tokenizer itself are not publicly available for independent audit or local implementation.
Parameter Density
The parameter count for Claude 4.7 Opus is entirely undisclosed. While third-party speculation suggests it is a large-scale dense model, Anthropic provides no official figures for total or active parameters. This lack of transparency makes it impossible to verify efficiency claims or compare its architectural scale to competitors in a meaningful, evidence-based way.
Training Compute
There is zero public information regarding the compute resources used to train Claude 4.7 Opus. Anthropic does not disclose GPU/TPU hours, hardware specifications, training duration, or the carbon footprint associated with the model's development. Claims of 'significant resources' are purely anecdotal and lack any verifiable data.
Benchmark Reproducibility
Anthropic provides scores for several standard benchmarks (SWE-bench Verified: 87.6%, GPQA Diamond: 94.2%) and introduces internal evaluations like 'Terminal-Bench 2.0'. However, the exact prompts, few-shot examples, and full evaluation harnesses required for independent reproduction are not publicly released. While a 232-page system card exists, it focuses more on safety and 'welfare' metrics than on the technical reproducibility of performance benchmarks.
Identity Consistency
The model consistently identifies itself as Claude 4.7 Opus and is transparent about its versioning within its system prompt and API responses. It accurately reflects its capabilities, such as its 1M token context window and high-resolution vision limits, and does not attempt to mimic the identity of models from other providers.
License Clarity
The model is clearly governed by a proprietary license. Anthropic provides detailed Commercial Terms of Service and API usage agreements that explicitly outline permitted uses, data retention policies (e.g., 30-day standard vs. 5-year for opt-in training), and restrictions on derivative works. While not open source, the legal boundaries for enterprise and developer use are well-defined.
Hardware Footprint
As a closed-weights API-only model, there is no official documentation on the hardware required to run the model locally. While Anthropic provides API rate limits and pricing ($5/$25 per M tokens), it offers no guidance on VRAM or compute requirements for the underlying architecture, leaving developers entirely dependent on Anthropic's managed infrastructure.
Versioning Drift
Anthropic uses a versioned API ID (claude-opus-4-7), but the model has already shown significant behavioral drift compared to its predecessor, such as the removal of standard sampling parameters (temperature, top_p) and the introduction of 'adaptive thinking' which changes output patterns. While these changes are noted in migration guides, there is no public changelog for sub-version weight updates or a mechanism to pin specific model snapshots to prevent silent performance degradation.
APX AI
Online