Claude Haiku 4.5 Thinking

Closed Source

Closed Weights

Parameters

Context Length

200K

Modality

Text

Architecture

Dense

License

Proprietary

Release Date

1 Oct 2025

Knowledge Cutoff

Feb 2025

Evaluation Benchmarks

Rank

#65

Benchmark	Score	Rank
Data Analysis LiveBench Data Analysis	0.59	23
Mathematics LiveBench Mathematics	0.78	26
Agentic Coding LiveBench Agentic	0.42	29
Coding LiveBench Coding	0.73	30
Reasoning LiveBench Reasoning	0.62	35
Professional Knowledge MMLU Pro	0.79	43

Rankings

Overall Rank

#65

Coding Rank

#64

About Claude Haiku 4.5 Thinking

Claude Haiku 4.5 Thinking is a high-efficiency large language model developed by Anthropic, engineered to provide near-frontier intelligence with the low-latency profile characteristic of the Haiku model family. As a hybrid reasoning model, it incorporates an optional extended thinking mode that allows the system to engage in multi-step internal reasoning before emitting a final response. This architectural design balances the computational demands of complex problem-solving with the speed required for real-time production environments.

Technically, the model features significant advancements in context management and output capacity, supporting a 200,000-token context window and generating up to 64,000 tokens in a single response. It is designed with explicit context awareness, enabling the system to track its own token usage and adjust its reasoning persistence based on the remaining context budget. This capability is specifically tuned to mitigate agentic laziness and improve performance in long-horizon tasks such as codebase refactoring and autonomous system orchestration.

The model's utility is centered on high-throughput, cost-sensitive applications including customer service automation, real-time pair programming, and multi-agent systems where parallel execution is required. By integrating native vision capabilities alongside text processing, it supports multimodal workflows like document analysis and UI testing. Its training emphasis on coding and tool use allows it to act as a responsive engine for developer tools, offering a refined balance of speed and analytical depth at a significantly lower operational cost than flagship variants.

Technical Specifications

Attention

Attention Structure

Multi-Head Attention

Attention Heads

Key-Value Heads

Attention Head Dimension

Position Embedding

Absolute Position Embedding

RoPE Theta

Sliding Window Attention

Sliding Window Size

Sliding Window Ratio

Linear Attention

Linear Attention Ratio

Normalization

RMS Normalization

Activation Function

Dimensions

Hidden Dimension Size

Number of Layers

FFN Intermediate Size (Dense)

Multi-Token Prediction Heads

Tokenizer

Vocabulary Size

Model Integrity

Total Score

33 / 100

Upstream

10.0 / 30

Model

14.0 / 40

Downstream

9.0 / 30

Claude Haiku 4.5 Thinking Model Integrity Report

Total Score

/ 100

Audit Note

Claude Haiku 4.5 exhibits a transparency profile typical of proprietary frontier models, characterized by strong identity consistency and clear API documentation but severe opacity regarding its internal mechanics. Critical gaps exist in the disclosure of training compute, parameter density, and specific dataset composition, making independent verification of its efficiency and provenance impossible. While its functional capabilities are well-documented for end-users, the underlying technical 'black box' remains largely inaccessible to the research community.

Upstream

10.0 / 30

Architectural Provenance

3.0 / 10

Anthropic identifies Claude Haiku 4.5 as a 'hybrid reasoning model' with an 'extended thinking' mode, but provides no technical details regarding the underlying architecture beyond its classification as a transformer-based model. While the system card mentions 'RMS Normalization' and 'Absolute Position Embedding,' it lacks a comprehensive architectural paper or specific details on how the reasoning mechanism is integrated into the dense architecture. The training methodology is described in vague terms such as 'substantial posttraining and finetuning' using RLHF and RLAIF without disclosing specific algorithms or architectural modifications.

Dataset Composition

2.0 / 10

The training data is described as a 'proprietary mix' of public internet data (up to February 2025), non-public third-party data, and synthetic data generated internally. No specific breakdown of dataset proportions (e.g., percentage of code vs. web data) is provided. While the system card mentions the use of a general-purpose web crawler and adherence to robots.txt, it does not disclose specific sources, dataset sizes, or detailed filtering criteria, relying on vague terms like 'due diligence' and 'industry-standard practices.'

Tokenizer Integrity

5.0 / 10

The model uses a tokenizer that is accessible via the Anthropic API and supported by third-party libraries like tiktoken (cl100k_base or similar Claude-specific variants), but official documentation for the specific 4.5 family tokenizer's vocabulary size and training alignment is not explicitly detailed in the system card. While token usage is trackable via API responses, the technical specifications of the tokenizer itself remain largely opaque compared to open-source alternatives.

Model

14.0 / 40

Parameter Density

1.0 / 10

Anthropic does not disclose the parameter count for Claude Haiku 4.5. Third-party speculation suggests a size around 21 billion parameters, but this is unverified. There is no official information regarding active vs. total parameters, nor a breakdown of the model's internal density (attention vs. FFN layers). The lack of disclosure is a significant transparency gap for a model marketed on its efficiency.

Training Compute

0.0 / 10

There is zero public information regarding the compute resources used to train Claude Haiku 4.5. No GPU/TPU hours, hardware specifications, training duration, or carbon footprint data have been disclosed by Anthropic. The company explicitly omits these details from its system cards and technical documentation.

Benchmark Reproducibility

4.0 / 10

Anthropic provides high-level benchmark results (e.g., 73.3% on SWE-bench Verified) and mentions some methodology in footnotes, such as the use of a 'simple scaffold' and specific thinking budgets (128K). However, the exact evaluation code, full prompt sets, and few-shot examples required for precise third-party reproduction are not publicly available. Results are primarily self-reported with limited independent verification of the exact configurations used.

Identity Consistency

9.0 / 10

The model consistently identifies itself as Claude and correctly references its versioning (20251001) in API responses. It demonstrates high awareness of its capabilities, including its 'extended thinking' mode and context window limits. There are no documented instances of the model claiming to be a competitor's product or misrepresenting its nature as an AI.

Downstream

9.0 / 30

License Clarity

2.0 / 10

The model is released under a strictly proprietary license. While the commercial terms for API usage are clearly defined in Anthropic's standard terms of service, there is no public access to the model weights or source code. The 'openness' of the model is restricted to API-based interaction, with no transparency regarding derivative works or redistribution of the underlying technology.

Hardware Footprint

2.0 / 10

As a closed-weights API-only model, there is no documentation on the VRAM or hardware requirements for local deployment. While Anthropic provides information on context window scaling (200K tokens) and output limits (64K tokens), this is only relevant for API consumption. No information is provided regarding quantization trade-offs or the actual computational cost of running the model on inference hardware.

Versioning Drift

5.0 / 10

Anthropic uses date-based versioning (e.g., claude-haiku-4-5-20251001) and maintains a basic changelog for its developer platform. However, the company has a history of retiring models (e.g., Haiku 3.5) with relatively short notice, and there is limited public documentation regarding internal model updates that may cause behavioral drift between major version releases.

Resources

Official Documentation Release Notes

About Claude 4.5

Enhanced Claude models with further improvements in reasoning, coding, and agentic capabilities. Features advanced thinking modes with adjustable effort levels (high, medium, standard) for optimal performance-latency tradeoffs. Excels at complex analysis, software development, web development, and long-context understanding. Includes thinking variants that expose reasoning process for improved transparency.

Claude Haiku 4.5 Thinking

Evaluation Benchmarks

Rankings

About Claude Haiku 4.5 Thinking

Technical Specifications

Model Integrity

Claude Haiku 4.5 Thinking Model Integrity Report

Audit Note

Upstream

Model

Downstream

Resources

About Claude 4.5

Other Claude 4.5 Models