Haiku 4.5 is Anthropic’s most efficient model, designed to deliver 2025 flagship-class intelligence at a fraction of the latency and cost. With a first-token latency of 0.20 seconds and throughput exceeding 180 tokens/sec, it is the first choice for real-time UI agents, high-frequency content moderation, and enterprise workloads that require millions of requests per hour without performance degradation.

Documentation Index
Fetch the complete documentation index at: https://documentation.deepmask.io/llms.txt
Use this file to discover all available pages before exploring further.
About Haiku 4.5
Released in late 2025, Claude Haiku 4.5 brings enhanced computer use capabilities to the lightweight tier, making it practical not just for chat and classification but also for simple navigation and form-filling in web browsers and desktop environments. It achieves a 73.2% GPQA Diamond score and a 95% success rate on complex JSON schemas, which makes it reliable for structured tool-calling pipelines at high volume.

Key Capabilities
Instant Tool Calling
Optimized for high-speed function execution with a 95% success rate on complex JSON schemas — reliable enough for production API pipelines.
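As a rough illustration, here is a minimal sketch of a structured tool call using the Anthropic Python SDK. The model identifier, the `lookup_order` tool, and its schema are assumptions for the example, not values taken from this page; check the provider's model list before use.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# A single tool with a strict JSON input schema.
tools = [
    {
        "name": "lookup_order",  # hypothetical tool for illustration
        "description": "Fetch the status of a customer order by its ID.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "Internal order identifier"},
            },
            "required": ["order_id"],
        },
    }
]

response = client.messages.create(
    model="claude-haiku-4-5",  # assumed identifier; verify against the current model list
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Where is order A-1042?"}],
)

# The model emits tool_use blocks when it decides to call a tool.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)  # e.g. lookup_order {'order_id': 'A-1042'}
```

In a production pipeline, your code would execute the tool and return the result in a follow-up message so the model can compose its final answer.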
Computer Use (Light)
Handles simple navigation and form-filling within web browsers and desktop environments without requiring a heavier model.
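A hedged sketch of how a computer use request might be structured with the Anthropic Python SDK is shown below. The beta flag and tool version are the ones documented for earlier Claude models and may differ for Haiku 4.5; the model identifier is likewise an assumption.

```python
import anthropic

client = anthropic.Anthropic()

# Ask the model to plan a simple browser action. The response contains
# tool_use blocks (screenshots, clicks, key presses) that your own
# execution loop must carry out and feed back as tool results.
response = client.beta.messages.create(
    model="claude-haiku-4-5",            # assumed identifier
    max_tokens=1024,
    betas=["computer-use-2025-01-24"],   # beta flag documented for earlier Claude models
    tools=[
        {
            "type": "computer_20250124",  # tool version may differ for this model
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 800,
        }
    ],
    messages=[{"role": "user", "content": "Open the signup form and fill in the email field."}],
)

for block in response.content:
    if block.type == "tool_use":
        print(block.input)  # e.g. {'action': 'screenshot'} or a click with coordinates
```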
Massive Throughput
Sustains 180+ tokens/sec across enterprise workloads that require millions of requests per hour without performance degradation.
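At that volume, client-side concurrency control matters as much as model speed. Below is a minimal sketch of bounded fan-out with the async Anthropic SDK; the concurrency cap, model identifier, and `classify` helper are assumptions for illustration, and the right cap depends on your account's rate limits.

```python
import asyncio
import anthropic

client = anthropic.AsyncAnthropic()
semaphore = asyncio.Semaphore(50)  # assumed concurrency cap; tune to your rate limits

async def classify(text: str) -> str:
    # Bound in-flight requests so bursts stay within quota.
    async with semaphore:
        response = await client.messages.create(
            model="claude-haiku-4-5",  # assumed identifier
            max_tokens=16,
            messages=[{"role": "user", "content": f"Label the sentiment of: {text}"}],
        )
        return response.content[0].text

async def main() -> None:
    texts = ["great product", "arrived broken", "okay I guess"]
    labels = await asyncio.gather(*(classify(t) for t in texts))
    print(labels)

asyncio.run(main())
```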
Advanced Content Moderation
Context-aware safety filters reduce false positives by 40% compared to previous versions, making moderation pipelines more precise.
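One common way to wire such a filter into a pipeline is to request a machine-readable verdict. The following is a sketch under assumptions: the model identifier, the `moderate` helper, and the policy text are all illustrative, and a production system would validate the JSON rather than trust it blindly.

```python
import json
import anthropic

client = anthropic.Anthropic()

POLICY = "Flag harassment, hate speech, and spam. Allow criticism and profanity without targets."

def moderate(text: str) -> dict:
    # Ask for a strict JSON verdict so the result is machine-readable downstream.
    response = client.messages.create(
        model="claude-haiku-4-5",  # assumed identifier
        max_tokens=128,
        system=f"You are a content moderator. Policy: {POLICY} "
               'Reply with JSON only: {"allowed": bool, "category": str, "reason": str}',
        messages=[{"role": "user", "content": text}],
    )
    # Sketch only: real pipelines should handle malformed or non-JSON replies.
    return json.loads(response.content[0].text)

print(moderate("This movie was garbage and the director should feel bad."))
```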
Best For
Haiku 4.5 is the right choice when speed and cost efficiency matter more than maximum reasoning depth. It is ideal as the execution layer in multi-agent systems, acting as the fast “hands and eyes” for a larger orchestrating model such as Opus or Sonnet. For tasks that require deeper reasoning, complex document analysis, or autonomous multi-step workflows, Sonnet 4.5 or 4.6 is the appropriate step up.

Use Cases
- Customer support chatbots — Near-instant, high-quality responses for global user bases at scale.
- Large-scale data labeling — Classifying millions of records with high accuracy for research and model training pipelines.
- Sub-agent swarms — Acting as the fast execution layer for larger orchestrating models handling repetitive, low-level tasks; see the sketch after this list.
- Real-time content moderation — Context-aware filtering for user-generated content platforms requiring high throughput.
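The sub-agent pattern can be sketched in a few lines: a larger model plans, and Haiku workers execute each step. Everything here is an assumption for illustration; the orchestrator and worker model identifiers and the `plan`/`execute` helpers are not part of this page.

```python
import anthropic

client = anthropic.Anthropic()

def plan(goal: str) -> list[str]:
    # A larger model decomposes the goal into small, repetitive steps.
    response = client.messages.create(
        model="claude-sonnet-4-5",  # assumed orchestrator identifier
        max_tokens=512,
        messages=[{"role": "user", "content": f"List one short subtask per line for: {goal}"}],
    )
    return [line for line in response.content[0].text.splitlines() if line.strip()]

def execute(subtask: str) -> str:
    # Haiku handles each low-level step quickly and cheaply.
    response = client.messages.create(
        model="claude-haiku-4-5",  # assumed worker identifier
        max_tokens=256,
        messages=[{"role": "user", "content": subtask}],
    )
    return response.content[0].text

for task in plan("Summarize each of these three support tickets"):
    print(execute(task))
```

In practice the execute calls would run concurrently, as in the throughput sketch above, with results fed back to the orchestrator for review.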
Specifications
| Specification | Value |
|---|---|
| Provider | Anthropic |
| Context Window | 200K tokens |
| Reasoning | Adaptive (Standard/High) |
| GPQA Diamond | 73.2% |
| Latency (TTFT) | 0.20s |
| Throughput | 180+ tokens/sec |
| Key Use Cases | High-volume support, coding sub-agents, real-time data |