About Sonnet 4.5 and Sonnet 4.6
Sonnet 4.5 is widely considered one of the most balanced models for professional engineering. Built to handle “long-horizon” tasks, it can work autonomously for 30+ hours on a single coding objective without losing coherence. It was the first model to reach 61.4% on the OSWorld benchmark for real-world computer use, and its 1M token context window, combined with a GPQA Diamond score of 83.4%, makes it well suited to research agents and full-stack engineering.
Sonnet 4.6 (released February 17, 2026) is the current default model on Claude.ai. It delivers Opus-tier performance for “economically valuable office tasks” at Sonnet-tier cost, with significant advances in computer-use capabilities (OSWorld benchmark), instruction-following consistency, and up to 90% cost savings on high-volume tasks through prompt caching and batch processing. Its GPQA Diamond score reaches 84.4%.
Key Capabilities
Computer Use (Native)
Sees screens, moves cursors, and types in standard desktop applications — enabling true browser-based and GUI automation without external tools.
30-Hour Autonomous Coding
Sonnet 4.5 can manage multi-day engineering sprints with self-correction and testing, maintaining coherence across the full session.
Advanced Instruction Following
Sonnet 4.6 delivers significant improvements in consistency and nuance, making it a reliable choice for agent-in-the-loop systems.
Cost-Optimized at Scale
Sonnet 4.6’s prompt caching and batch processing discounts enable up to 90% cost savings versus standard API calls for high-volume workloads.
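
How caching and batch discounts translate into savings can be sketched with simple arithmetic. The snippet below is only an illustration under stated assumptions, not official pricing: the base price, the cache-read multiplier (0.1x), and the batch discount are placeholder values to check against the current pricing page, cache-write surcharges are ignored, and `estimate_input_cost` and `savings_fraction` are hypothetical helper names.

```python
def estimate_input_cost(
    total_tokens: int,
    cached_fraction: float,
    base_price_per_mtok: float,
    cache_read_multiplier: float = 0.1,  # assumption: cached reads billed at 10% of base
    batch_discount: float = 0.0,         # assumption: e.g. 0.5 for batch processing
) -> float:
    """Estimate input cost in dollars for a workload with a partly cached prompt.

    Cache-write surcharges are deliberately ignored to keep the sketch small.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    if not 0.0 <= batch_discount < 1.0:
        raise ValueError("batch_discount must be in [0, 1)")

    cached_tokens = total_tokens * cached_fraction
    uncached_tokens = total_tokens - cached_tokens

    # Cached tokens are billed at a reduced rate; uncached at the full rate.
    billable_tokens = uncached_tokens + cached_tokens * cache_read_multiplier
    cost = billable_tokens / 1_000_000 * base_price_per_mtok

    # The batch discount, if any, is applied on top (assumed to stack).
    return cost * (1.0 - batch_discount)


def savings_fraction(baseline_cost: float, discounted_cost: float) -> float:
    """Return the fraction saved relative to the baseline cost."""
    return 1.0 - discounted_cost / baseline_cost


if __name__ == "__main__":
    base_price = 3.0  # hypothetical dollars per million input tokens
    baseline = estimate_input_cost(1_000_000, 0.0, base_price)
    fully_cached = estimate_input_cost(1_000_000, 1.0, base_price)
    print(f"savings with a fully cached prompt: {savings_fraction(baseline, fully_cached):.0%}")
```

Under these assumptions the 90% figure corresponds to the best case, where the whole prompt is served from cache. A partly cached prompt saves proportionally less.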
Best For
Choose Sonnet 4.5 for autonomous software engineering tasks, complex multi-app research workflows, and legal or financial forensics where massive document sets need sustained coherent analysis. Choose Sonnet 4.6 for enterprise-grade agents, full-stack development lifecycle management, and browser-based automation where Opus-level accuracy is needed at Sonnet cost. For tasks with lower complexity requirements, Haiku 4.5 offers faster throughput. For the most demanding reasoning challenges, Opus 4.6 sets the ceiling.
Use Cases
- Autonomous software engineering — Building, testing, and deploying full-stack features from a single prompt.
- Complex multi-app workflows — Researching data in a browser and then populating a local Excel sheet and PowerPoint.
- Legal and financial forensics — Analyzing massive document sets for subtle logical contradictions.
- Enterprise workflow automation — Analyzing financial data, synthesizing internal insights, and generating professional content.
- Browser-based agents — Automating procurement, competitive analysis, and customer onboarding via digital interaction.
Specifications
| Specification | Sonnet 4.5 | Sonnet 4.6 |
|---|---|---|
| Provider | Anthropic | Anthropic |
| Context Window | 1.0M tokens | 1.0M tokens |
| Reasoning | Adaptive (Standard/High) | Adaptive (Standard/High) |
| GPQA Diamond | 83.4% | 84.4% |
| Latency (TTFT) | 0.42s | 0.42s |
| Throughput | 38 tokens/sec | 38 tokens/sec |
| Key use cases | Full-stack engineering, research agents, system design | Enterprise agents, UI automation, professional coding |
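
The latency and throughput rows can be combined into a rough response-time estimate: time to first token plus output length divided by throughput. This is a minimal sketch that assumes generation proceeds at a constant rate, whereas real-world figures vary with load and prompt size. `estimate_response_seconds` is a hypothetical helper, and its defaults are simply the table values.

```python
def estimate_response_seconds(
    output_tokens: int,
    ttft_seconds: float = 0.42,      # time to first token, from the table
    tokens_per_second: float = 38.0, # throughput, from the table
) -> float:
    """Rough wall-clock estimate for generating `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (100, 380, 2000):
        print(f"{n} tokens: about {estimate_response_seconds(n):.1f} s")
```

Under these assumptions a 380-token reply takes roughly 10.4 seconds, so for long outputs throughput matters far more than time to first token.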