The Sonnet family from Anthropic strikes the most practical balance between intelligence and throughput for professional software engineering and enterprise agentic workflows. Sonnet 4.5 pioneered long-horizon autonomous coding and native computer use; Sonnet 4.6 extends those capabilities with Opus-class performance at Sonnet-class cost, making it the current default for demanding production deployments.
About Sonnet 4.5 and Sonnet 4.6
Sonnet 4.5 is widely regarded as one of the most balanced models for professional engineering. Built specifically to handle “long-horizon” tasks, it can work autonomously for 30+ hours on a single coding objective without losing coherence. It was the first model to score 61.4% on the OSWorld benchmark for real-world computer use, and its 1M token context window, combined with a GPQA Diamond score of 83.4%, makes it well suited to research agents and full-stack engineering.
Sonnet 4.6 (released February 17, 2026) is the current default model on Claude.ai. It delivers Opus-tier performance for “economically valuable office tasks” at Sonnet-tier cost, with significant advances in computer use (measured on the OSWorld benchmark) and in instruction-following consistency. Prompt caching and batch processing bring up to 90% cost savings on high-volume tasks. Its GPQA Diamond score reaches 84.4%.
Key Capabilities
Computer Use (Native)
Sees screens, moves cursors, and types in standard desktop applications — enabling true browser-based and GUI automation without external tools.
30-Hour Autonomous Coding
Sonnet 4.5 can manage multi-day engineering sprints with self-correction and testing, maintaining coherence across the full session.
Advanced Instruction Following
Sonnet 4.6 delivers significant improvements in consistency and nuance, making it a reliable choice for agent-in-the-loop systems.
Cost-Optimized at Scale
Sonnet 4.6’s prompt caching and batch processing discounts enable up to 90% cost savings versus standard API calls for high-volume workloads.
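The "up to 90%" figure is easiest to read as arithmetic. The sketch below is illustrative only. The multipliers (cached input billed at 10% of the base rate, batch requests at 50%) and the idea that the two discounts stack are assumptions not stated on this page, so check them against current pricing. `DiscountAssumptions` and `savings_fraction` are hypothetical names, and the calculation ignores cache-write surcharges and output tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscountAssumptions:
    # Assumed multipliers, not taken from this page: verify against current pricing.
    cache_read_multiplier: float = 0.10  # cached input tokens billed at 10% of base
    batch_multiplier: float = 0.50       # batch requests billed at 50% of base


def savings_fraction(
    cached_fraction: float,
    use_batch: bool,
    d: DiscountAssumptions = DiscountAssumptions(),
) -> float:
    """Return the fraction of input cost saved versus an uncached, non-batch call."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    # Blend the full-price share of the prompt with the discounted cached share.
    relative_cost = (1.0 - cached_fraction) + cached_fraction * d.cache_read_multiplier
    if use_batch:
        # Assumption: the batch discount stacks multiplicatively with caching.
        relative_cost *= d.batch_multiplier
    return 1.0 - relative_cost


if __name__ == "__main__":
    for cached, batch in [(0.0, False), (0.8, False), (1.0, False), (0.8, True)]:
        saved = savings_fraction(cached, batch)
        print(f"cached={cached:.0%} batch={batch}: saves {saved:.0%}")
```

Under these assumptions, a fully cached prompt saves 90% of input cost. A prompt that is 80% cached saves 72%, or 86% when it is also sent through batch processing.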
Best For
Choose Sonnet 4.5 for autonomous software engineering tasks, complex multi-app research workflows, and legal or financial forensics where massive document sets need sustained coherent analysis. Choose Sonnet 4.6 for enterprise-grade agents, full-stack development lifecycle management, and browser-based automation where Opus-level accuracy is needed at Sonnet cost. For tasks with lower complexity requirements, Haiku 4.5 offers faster throughput. For the most demanding reasoning challenges, Opus 4.6 sets the ceiling.
Use Cases
- Autonomous software engineering — Building, testing, and deploying full-stack features from a single prompt.
- Complex multi-app workflows — Researching data in a browser and then populating a local Excel sheet and PowerPoint.
- Legal and financial forensics — Analyzing massive document sets for subtle logical contradictions.
- Enterprise workflow automation — Analyzing financial data, synthesizing internal insights, and generating professional content.
- Browser-based agents — Automating procurement, competitive analysis, and customer onboarding via digital interaction.
Specifications
| Specification | Sonnet 4.5 | Sonnet 4.6 |
|---|---|---|
| Provider | Anthropic | Anthropic |
| Context Window | 1.0M tokens | 1.0M tokens |
| Reasoning | Adaptive (Standard/High) | Adaptive (Standard/High) |
| GPQA Diamond | 83.4% | 84.4% |
| Latency (TTFT) | 0.42s | 0.42s |
| Throughput | 38 tokens/sec | 38 tokens/sec |
| Key use cases | Full-stack engineering, research agents, system design | Enterprise agents, UI automation, professional coding |
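
The latency and throughput rows can be combined into a rough response-time estimate: time to first token plus output length divided by streaming rate. This is a back-of-the-envelope sketch, assuming the table's figures hold steady for the whole response and ignoring network overhead and reasoning time. `estimated_response_seconds` is a hypothetical helper, not part of any SDK.

```python
TTFT_SECONDS = 0.42       # "Latency (TTFT)" row of the Specifications table
TOKENS_PER_SECOND = 38.0  # "Throughput" row of the Specifications table


def estimated_response_seconds(
    output_tokens: int,
    ttft: float = TTFT_SECONDS,
    throughput: float = TOKENS_PER_SECOND,
) -> float:
    """Estimate wall-clock seconds to stream `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    # Wait for the first token, then stream the rest at a constant rate.
    return ttft + output_tokens / throughput


if __name__ == "__main__":
    for n in (100, 380, 2000):
        print(f"{n} output tokens: about {estimated_response_seconds(n):.1f} s")
```

With the table's figures, a 380-token reply works out to roughly 10.4 seconds, so long generations are dominated by throughput, not by time to first token.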