Claude Sonnet 4.6 Released, Sometimes Beating Opus 4.5 in Coding
Contents
Anthropic released Claude Sonnet 4.6 on February 17, 2026. It continues the roughly four-month update cadence and improves across coding, computer use, and long-context reasoning.
The Claude model lineup
Looking back at the last six months of Claude releases makes Anthropic’s strategy fairly clear.
| Time | Model | Positioning |
|---|---|---|
| November 2025 | Opus 4.5 | Highest-intelligence model |
| October 2025 | Sonnet 4.5 | Cost-effective flagship |
| February 2026 | Sonnet 4.6 | Successor to Sonnet 4.5, sometimes surpassing Opus |
| February 2026 | Opus 4.6 | Frontier model for deeper reasoning tasks |
The notable part is that Sonnet 4.6 and Opus 4.6 now coexist. If Sonnet 4.6 can outperform Opus 4.5 in some practical coding tasks, it suggests model size alone is no longer a clean proxy for performance.
Coding performance
Early testing in Claude Code produced strong results:
- 70% of users preferred Sonnet 4.6 over Sonnet 4.5
- Even against the larger Opus 4.5 from November 2025, 59% preferred Sonnet 4.6
Users reported less overengineering and better instruction-following. In other words, it is less likely to add unnecessary “improvements” and more likely to do exactly what was asked. Anthropic also says prompt-injection resistance improved versus Sonnet 4.5.
I already run this blog day to day with Claude Code and have written several related posts, including Claude Code Tips, a best-practices collection, and automated development with tmux. Sonnet 4.5 was already usable in practice, but the improvement in “not doing extra things” should matter a lot in real workflows. The more you hand tasks to an agent, the more productivity depends on whether it follows instructions precisely.
Computer use capability
When Anthropic first introduced a general computer-use model in October 2024, the impression was that it was experimental and error-prone. Sixteen months later, the capability has clearly advanced.
- OSWorld benchmark: steady improvements across the Sonnet line
- Insurance workflow benchmark: 94% accuracy in Pace’s evaluation
- OfficeQA: performance on par with Opus 4.6
It can now handle practical tasks such as spreadsheet manipulation and multi-step web form entry, though Anthropic says it still does not match a skilled human user.
Long-context processing
Anthropic is also offering a 1 million token context window in beta. That makes it possible to fit an entire codebase, a long contract, or multiple research papers into a single request.
The Vending-Bench Arena also showed improvement in strategic long-horizon planning. The model reportedly developed a new strategy on its own: invest in capacity early, then pivot toward profitability later.
Pricing and access
| Item | Value |
|---|---|
| Model ID | claude-sonnet-4-6 |
| Input | $3 / million tokens |
| Output | $15 / million tokens |
| Context | 1M tokens (beta) |
Pricing is unchanged from Sonnet 4.5. On Claude.ai, it becomes the default model for Free and Pro plans. Anthropic also expanded Free-tier capabilities to include file creation, connectors, skills, and compaction.
New API features
Alongside the model release, several API features reached general availability:
- Web search and fetch tools with dynamic result filtering
- Code execution
- Memory
- Programmatic tool calling and tool search
- Extended thinking and adaptive thinking
- Context compaction (beta), which compresses long conversations through automatic summarization
The “better than Opus” angle mainly applies to coding and routine office tasks, not deep reasoning, where Opus 4.6 still leads. But if Claude Code is your main use case, the cost-performance ratio looks very strong. At one-fifth the price of Opus, Sonnet 4.6 seems more than sufficient for daily development once you select it in Claude Code settings.