Anthropic Unveils Sonnet 4.5 With 30-Hour Focus


Anthropic unveils Claude Sonnet 4.5, along with Claude Code 2.0 and the Claude Agent SDK, touting longer-running task capability and enhanced coding performance.

September 30, 2025

Anthropic released Claude Sonnet 4.5, its latest language model, described by the company as its most capable to date, with improvements in coding and computer use. The announcement also included Claude Code 2.0, a command-line AI agent for developers, and the Claude Agent SDK, a tool for building custom AI coding agents.

Anthropic says Sonnet 4.5 has demonstrated the ability to work continuously on the same project for more than 30 hours on complex, multi-step tasks, though the company has not disclosed specifics about the tasks. Long-running agentic models have historically struggled to stay coherent as a session lengthens and the context window fills up.

The Claude family has long been organized by model size: Haiku, Sonnet, and Opus. Haiku was updated in November 2024 (to 3.5), Sonnet in May 2025 (to 4.0), and Opus in August 2025 (to 4.1). Model size roughly tracks contextual depth and problem-solving ability, but larger models cost more and respond more slowly, so Anthropic positions Sonnet as a balanced option that has served as its mid-size flagship for years.

Anthropic touts Sonnet 4.5 as the best coding model in the world, emphasizing its strength in building complex agents and using computers, with substantial gains in reasoning and math. The company also highlighted benchmark results: 77.2% on SWE-bench Verified and 61.4% on OSWorld. On SWE-bench Verified, it claims a lead over OpenAI’s GPT-5 Codex (74.5%) and Google’s Gemini 2.5 Pro (67.2%).

Beyond the model, Anthropic announced Claude Code 2.0 and the Claude Agent SDK, aiming to streamline developer tooling for AI-assisted coding. The release also brings an Imagine with Claude preview for Max subscribers, which showcases real-time code generation within Claude’s interface. Sonnet 4.5 is priced through the Claude API at $3 per million input tokens and $15 per million output tokens.
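
To make the quoted prices concrete, here is a rough sketch of how a per-request cost could be estimated from token counts. It uses only the two rates given above; the function name and the sample workload are hypothetical, and the calculation ignores any discounts, caching, or other pricing tiers that the announcement may include.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper for illustration; not part of any Anthropic library.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Assumed workload: 200,000 input tokens and 50,000 output tokens.
print(f"${estimate_cost_usd(200_000, 50_000):.2f}")  # prints $1.35
```

At these rates an output token costs five times as much as an input token, so output volume tends to dominate the bill for generation-heavy work such as writing code.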

As with any self-reported benchmarks, independent verification remains important, since numbers can be affected by test design or data leakage. Even so, Anthropic frames Claude Sonnet 4.5 as a meaningful advance in coding capability, long-running agent performance, and the suite of tools around Claude for developers.

Arstechnica