Claude Sonnet 4.5: Anthropic's Most Advanced Coding and Reasoning Model

Claude Sonnet 4.5 represents Anthropic's most capable coding model to date, establishing new benchmarks in software development, complex agent building, and computer interaction. The model demonstrates exceptional performance in reasoning and mathematical tasks.

Technical Excellence and Performance

The model achieves state-of-the-art results on the SWE-bench Verified evaluation, which assesses real-world software coding capabilities. In practical applications, it maintains focus for over 30 hours on intricate, multi-step tasks. On OSWorld, a benchmark for real-world computer tasks, Sonnet 4.5 leads with 61.4% performance, a significant improvement from the previous 42.2%.

Product Enhancements and Features

Anthropic has released major upgrades alongside the model:

Claude Code improvements: New checkpoint feature enabling progress saving and instant rollback to previous states, refreshed terminal interface, and a native VS Code extension
API enhancements: Context editing features and memory tools allowing agents to operate longer with greater complexity
Application updates: Direct code execution and file creation (spreadsheets, slides, documents) within conversations
Chrome extension: Now accessible to Max users from the waitlist

Claude Agent SDK Introduction

Developers now have access to the same infrastructure that powers Claude Code through the Claude Agent SDK. This toolkit provides the foundation used by Anthropic to create their frontier products, enabling developers to build sophisticated agents for various tasks beyond coding.

Safety and Alignment Improvements

Claude Sonnet 4.5 stands as Anthropic's most aligned frontier model, showing substantial improvements in reducing problematic behaviors including sycophancy, deception, power-seeking, and encouragement of delusional thinking. The model demonstrates enhanced resistance to prompt injection attacks, particularly important for agentic and computer use capabilities.

The release operates under Anthropic's AI Safety Level 3 (ASL-3) protections, implementing appropriate safeguards matched to model capabilities. These include classifiers designed to detect potentially dangerous inputs and outputs, particularly those related to CBRN (chemical, biological, radiological, and nuclear) weapons.

Industry Reception and Applications

Early adopters report significant improvements across various domains:

Software Development: Organizations like Cursor and GitHub Copilot report enhanced performance in complex, codebase-spanning tasks
Security: Hai security agents achieved 44% reduction in vulnerability intake time with 25% accuracy improvement
Legal: State-of-the-art performance on complex litigation tasks, including full briefing cycle analysis
Finance: Delivery of investment-grade insights requiring less human review for risk analysis and portfolio screening
Design: Improved functionality in tools like Figma Make for creating more functional prototypes

Benchmark Performance

The model shows improved capabilities across various evaluations:

Leading performance on coding benchmarks
Enhanced reasoning and mathematical capabilities
Superior domain-specific knowledge in finance, law, medicine, and STEM fields compared to previous models including Opus 4.1

Research Preview: Imagine with Claude

Anthropic has launched a temporary research preview called "Imagine with Claude," demonstrating the model's ability to generate software dynamically. This experiment showcases real-time creation and adaptation without predetermined functionality or prewritten code, available to Max subscribers for a limited period.

Availability and Pricing

Claude Sonnet 4.5 is immediately available across all platforms. Developers can access it through the Claude API using claude-sonnet-4-5. Pricing remains consistent with Claude Sonnet 4 at $3/$15 per million tokens. Anthropic recommends upgrading to Claude Sonnet 4.5 for all applications, as it provides superior performance at the same price point as a direct replacement option.

View source Back to news