Claude Opus, Sonnet, and Haiku: Which Model Should You Use and When

More Than Good, Better, Best

Anthropic releases Claude in three tiers: Haiku, Sonnet, and Opus. The instinct is to treat these as a quality ladder, where Opus is always the right answer if you can afford it. That’s wrong. Each model has a different speed-cost-capability profile, and matching the model to the task is how you get good results without burning through your budget.

The number after the name is the generation. Claude 3, Claude 3.5, Claude 4 — higher numbers are newer releases. Within a generation, the tier name tells you where it sits on the speed and capability spectrum.

Haiku: Fast and Cheap by Design

Haiku is the fastest and least expensive model in the Claude lineup. It’s built for high-volume tasks where response latency matters and the task doesn’t require deep reasoning.

Good use cases: customer-facing chatbots that need to respond quickly, simple classification tasks, summarizing short documents, extracting structured data from text, routing queries to the right place. Haiku handles all of this well and does it cheaper and faster than Sonnet or Opus.

Where it falls short: complex multi-step reasoning, long documents that need careful synthesis, situations where you need the model to catch subtle errors or think through tradeoffs. Don’t use Haiku for your most demanding tasks just because it’s cheaper. It genuinely isn’t as capable on hard problems, and you’ll notice.

Sonnet: The Daily Driver

Sonnet is the balanced model. Good capability, reasonable speed, reasonable cost. For most tasks, this is the right starting point.

If you’re building an app with the Claude API, start with Sonnet. If you’re using Claude for writing, analysis, or coding work, Sonnet handles it well. It’s also the model Claude Code defaults to for most operations, which is a reasonable default.

The practical test for whether to use Sonnet: if the task matters and you need a thoughtful response but it’s not an extremely complex reasoning problem, Sonnet is the call.

Opus: When It’s Actually Worth It

Opus is the most capable, slowest, and most expensive model. It’s designed for the tasks where capability is the constraint and cost isn’t.

Use it for genuinely hard problems: complex architecture decisions, multi-step reasoning over long documents, situations where you need the model to catch subtle logical errors, anything where you’ve tried Sonnet and the output isn’t good enough.

That last part is key. The right way to use Opus is as an escalation path, not a default. Try Sonnet first. If Sonnet isn’t cutting it on a specific task, move to Opus. Don’t default to Opus for everything just because it’s the most capable. You’ll pay more and wait longer for results that Sonnet would have handled fine.

The Claude 4 Generation

Claude 4 is the current generation, and it’s a meaningful step forward from Claude 3.5. The most notable improvements are in instruction following, code quality, and handling of longer and more complex context. If you tried Claude at some point and weren’t impressed, the current generation is worth revisiting.

Extended thinking is a feature available on newer Claude models, particularly for harder tasks. When enabled, Claude works through the problem internally before giving you a response. This uses more tokens but produces better results on problems that benefit from step-by-step reasoning. Think of it as the model slowing down to think before answering. It’s not always worth the cost, but on hard problems it often is.

Context window improvements are also significant. The current generation handles long documents and large codebases much better than earlier versions. This matters a lot for Claude Code workflows and for applications that need to process substantial amounts of text.

How to Choose in Practice

For building applications with the API: start with Sonnet. Measure whether the output quality is acceptable for your use case. Only switch to Opus for specific tasks where Sonnet’s output falls short. Use Haiku for anything where latency or cost is critical and the task is simple enough.

For Claude Code: the defaults are generally right. Claude Code uses Sonnet for most operations and escalates when needed. You can configure it to use specific models, but unless you have a specific reason to change, the defaults are a reasonable starting point.

For the Claude.ai chat interface: Sonnet is the default for good reason. Switch to Opus for genuinely hard questions where you want the most capable answer and you don’t mind waiting a bit longer.

The underlying principle is simple. Model quality costs money and time. Match the model to the task. Use Haiku for fast simple things, Sonnet for most things, Opus for the hard things where quality is the constraint.

If You’re Building With Claude

Understanding the model tiers is foundational, but it’s just the start of building well with Claude. If you want to go deeper on the full development workflow, prompting patterns, and how to use Claude Code effectively, check out the

Claude Code Complete Guide

. It covers the full picture from setup through advanced workflows.