Model Selection
Switch between Claude Opus, Sonnet, and Haiku mid-session to balance capability and cost for each task.
Available Models
Through Chuizi.AI's /anthropic endpoint, Claude Code can use these models:
Current Generation
| Model | Input Price | Output Price | Context Window | Strengths |
|---|---|---|---|---|
| claude-opus-4-6 | $15 / 1M | $75 / 1M | 200K | Most capable, complex tasks |
| claude-sonnet-4-6 | $3 / 1M | $15 / 1M | 200K | Best overall value |
| claude-haiku-4-5 | $1 / 1M | $5 / 1M | 200K | Fastest, lightweight tasks |
Previous Generation (Still Available)
| Model | Input Price | Output Price | Context Window |
|---|---|---|---|
| claude-sonnet-4 | $3 / 1M | $15 / 1M | 200K |
| claude-haiku-4-5 | $0.80 / 1M | $4 / 1M | 200K |
These are Anthropic's upstream prices. Chuizi.AI charges the upstream price × 1.05, that is, a 5% markup on every token rate.
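To make the markup concrete, here is a minimal sketch that estimates the billed cost of a single request from the current-generation prices above. The helper name `billed_cost` and the token counts are illustrative assumptions, not part of any Chuizi.AI API, and actual invoices may round differently.

```python
# Upstream prices in USD per 1M tokens, copied from the Current Generation table.
UPSTREAM_PRICES = {
    "claude-opus-4-6": {"input": 15.00, "output": 75.00},
    "claude-sonnet-4-6": {"input": 3.00, "output": 15.00},
    "claude-haiku-4-5": {"input": 1.00, "output": 5.00},
}

MARKUP = 1.05  # multiplier stated in the text above


def billed_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the billed cost in USD for one request (hypothetical helper)."""
    price = UPSTREAM_PRICES[model]
    upstream = (
        input_tokens / 1_000_000 * price["input"]
        + output_tokens / 1_000_000 * price["output"]
    )
    return upstream * MARKUP


# Example: 200K input tokens and 50K output tokens on Sonnet.
print(f"${billed_cost('claude-sonnet-4-6', 200_000, 50_000):.4f}")
```

The example ignores prompt caching, so it is an upper bound for requests that reuse a cached prefix.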
Prompt Caching Prices
With prompt caching enabled, repeated input tokens are billed at a lower rate:
| Model | Cache Write | Cache Read | Savings |
|---|---|---|---|
| claude-opus-4-6 | $18.75 / 1M | $1.50 / 1M | Up to 90% |
| claude-sonnet-4-6 | $3.75 / 1M | $0.30 / 1M | Up to 90% |
| claude-haiku-4-5 | $1.25 / 1M | $0.10 / 1M | Up to 90% |
The Chuizi.AI gateway automatically enables caching for eligible requests. No configuration needed on your end.
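The "up to 90%" figure is the ratio of the cache read price to the normal input price; the first request still pays the higher cache write rate. The sketch below compares the cost of a shared prompt prefix with and without caching. It assumes the first request writes the cache and every later request reads it (that is, the cache never expires between requests), and `prefix_cost` is a hypothetical helper, not a gateway function. Prices are the upstream figures from the tables, before the 1.05 multiplier.

```python
# USD per 1M tokens, copied from the tables above (upstream prices).
INPUT_PRICE = {"claude-opus-4-6": 15.00, "claude-sonnet-4-6": 3.00, "claude-haiku-4-5": 1.00}
CACHE_WRITE = {"claude-opus-4-6": 18.75, "claude-sonnet-4-6": 3.75, "claude-haiku-4-5": 1.25}
CACHE_READ = {"claude-opus-4-6": 1.50, "claude-sonnet-4-6": 0.30, "claude-haiku-4-5": 0.10}


def prefix_cost(model: str, prefix_tokens: int, requests: int, cached: bool) -> float:
    """Cost in USD of sending the same prefix `requests` times (hypothetical helper)."""
    if requests < 1:
        raise ValueError("requests must be at least 1")
    millions = prefix_tokens / 1_000_000
    if not cached:
        return requests * millions * INPUT_PRICE[model]
    # Assumption: one cache write on the first request, cache reads afterwards.
    return millions * (CACHE_WRITE[model] + (requests - 1) * CACHE_READ[model])


# Example: a 100K-token prefix reused across 20 Sonnet requests.
plain = prefix_cost("claude-sonnet-4-6", 100_000, 20, cached=False)
cached = prefix_cost("claude-sonnet-4-6", 100_000, 20, cached=True)
print(f"uncached ${plain:.3f}, cached ${cached:.3f}, saved {1 - cached / plain:.1%}")
```

Under these assumptions the saving approaches 90% only as the number of cache reads grows; a prefix that is sent once costs more with caching than without.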
Default Model
Claude Code defaults to claude-sonnet-4-6. For most coding tasks, this is the right choice:
- Code understanding and generation quality approaches Opus
- Response speed is 2-3x faster than Opus
- Only $15 per 1M output tokens (vs. $75 for Opus)
Switching Models
Using the /model Command
In the Claude Code interactive interface, type the /model command:
> /model claude-opus-4-6
All subsequent messages will use the new model until you switch again.
Common Switching Scenarios
Upgrade to Opus when you need:
> /model claude-opus-4-6
- Complex architectural design decisions
- Hard-to-debug multi-file issues
- Deep reasoning for refactoring tasks
- Critical code reviews involving security or performance
Downgrade to Haiku when you only need:
> /model claude-haiku-4-5
- Simple code formatting
- Basic file operations
- Quick code explanations
- Batches of small changes
Switch back to Sonnet for everyday coding:
> /model claude-sonnet-4-6
Selection Strategies
By Task Type
| Task | Recommended Model | Reason |
|---|---|---|
| New feature development | Sonnet 4.6 | Balance of understanding + generation |
| Complex bug fixing | Opus 4.6 | Needs deep cross-file reasoning |
| Code review | Sonnet 4.6 | Quality is sufficient, cost is reasonable |
| Architecture design | Opus 4.6 | Requires global perspective and deep analysis |
| Simple edits | Haiku 4.5 | Speed first, save costs |
| Documentation writing | Sonnet 4.6 | Good language quality, cost effective |
| Test case generation | Haiku 4.5 | Patterned task, Haiku handles it well |
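If you keep these recommendations in a script or team cheat sheet, the table can be encoded as a simple lookup. This is only an illustrative sketch: the task labels are copied from the table, the model IDs are the ones listed under Available Models, and `model_command` is a hypothetical helper that just builds the /model line to type into Claude Code.

```python
# Task type -> recommended model ID, mirroring the table above.
RECOMMENDED_MODEL = {
    "new feature development": "claude-sonnet-4-6",
    "complex bug fixing": "claude-opus-4-6",
    "code review": "claude-sonnet-4-6",
    "architecture design": "claude-opus-4-6",
    "simple edits": "claude-haiku-4-5",
    "documentation writing": "claude-sonnet-4-6",
    "test case generation": "claude-haiku-4-5",
}

DEFAULT_MODEL = "claude-sonnet-4-6"  # Claude Code's default, per the Default Model section


def model_command(task: str) -> str:
    """Return the /model command for a task type, falling back to the default."""
    model = RECOMMENDED_MODEL.get(task.strip().lower(), DEFAULT_MODEL)
    return f"/model {model}"


print(model_command("Complex bug fixing"))
print(model_command("Something not in the table"))
```

Unknown task types fall back to Sonnet, which matches the advice to use it for everyday coding.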
By Budget
If you're cost-conscious, here's a practical allocation:
- 80% of time on Sonnet -- covers the majority of daily development
- 15% of time on Haiku -- handles simple, repetitive tasks
- 5% of time on Opus -- only when deep reasoning is truly needed
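With this split, and assuming your output tokens follow the same proportions as your time, the weighted average cost is roughly $16.50 per 1M output tokens at upstream prices (0.80 × $15 + 0.15 × $5 + 0.05 × $75).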
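The weighted average can be checked with a short calculation. The sketch assumes output tokens are distributed across models in the same proportions as the time shares listed above (the text gives time shares only, so this is an approximation) and uses the upstream output prices from the Current Generation table. The function name is illustrative.

```python
# Upstream output prices in USD per 1M tokens, from the Current Generation table.
OUTPUT_PRICE = {
    "claude-sonnet-4-6": 15.00,
    "claude-haiku-4-5": 5.00,
    "claude-opus-4-6": 75.00,
}

# Usage split from the budget list; assumed to apply to output tokens as well.
USAGE_SHARE = {
    "claude-sonnet-4-6": 0.80,
    "claude-haiku-4-5": 0.15,
    "claude-opus-4-6": 0.05,
}


def weighted_output_price(prices: dict, shares: dict) -> float:
    """Blend per-model output prices by usage share (shares must sum to 1)."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("usage shares must sum to 1")
    return sum(prices[model] * share for model, share in shares.items())


blended = weighted_output_price(OUTPUT_PRICE, USAGE_SHARE)
print(f"upstream: ${blended:.2f} per 1M output tokens")
print(f"with 1.05 multiplier: ${blended * 1.05:.2f} per 1M output tokens")
```

Adjusting the shares shows how sensitive the blend is to Opus usage: each extra percentage point on Opus instead of Sonnet adds $0.60 per 1M output tokens at upstream prices.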
Using Other Provider Models via /v1
If you need OpenAI or Google models, Chuizi.AI exposes them through its OpenAI-compatible endpoint at /v1. Claude Code, however, natively uses the Anthropic protocol and doesn't support /v1, so use another tool such as Cursor or Cline to access non-Claude models through Chuizi.AI.
Next Steps
- Configuration - Set up environment variables
- Troubleshooting - Solutions for common issues
- Prompt Caching Guide - Deep dive into the caching mechanism