Model Pricing
Prices below are the upstream providers' costs; your actual bill is the displayed price × 1.05. All token prices are per 1 million tokens.
For the full model list and real-time pricing, visit chuizi.ai/models.
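As a worked example of the billing math above, the sketch below combines per-token prices from the tables with the 1.05 multiplier (the function name and token counts are illustrative, not part of the API):

```python
def request_cost(input_tokens, output_tokens, input_price, output_price,
                 multiplier=1.05):
    """Estimate the billed cost of one request.

    input_price / output_price are per-1M-token rates from the tables
    below; the 1.05 multiplier converts upstream cost to the billed amount.
    """
    upstream = (input_tokens / 1_000_000) * input_price \
             + (output_tokens / 1_000_000) * output_price
    return upstream * multiplier

# Example: GPT-4.1 ($2.00 in / $8.00 out), 10K input + 2K output tokens
cost = request_cost(10_000, 2_000, 2.00, 8.00)
# upstream = 0.02 + 0.016 = $0.036; billed = 0.036 * 1.05 = $0.0378
```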
OpenAI
| Model | Input ($/1M) | Output ($/1M) | Context Window |
|---|---|---|---|
| GPT-4.1 | $2.00 | $8.00 | 1M |
| GPT-4.1-mini | $0.40 | $1.60 | 1M |
| GPT-4.1-nano | $0.10 | $0.40 | 1M |
| GPT-4o | $2.50 | $10.00 | 128K |
| GPT-4o-mini | $0.15 | $0.60 | 128K |
| o3 | $2.00 | $8.00 | 200K |
| o4-mini | $1.10 | $4.40 | 200K |
Anthropic
| Model | Input ($/1M) | Output ($/1M) | Context Window |
|---|---|---|---|
| Claude Opus 4-6 | $15.00 | $75.00 | 200K |
| Claude Sonnet 4-6 | $3.00 | $15.00 | 200K |
| Claude Haiku 4-5 | $1.00 | $5.00 | 200K |
Anthropic models support prompt caching. The cache_read price is approximately 10% of the input price. See Cache Discount Pricing for details.
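To illustrate the ~10% cache_read pricing, here is a rough input-cost estimate for a partially cached prompt (function name and token counts are hypothetical; prices shown are upstream, before the 1.05 multiplier):

```python
def cached_input_cost(total_input_tokens, cached_tokens, input_price,
                      cache_read_ratio=0.10):
    """Estimate input cost in dollars when part of the prompt is cached.

    cache_read_ratio=0.10 reflects the ~10%-of-input cache_read pricing
    noted above; exact discounts vary, see Cache Discount Pricing.
    """
    fresh = total_input_tokens - cached_tokens
    return (fresh * input_price
            + cached_tokens * input_price * cache_read_ratio) / 1_000_000

# Claude Sonnet ($3.00/1M input): 50K-token prompt, 40K of it cached
with_cache = cached_input_cost(50_000, 40_000, 3.00)   # ~$0.042
no_cache   = cached_input_cost(50_000, 0, 3.00)        # $0.15
```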
Google
| Model | Input ($/1M) | Output ($/1M) | Context Window |
|---|---|---|---|
| Gemini 2.5 Pro | $1.25 | $10.00 | 1M |
| Gemini 2.5 Flash | $0.15 | $0.60 | 1M |
| Gemini 2.0 Flash | $0.10 | $0.40 | 1M |
DeepSeek
| Model | Input ($/1M) | Output ($/1M) | Context Window |
|---|---|---|---|
| DeepSeek V3.2 | $0.28 | $0.42 | 128K |
| DeepSeek R1 | $0.55 | $2.19 | 128K |
| DeepSeek Chat | $0.28 | $0.42 | 128K |
DeepSeek models enable disk caching automatically; cache_read tokens cost roughly 10% of the input price, saving approximately 90% on cached input.
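The effect of the ~90% cache saving on blended cost can be sketched as follows (the function and the 80% hit rate are illustrative assumptions, not measured figures):

```python
def blended_input_cost_per_1m(hit_rate, input_price=0.28,
                              cache_discount=0.90):
    """Blended input cost per 1M tokens at a given cache hit rate.

    Assumes cache_read costs (1 - 0.90) = 10% of the input price,
    per the ~90% saving noted above. Prices are upstream rates.
    """
    return input_price * ((1 - hit_rate) + hit_rate * (1 - cache_discount))

# DeepSeek V3.2 ($0.28/1M input) at an 80% cache hit rate:
# 0.28 * (0.2 + 0.8 * 0.1) = $0.0784 per 1M input tokens
```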
Next Steps
- Model Directory — Browse all 200+ models with live pricing and capabilities
- Billing Model — How per-token, per-request, and per-second billing work
- Cache Discount Pricing — Save up to 90% with prompt caching
- Cost Optimization — Four strategies to cut costs by 50-90%