Choose a Model
Pick the right model for your task, budget, and latency requirements from 220 options across 18 providers.
Model Naming
All models use the provider/model format:
- anthropic/claude-sonnet-4-6
- openai/gpt-4.1
- google/gemini-2.5-pro
- deepseek/deepseek-chat
Bare model names (e.g., gpt-4.1) are accepted as aliases and resolve to the default provider.
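As a small illustration of the naming scheme, the sketch below splits a model ID into its provider and model parts. The helper name `split_model_id` is hypothetical and not part of any Chuizi.AI SDK. For a bare name it returns no provider rather than guessing, since the gateway is what resolves the alias.

```python
from typing import Optional, Tuple


def split_model_id(model_id: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model' into (provider, model).

    A bare name such as 'gpt-4.1' yields (None, name): the gateway resolves
    the default provider, so this helper does not try to guess it.
    """
    provider, sep, model = model_id.partition("/")
    if not sep:
        # No slash at all: treat the whole string as a bare alias.
        return None, model_id
    if not provider or not model:
        raise ValueError(f"malformed model id: {model_id!r}")
    return provider, model


print(split_model_id("anthropic/claude-sonnet-4-6"))
print(split_model_id("gpt-4.1"))
```

Splitting on the first slash only keeps any later slashes inside the model part.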
By Use Case
General Chat and Assistants
| Model | Strength | Speed | Cost |
|---|---|---|---|
| anthropic/claude-sonnet-4-6 | Best overall balance | Fast | Medium |
| openai/gpt-4.1 | Strong general purpose | Fast | Medium |
| google/gemini-2.5-pro | Long context (1M tokens) | Medium | Medium |
| deepseek/deepseek-chat | Cost-effective | Fast | Low |
Complex Reasoning
| Model | Strength | Speed | Cost |
|---|---|---|---|
| anthropic/claude-opus-4-6 | Deep analysis, research | Slow | High |
| openai/o3 | Math, logic, science | Slow | High |
| openai/o4-mini | Reasoning at lower cost | Medium | Medium |
| google/gemini-2.5-pro | Long-context reasoning | Medium | Medium |
Fast and Cheap
| Model | Strength | Speed | Cost |
|---|---|---|---|
| anthropic/claude-haiku-4-5 | Fastest Claude | Very fast | Low |
| openai/gpt-4.1-nano | Cheapest OpenAI | Very fast | Very low |
| google/gemini-2.5-flash | Fast with thinking | Very fast | Low |
| deepseek/deepseek-chat | High quality for price | Fast | Low |
Coding
| Model | Strength | Speed | Cost |
|---|---|---|---|
| anthropic/claude-sonnet-4-6 | Best for code generation | Fast | Medium |
| openai/gpt-5-codex | Optimized for code | Medium | Medium |
| alibaba/qwen3-coder-next | Strong coding in Chinese+English | Fast | Low |
| deepseek/deepseek-chat | Competitive coding | Fast | Low |
Chinese Language
| Model | Strength | Speed | Cost |
|---|---|---|---|
| alibaba/qwen-3.6-plus | Best Chinese understanding | Fast | Low |
| zhipu/glm-5.1 | Strong Chinese reasoning | Fast | Low |
| deepseek/deepseek-chat | Bilingual excellence | Fast | Low |
| moonshot/kimi-k2.5 | Chinese conversation | Fast | Low |
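One way to apply the tables above in application code is a small lookup from task type to model ID. This is only a sketch: the task labels (`"chat"`, `"reasoning"`, and so on) and the helper `pick_model` are hypothetical, and taking the first row of each table as the default is an assumption, not a recommendation from the tables themselves. Unknown tasks fall back to `"auto"`, the routing value described under Auto Routing.

```python
# First-row picks from each use-case table (an assumption, adjust to taste).
DEFAULT_MODEL_BY_TASK = {
    "chat": "anthropic/claude-sonnet-4-6",
    "reasoning": "anthropic/claude-opus-4-6",
    "fast": "anthropic/claude-haiku-4-5",
    "coding": "anthropic/claude-sonnet-4-6",
    "chinese": "alibaba/qwen-3.6-plus",
}


def pick_model(task: str) -> str:
    """Return a model ID for a task label, or 'auto' when the label is unknown."""
    return DEFAULT_MODEL_BY_TASK.get(task, "auto")


print(pick_model("coding"))
print(pick_model("translation"))
```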
Image Generation
| Model | Endpoint | Strength |
|---|---|---|
| google/imagen-4.0 | /v1/images/generations | Photorealistic |
| azure/mai-image-2 | /v1/images/generations | Creative styles |
| alibaba/wan-2.7 | /v1/images/generations | Artistic generation |
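The sketch below shows how a request to the endpoint in the table might be assembled with the Python standard library. Several details are assumptions rather than documented behavior: the JSON body fields (`model`, `prompt`) follow the common OpenAI-style convention, the Bearer token header is a guess at the auth scheme, and the base URL and key are placeholders. The request is built but not sent.

```python
import json
import urllib.request

IMAGES_PATH = "/v1/images/generations"  # endpoint from the table above


def build_image_request(base_url: str, api_key: str, prompt: str,
                        model: str = "google/imagen-4.0") -> urllib.request.Request:
    """Build (but do not send) a POST request for image generation.

    Body fields and the Authorization header are assumed, not confirmed.
    """
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + IMAGES_PATH,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder base URL and key, for illustration only.
req = build_image_request("https://gateway.example", "sk-placeholder",
                          "A lighthouse at dusk")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would send it; switching models only requires changing the `model` argument, since all three share the same endpoint.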
Embeddings
| Model | Dimensions | Use Case |
|---|---|---|
| openai/text-embedding-3-large | 3072 | Highest accuracy |
| openai/text-embedding-3-small | 1536 | Good balance |
| google/gemini-embedding-001 | 3072 | Long text |
| cohere/embed-v4 | 1024 | Multilingual |
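Dimensions also drive storage cost, which can matter as much as accuracy when choosing an embedding model. The sketch below estimates raw vector storage from the dimensions in the table, assuming each value is stored as a 4-byte float32 and ignoring any index overhead. The function name `index_size_bytes` is illustrative.

```python
# Dimensions taken from the table above.
EMBEDDING_DIMENSIONS = {
    "openai/text-embedding-3-large": 3072,
    "openai/text-embedding-3-small": 1536,
    "google/gemini-embedding-001": 3072,
    "cohere/embed-v4": 1024,
}


def index_size_bytes(model: str, num_vectors: int, bytes_per_value: int = 4) -> int:
    """Raw storage for num_vectors embeddings (float32 by default, no index overhead)."""
    return EMBEDDING_DIMENSIONS[model] * bytes_per_value * num_vectors


for name in EMBEDDING_DIMENSIONS:
    gigabytes = index_size_bytes(name, 1_000_000) / 1e9
    print(f"{name}: {gigabytes:.3f} GB per million vectors")
```

Under these assumptions a 3072-dimension model needs three times the storage of a 1024-dimension one for the same corpus.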
Auto Routing
Set model to "auto" and Chuizi.AI selects the best model based on your request characteristics: input length, language, task type, and budget.
```python
# example.py
# `client` is assumed to be an already-configured, OpenAI-style SDK client.
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
)
```
Cost Comparison
All pricing is upstream cost × 1.05 (a 5% markup). Check current prices at chuizi.ai/models, which shows input and output prices per million tokens for every model.
As a rough guide:
| Tier | Input (per 1M tokens) | Models |
|---|---|---|
| Budget | < $0.50 | GPT-4.1-nano, Gemini Flash, Haiku |
| Mid-range | $0.50 - $5 | Sonnet, GPT-4.1, DeepSeek |
| Premium | > $5 | Opus, o3, Gemini Pro |
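The pricing rule and the tiers above can be worked through in a few lines. This is a sketch only: the upstream price used in the example is a made-up figure, not a real quote, and the function names are illustrative. `Decimal` is used so that the 1.05 multiplier does not pick up floating-point rounding noise.

```python
from decimal import Decimal

MARKUP = Decimal("1.05")  # "upstream cost x 1.05" from the text above


def billed_price(upstream_per_million: Decimal) -> Decimal:
    """Price per 1M tokens after the 5% markup."""
    return upstream_per_million * MARKUP


def request_cost(price_per_million: Decimal, tokens: int) -> Decimal:
    """Cost in dollars for a given number of tokens at a per-million price."""
    return price_per_million * Decimal(tokens) / Decimal(1_000_000)


def tier(input_price_per_million: Decimal) -> str:
    """Map an input price per 1M tokens to the rough tiers in the table."""
    if input_price_per_million < Decimal("0.50"):
        return "Budget"
    if input_price_per_million <= Decimal("5"):
        return "Mid-range"
    return "Premium"


upstream = Decimal("2.00")           # hypothetical upstream input price per 1M tokens
price = billed_price(upstream)       # 2.00 * 1.05
print(price, tier(price), request_cost(price, 250_000))
```

With the hypothetical $2.00 upstream price, the billed price is $2.10 per million input tokens, which lands in the mid-range tier, and 250,000 input tokens cost $0.525.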
Full Model List
Browse all available models with live pricing, context window sizes, and capability tags at chuizi.ai/models.
Next Steps
- Migration Guide — switch from your current provider
- Cost Optimization — reduce costs with caching and model selection
- Prompt Caching — save up to 90% on repeated prompts