Smart Routing
Core idea
Smart Routing is a user-facing way to choose models, not a hidden internal rule set. Set model to auto/balanced, auto/fast, auto/cheap, auto/quality, auto/code, or auto/vision, and Chuizi.AI will match the request to a suitable model for that task.
If you already know the exact model you want, keep using a concrete model ID such as openai/gpt-4.1, anthropic/claude-sonnet-4-6, or qwen/qwen3.6-plus. Fixed model IDs are not silently replaced by automatic modes.
How to call it
The OpenAI-compatible endpoint stays the same. Change only model:

    curl https://api.chuizi.ai/v1/chat/completions \
      -H "Authorization: Bearer $CHUIZI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "auto/balanced",
        "messages": [
          { "role": "user", "content": "Review this plan and point out risks." }
        ]
      }'

| model | Best for |
|---|---|
| auto/balanced | Normal production requests, balancing quality, speed, and cost |
| auto/fast | Low-latency work such as support, formatting, extraction, and short answers |
| auto/cheap | High-volume low-cost tasks such as classification, summarization, and structured output |
| auto/quality | Harder analysis, deeper reasoning, and long-context work |
| auto/code | Coding, debugging, tool calls, code review, and agent workflows |
| auto/vision | Images, screenshots, document understanding, and visual QA |
Legacy aliases such as auto, smart, and claude-auto remain compatible for existing integrations. New projects should prefer clearer names such as auto/balanced or auto/code.
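Because only the model value changes between modes, the call can be wrapped once and reused. The following is a minimal sketch, not an official client: the function names build_body and chat are hypothetical helpers, and the escaping only covers backslashes and double quotes in single-line prompts.

```shell
# build_body MODEL PROMPT: print a chat-completions JSON body.
# Hypothetical helper; escapes backslashes and double quotes only,
# so multi-line prompts or control characters are not handled here.
build_body() {
  model=$1
  prompt=$(printf '%s' "$2" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' \
    "$model" "$prompt"
}

# chat MODEL PROMPT: send the request to the OpenAI-compatible endpoint.
# Requires CHUIZI_API_KEY to be set in the environment.
chat() {
  curl -sS https://api.chuizi.ai/v1/chat/completions \
    -H "Authorization: Bearer $CHUIZI_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_body "$1" "$2")"
}

# Example (not run here, since it makes a network call):
#   chat auto/fast "Extract the order number from this message."
#   chat auto/code "Explain why this loop never terminates."
```

Switching from one automatic mode to another, or to a fixed model ID, is then a change to the first argument only.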
What automatic modes do
For each automatic request, Chuizi.AI:
- Checks the current key, wallet, and model access range.
- Detects what the request needs, such as text, image input, tool calls, context length, and output budget.
- Matches the task to a capable model available to the current account.
- Ranks choices according to the selected mode: speed, cost, quality, code, or vision.
- Returns the normal response and records model, usage, cost, latency, and status.
If no model available to the current key can serve the request, the API returns a clear error rather than routing to a model outside that key's access range.
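The filter-then-rank flow described above can be illustrated with a toy example. This is only a sketch of the general idea and not Chuizi.AI's actual routing logic: the catalog rows, the capability names, the numeric scores, and the function names catalog and pick are all invented for illustration.

```shell
# Hypothetical catalog, one model per line:
#   name  capabilities  relative-cost  relative-latency  quality-score
# These rows are made up; they are not real Chuizi.AI data.
catalog() {
  cat <<'EOF'
model-a text 1 2 6
model-b text,image 4 5 9
model-c text 2 1 5
model-d text,image 3 3 7
EOF
}

# pick MODE NEED: keep models that have capability NEED,
# rank them by the priority of MODE, and print the best one.
pick() {
  mode=$1
  need=$2
  case $mode in
    cheap)   key='-k3,3n'  ;;  # lowest cost first
    fast)    key='-k4,4n'  ;;  # lowest latency first
    quality) key='-k5,5nr' ;;  # highest quality first
    *) echo "unknown mode: $mode" >&2; return 2 ;;
  esac
  result=$(catalog |
    awk -v need="$need" 'index("," $2 ",", "," need ",") > 0' |
    sort -t ' ' "$key" | head -n 1 | cut -d ' ' -f 1)
  if [ -z "$result" ]; then
    # Mirrors the documented behavior: fail clearly, never fall back
    # to something outside the allowed set.
    echo "no available model can serve this request" >&2
    return 1
  fi
  printf '%s\n' "$result"
}
```

In this toy catalog, an image request in cheap mode only considers the two image-capable rows, and a request for a capability that no row offers fails with an error instead of picking an unsuitable model.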
Automatic mode or fixed model?
| Need | Suggested model value |
|---|---|
| Evals, compliance, or a customer-specified model | Use an exact model ID such as openai/gpt-4.1 |
| Normal production traffic | auto/balanced |
| Support and realtime user experience | auto/fast |
| Batch summaries, classification, structured output | auto/cheap |
| Code generation, debugging, agent workflows | auto/code |
| Images, screenshots, document understanding | auto/vision |
Fixed models and automatic modes can coexist in the same project: pin the critical paths, and use automatic modes for everyday workloads.
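One way to keep pinned and automatic paths side by side is a single lookup that maps each workload to its model value. The sketch below follows the table above; the workload labels (eval, support, batch, and so on) and the function name model_for are hypothetical and should be replaced with whatever names a project already uses.

```shell
# model_for WORKLOAD: print the model value to send for that workload.
# Workload labels are illustrative assumptions, not Chuizi.AI terms.
model_for() {
  case $1 in
    eval|compliance)      echo "openai/gpt-4.1" ;;  # pinned: exact model ID
    support|realtime)     echo "auto/fast"      ;;
    batch|classification) echo "auto/cheap"     ;;
    code|agent)           echo "auto/code"      ;;
    vision|screenshot)    echo "auto/vision"    ;;
    *)                    echo "auto/balanced"  ;;  # everyday default
  esac
}

# Example (not run here): use the result as the "model" field of the request.
#   model=$(model_for support)
```

Keeping the mapping in one place makes it easy to pin a path later by swapping an automatic mode for an exact model ID without touching the calling code.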
Billing and observability
Automatic modes and fixed models use the same generation records. After a request completes, query the generation record:

    curl "https://api.chuizi.ai/v1/generation?id=gen-xxx" \
      -H "Authorization: Bearer $CHUIZI_API_KEY"

The record shows the final model, usage, cost, latency, status, and request ID. Public APIs return the information needed for debugging and billing review, without exposing sensitive infrastructure details.
Recommendations
- Start with auto/balanced, then split into auto/fast, auto/cheap, auto/code, or auto/vision as traffic patterns become clear.
- Use exact model IDs for work that must be reproducible, approved, or customer-specified.
- Create separate keys for production, testing, support, agent, and media workloads to keep model access and cost review clean.
- Use the model catalog to check currently available models, capabilities, and prices.
Next steps
- Choose a Model - decide between fixed model IDs and automatic modes
- Provider Routing - understand availability protection for a selected model
- Generation API - inspect cost, usage, and status for each request