Smart Routing

Core idea

Smart Routing is a user-facing way to choose models, not a hidden internal rule set. Set model to auto/balanced, auto/fast, auto/cheap, auto/quality, auto/code, or auto/vision, and Chuizi.AI will match the request to a suitable model for that task.

If you already know the exact model you want, keep using a concrete model ID such as openai/gpt-4.1, anthropic/claude-sonnet-4-6, or qwen/qwen3.6-plus. Fixed model IDs are not silently replaced by automatic modes.

How to call it

The OpenAI-compatible endpoint stays the same. Only the model field in the request body changes:

curl https://api.chuizi.ai/v1/chat/completions \
  -H "Authorization: Bearer $CHUIZI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto/balanced",
    "messages": [
      { "role": "user", "content": "Review this plan and point out risks." }
    ]
  }'
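
The same request can be assembled from Python. The following is a minimal sketch using only the standard library; `build_chat_request` and `send` are illustrative helper names, not part of any Chuizi.AI SDK, and the sketch assumes the response body is JSON, as is usual for OpenAI-compatible endpoints.

```python
import json
import urllib.request

API_URL = "https://api.chuizi.ai/v1/chat/completions"


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request.

    Hypothetical helper for illustration only.
    """
    payload = {
        "model": model,  # the only field that changes between modes
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and parse the body (assumed to be JSON).

    This performs a real, billable API call.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Switching modes then means passing a different first argument, for example `build_chat_request("auto/code", prompt, api_key)`, while the URL and headers stay untouched.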

| `model` value | Best for |
| --- | --- |
| `auto/balanced` | Normal production requests, balancing quality, speed, and cost |
| `auto/fast` | Low-latency work such as support, formatting, extraction, and short answers |
| `auto/cheap` | High-volume low-cost tasks such as classification, summarization, and structured output |
| `auto/quality` | Harder analysis, deeper reasoning, and long-context work |
| `auto/code` | Coding, debugging, tool calls, code review, and agent workflows |
| `auto/vision` | Images, screenshots, document understanding, and visual QA |

Legacy aliases such as auto, smart, and claude-auto remain compatible for existing integrations. New projects should prefer clearer names such as auto/balanced or auto/code.

What automatic modes do

For each automatic request, Chuizi.AI:

1. Checks the current key, wallet, and model access range.
2. Detects what the request needs, such as text, image input, tool calls, context length, and output budget.
3. Matches the task to a capable model available to the current account.
4. Ranks choices according to the selected mode: speed, cost, quality, code, or vision.
5. Returns the normal response and records model, usage, cost, latency, and status.

If no available model can serve the request, the API returns a clear error rather than calling a model outside the current key's access range.
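
The filter-then-rank flow described above can be pictured with a small sketch. This is not Chuizi.AI's actual implementation: the `Candidate` fields, the ranking keys, and the `NoModelAvailable` error are all assumptions made for illustration, and only three of the modes are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical catalog entry; every field and value is illustrative."""
    model_id: str
    capabilities: frozenset  # e.g. {"text", "image", "tools"}
    context_window: int      # maximum tokens accepted
    latency_ms: float        # lower is faster
    cost_per_mtok: float     # lower is cheaper
    quality: float           # higher is better


class NoModelAvailable(Exception):
    """Raised when nothing within the key's access range can serve the request."""


# One assumed ranking key per mode; min() picks the smallest key.
RANKING = {
    "auto/fast": lambda c: c.latency_ms,
    "auto/cheap": lambda c: c.cost_per_mtok,
    "auto/quality": lambda c: -c.quality,
}


def select_model(mode, needs, context_tokens, allowed_ids, catalog):
    """Return the best accessible model ID for the request, or raise."""
    # Access check, needs detection, and matching: keep only models the key
    # may use that support every required capability and the context length.
    eligible = [
        c for c in catalog
        if c.model_id in allowed_ids
        and needs <= c.capabilities
        and context_tokens <= c.context_window
    ]
    if not eligible:
        # Fail clearly instead of reaching outside the key's access range.
        raise NoModelAvailable(f"no accessible model supports {sorted(needs)}")
    # Ranking: order the survivors by the selected mode's priority.
    return min(eligible, key=RANKING[mode]).model_id
```

The point of the sketch is the ordering: access and capability filters run before any ranking, so a mode can change which eligible model wins but never widen the set of eligible models.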

Automatic mode or fixed model?

| Need | Suggested `model` value |
| --- | --- |
| Evals, compliance, or a customer-specified model | Use an exact model ID such as `openai/gpt-4.1` |
| Normal production traffic | `auto/balanced` |
| Support and realtime user experience | `auto/fast` |
| Batch summaries, classification, structured output | `auto/cheap` |
| Code generation, debugging, agent workflows | `auto/code` |
| Images, screenshots, document understanding | `auto/vision` |

Fixed models and automatic modes can coexist in the same project: pin the critical paths, and use automatic modes for everyday workloads.
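
One way to keep pinned and automatic choices side by side is a single lookup table in application code. The sketch below is only an illustration: the workload names, `model_for`, and `is_pinned` are hypothetical, while the model values come from this page.

```python
# Hypothetical workload names mapped to model values from this page.
MODEL_BY_WORKLOAD = {
    "eval": "openai/gpt-4.1",       # pinned: results must be reproducible
    "chat": "auto/balanced",
    "support": "auto/fast",
    "batch_summary": "auto/cheap",
    "coding_agent": "auto/code",
    "screenshot_qa": "auto/vision",
}

# Legacy aliases that still select an automatic mode.
LEGACY_AUTO_ALIASES = {"auto", "smart", "claude-auto"}


def model_for(workload: str) -> str:
    """Return the model value for a workload, defaulting to auto/balanced."""
    return MODEL_BY_WORKLOAD.get(workload, "auto/balanced")


def is_pinned(model: str) -> bool:
    """True when the value is an exact model ID rather than an automatic mode."""
    return not (model.startswith("auto/") or model in LEGACY_AUTO_ALIASES)
```

Keeping the mapping in one place makes it easy to audit which paths are pinned before an eval or compliance review.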

Billing and observability

Automatic modes and fixed models use the same generation records. After a request completes, query the generation record:

curl "https://api.chuizi.ai/v1/generation?id=gen-xxx" \
  -H "Authorization: Bearer $CHUIZI_API_KEY"

The record shows the final model, usage, cost, latency, status, and request ID. Public APIs return the information needed for debugging and billing review, without exposing sensitive infrastructure details.
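
The lookup can also be scripted. The following is a minimal sketch with the Python standard library; `build_generation_request` and `fetch_generation` are illustrative names, and the sketch assumes the record is returned as JSON. Field names inside the record are not assumed here, so check the Generation API reference before reading specific keys.

```python
import json
import urllib.parse
import urllib.request

GENERATION_URL = "https://api.chuizi.ai/v1/generation"


def build_generation_request(generation_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one generation record."""
    # urlencode escapes the ID so unusual characters cannot break the query.
    query = urllib.parse.urlencode({"id": generation_id})
    return urllib.request.Request(
        f"{GENERATION_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_generation(generation_id: str, api_key: str) -> dict:
    """Send the request and parse the body (assumed to be JSON).

    This performs a real network call.
    """
    request = build_generation_request(generation_id, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```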

Recommendations

- Start with auto/balanced, then split into auto/fast, auto/cheap, auto/code, or auto/vision as traffic patterns become clear.
- Use exact model IDs for work that must be reproducible, approved, or customer-specified.
- Create separate keys for production, testing, support, agent, and media workloads to keep model access and cost review clean.
- Use the model catalog to check currently available models, capabilities, and prices.
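
The separate-keys recommendation can be sketched as one environment variable per workload. The variable naming scheme (`CHUIZI_API_KEY_<WORKLOAD>`) and the `key_for` helper are assumptions for illustration; any secret store would work the same way.

```python
import os


def key_for(workload: str, env=os.environ) -> str:
    """Return the API key dedicated to a workload.

    Assumes keys live in variables named CHUIZI_API_KEY_<WORKLOAD>,
    a hypothetical convention chosen for this sketch.
    """
    name = f"CHUIZI_API_KEY_{workload.upper()}"
    try:
        return env[name]
    except KeyError:
        # Fail loudly rather than falling back to another workload's key,
        # which would blur per-key access ranges and cost review.
        raise KeyError(f"missing environment variable {name}") from None
```

Refusing to fall back to a shared key is deliberate: it keeps each workload's usage and cost attributable to its own key.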

Next steps

- Choose a Model: decide between fixed model IDs and automatic modes
- Provider Routing: understand availability protection for a selected model
- Generation API: inspect cost, usage, and status for each request