Smart Routing

Core idea

Smart Routing is a user-facing way to choose models, not a hidden internal rule set. Set the model field to auto/balanced, auto/fast, auto/cheap, auto/quality, auto/code, or auto/vision, and Chuizi.AI routes the request to a suitable model for that kind of task.

If you already know the exact model you want, keep using a concrete model ID such as openai/gpt-4.1, anthropic/claude-sonnet-4-6, or qwen/qwen3.6-plus. Fixed model IDs are not silently replaced by automatic modes.

How to call it

The OpenAI-compatible endpoint stays the same. Only the model field changes:

```bash
curl https://api.chuizi.ai/v1/chat/completions \
  -H "Authorization: Bearer $CHUIZI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto/balanced",
    "messages": [
      { "role": "user", "content": "Review this plan and point out risks." }
    ]
  }'
```

| model | Best for |
| --- | --- |
| auto/balanced | Normal production requests, balancing quality, speed, and cost |
| auto/fast | Low-latency work such as support, formatting, extraction, and short answers |
| auto/cheap | High-volume low-cost tasks such as classification, summarization, and structured output |
| auto/quality | Harder analysis, deeper reasoning, and long-context work |
| auto/code | Coding, debugging, tool calls, code review, and agent workflows |
| auto/vision | Images, screenshots, document understanding, and visual QA |

Legacy aliases such as auto, smart, and claude-auto remain compatible for existing integrations. New projects should prefer clearer names such as auto/balanced or auto/code.
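
For callers working from Python rather than curl, the same request can be sketched with the standard library alone. This is a minimal sketch, not an official client: build_chat_request and send are hypothetical helper names, and the only details taken from this page are the endpoint URL, the two headers, and the request body shape. Switching between a fixed model ID and an automatic mode changes nothing except the model argument.

```python
import json
import urllib.request

API_URL = "https://api.chuizi.ai/v1/chat/completions"


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request.

    `model` may be an automatic mode such as "auto/balanced" or an
    exact model ID such as "openai/gpt-4.1"; nothing else changes.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (needs a valid key in the CHUIZI_API_KEY environment variable):
#   import os
#   request = build_chat_request(
#       "auto/balanced",
#       "Review this plan and point out risks.",
#       os.environ["CHUIZI_API_KEY"],
#   )
#   print(send(request))
```

Keeping request construction separate from sending makes it easy to log or inspect the model value before any network call is made.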

What automatic modes do

For each automatic request, Chuizi.AI:

  1. Checks the current key, wallet, and model access range.
  2. Detects what the request needs, such as text, image input, tool calls, context length, and output budget.
  3. Matches the task to a capable model available to the current account.
  4. Ranks choices according to the selected mode: speed, cost, quality, code, or vision.
  5. Returns the normal response and records model, usage, cost, latency, and status.

If no available model can serve the request, the API returns a clear error instead of calling a model outside the current key's access range.
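
Steps 1 to 4 above amount to filtering by access and capability, then ranking by mode. The sketch below illustrates that shape only; it is not Chuizi.AI's routing implementation. The Model fields, the numeric scores, the example/* model IDs, and the ranking keys are all assumptions made up for illustration, and only three of the modes are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Hypothetical catalog entry; every field here is an assumption."""
    id: str
    capabilities: frozenset[str]  # e.g. {"text", "image", "tools"}
    context_tokens: int           # maximum context length
    latency_ms: float             # made-up typical latency
    cost_per_mtok: float          # made-up price per million tokens
    quality: float                # made-up quality score, higher is better


class NoEligibleModelError(Exception):
    """Raised when nothing inside the key's access range can serve the request."""


# Assumed ranking rule per mode: the candidate with the smallest key wins.
RANKING_KEYS = {
    "auto/fast": lambda m: m.latency_ms,
    "auto/cheap": lambda m: m.cost_per_mtok,
    "auto/quality": lambda m: -m.quality,
}

# Made-up sample data, not real models or prices.
CATALOG = [
    Model("example/small", frozenset({"text"}), 32_000, 200.0, 0.5, 0.6),
    Model("example/large", frozenset({"text", "tools"}), 200_000, 900.0, 5.0, 0.9),
    Model("example/vision", frozenset({"text", "image"}), 128_000, 600.0, 2.0, 0.8),
]


def pick_model(mode: str, needs: set[str], context_tokens: int,
               allowed_ids: set[str], catalog: list[Model]) -> Model:
    """Filter by access range and request needs, then rank by the selected mode."""
    if mode not in RANKING_KEYS:
        raise ValueError(f"mode not modeled in this sketch: {mode}")
    candidates = [
        m for m in catalog
        if m.id in allowed_ids                  # step 1: key's access range
        and needs <= m.capabilities             # step 2: required capabilities
        and context_tokens <= m.context_tokens  # step 2: context length
    ]
    if not candidates:
        # Fail rather than reach outside the key's access range.
        raise NoEligibleModelError("no accessible model can serve this request")
    return min(candidates, key=RANKING_KEYS[mode])  # steps 3-4: rank by mode
```

With the sample catalog and all three IDs allowed, a text-only request in auto/fast selects example/small, while the same request in auto/quality selects example/large. Restricting allowed_ids to example/small and asking for image input raises NoEligibleModelError, which mirrors the error behavior described above.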

Automatic mode or fixed model?

| Need | Suggested model value |
| --- | --- |
| Evals, compliance, or a customer-specified model | Use an exact model ID such as openai/gpt-4.1 |
| Normal production traffic | auto/balanced |
| Support and realtime user experience | auto/fast |
| Batch summaries, classification, structured output | auto/cheap |
| Code generation, debugging, agent workflows | auto/code |
| Images, screenshots, document understanding | auto/vision |

Fixed models and automatic modes can coexist in the same project: pin the critical paths, and use automatic modes for everyday workloads.
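
One way to apply this in application code is a small lookup that pins critical paths and sends everything else to an automatic mode. The workload labels below (eval, production, and so on) are hypothetical names chosen for this sketch; the model values are the ones listed on this page, and falling back to auto/balanced for unknown workloads is an assumption.

```python
# Critical paths pinned to an exact model ID (label is hypothetical).
PINNED_MODELS = {
    "eval": "openai/gpt-4.1",
}

# Everyday workloads routed through automatic modes (labels are hypothetical).
AUTO_MODES = {
    "production": "auto/balanced",
    "support": "auto/fast",
    "batch": "auto/cheap",
    "code": "auto/code",
    "vision": "auto/vision",
}


def model_value(workload: str) -> str:
    """Return the model value to send for a given workload label."""
    if workload in PINNED_MODELS:
        return PINNED_MODELS[workload]
    # Assumed default for unlabeled traffic.
    return AUTO_MODES.get(workload, "auto/balanced")
```

Keeping the pinned entries in their own table makes it obvious during review which paths are expected to stay on a fixed model.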

Billing and observability

Automatic modes and fixed models use the same generation records. After a request completes, query the generation record:

```bash
curl "https://api.chuizi.ai/v1/generation?id=gen-xxx" \
  -H "Authorization: Bearer $CHUIZI_API_KEY"
```

The record shows the final model, usage, cost, latency, status, and request ID. Public APIs return the information needed for debugging and billing review, without exposing sensitive infrastructure details.
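
The generation query can also be issued from Python. This is a minimal sketch using only the standard library; build_generation_request and fetch_generation are hypothetical helper names, and the response is returned as parsed JSON without assuming any particular field names.

```python
import json
import urllib.parse
import urllib.request

GENERATION_URL = "https://api.chuizi.ai/v1/generation"


def build_generation_request(generation_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one generation record."""
    query = urllib.parse.urlencode({"id": generation_id})
    return urllib.request.Request(
        f"{GENERATION_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_generation(generation_id: str, api_key: str, timeout: float = 30.0):
    """Fetch the generation record and return it as parsed JSON."""
    request = build_generation_request(generation_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (needs a valid key and a real generation ID):
#   import os
#   record = fetch_generation("gen-xxx", os.environ["CHUIZI_API_KEY"])
#   print(record)
```

Because automatic modes and fixed models share the same records, the same helper works regardless of which model value the original request used.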

Recommendations

  • Start with auto/balanced, then split into auto/fast, auto/cheap, auto/code, or auto/vision as traffic patterns become clear.
  • Use exact model IDs for work that must be reproducible, approved, or customer-specified.
  • Create separate keys for production, testing, support, agent, and media workloads to keep model access and cost review clean.
  • Use the model catalog to check currently available models, capabilities, and prices.
