Choose a Model

Pick the right model for your task, budget, and latency requirements from 220 options across 18 providers.

Model Naming

All models use the provider/model format:

anthropic/claude-sonnet-4-6
openai/gpt-4.1
google/gemini-2.5-pro
deepseek/deepseek-chat

Bare model names (e.g., gpt-4.1) are accepted as aliases and resolve to that model's default provider (gpt-4.1 becomes openai/gpt-4.1).
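
As an illustration of the naming rule, the sketch below splits a provider/model string and resolves a bare alias. The `DEFAULT_PROVIDERS` table and the `parse_model` helper are hypothetical names used only for this example; the real alias resolution happens on the Chuizi.AI side and may cover more models or behave differently.

```python
from typing import NamedTuple

# Hypothetical alias table for illustration; the real mapping is server-side.
DEFAULT_PROVIDERS = {
    "gpt-4.1": "openai",
    "claude-sonnet-4-6": "anthropic",
    "gemini-2.5-pro": "google",
    "deepseek-chat": "deepseek",
}


class ModelRef(NamedTuple):
    provider: str
    model: str


def parse_model(name: str) -> ModelRef:
    """Split 'provider/model', or look up the provider for a bare alias."""
    provider, sep, model = name.partition("/")
    if sep:
        if not provider or not model:
            raise ValueError(f"malformed model name: {name!r}")
        return ModelRef(provider, model)
    if name not in DEFAULT_PROVIDERS:
        raise ValueError(f"unknown bare model name: {name!r}")
    return ModelRef(DEFAULT_PROVIDERS[name], name)


print(parse_model("anthropic/claude-sonnet-4-6"))
print(parse_model("gpt-4.1"))
```

Using the fully qualified form in application code avoids depending on how aliases resolve.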

By Use Case

General Chat and Assistants

Model	Strength	Speed	Cost
`anthropic/claude-sonnet-4-6`	Best overall balance	Fast	Medium
`openai/gpt-4.1`	Strong general purpose	Fast	Medium
`google/gemini-2.5-pro`	Long context (1M tokens)	Medium	Medium
`deepseek/deepseek-chat`	Cost-effective	Fast	Low

Complex Reasoning

Model	Strength	Speed	Cost
`anthropic/claude-opus-4-6`	Deep analysis, research	Slow	High
`openai/o3`	Math, logic, science	Slow	High
`openai/o4-mini`	Reasoning at lower cost	Medium	Medium
`google/gemini-2.5-pro`	Long-context reasoning	Medium	Medium

Fast and Cheap

Model	Strength	Speed	Cost
`anthropic/claude-haiku-4-5`	Fastest Claude	Very fast	Low
`openai/gpt-4.1-nano`	Cheapest OpenAI	Very fast	Very low
`google/gemini-2.5-flash`	Fast with thinking	Very fast	Low
`deepseek/deepseek-chat`	High quality for price	Fast	Low

Coding

Model	Strength	Speed	Cost
`anthropic/claude-sonnet-4-6`	Best for code generation	Fast	Medium
`openai/gpt-5-codex`	Optimized for code	Medium	Medium
`alibaba/qwen3-coder-next`	Strong coding with Chinese and English prompts	Fast	Low
`deepseek/deepseek-chat`	Competitive coding	Fast	Low

Chinese Language

Model	Strength	Speed	Cost
`alibaba/qwen-3.6-plus`	Best Chinese understanding	Fast	Low
`zhipu/glm-5.1`	Strong Chinese reasoning	Fast	Low
`deepseek/deepseek-chat`	Bilingual excellence	Fast	Low
`moonshot/kimi-k2.5`	Chinese conversation	Fast	Low

Image Generation

Model	Endpoint	Strength
`google/imagen-4.0`	`/v1/images/generations`	Photorealistic
`azure/mai-image-2`	`/v1/images/generations`	Creative styles
`alibaba/wan-2.7`	`/v1/images/generations`	Artistic generation
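
A minimal sketch of a request to the `/v1/images/generations` endpoint listed above. It assumes the endpoint accepts an OpenAI-style JSON body (`model`, `prompt`, `n`, `size`) and Bearer authentication; that is an assumption, not something this page confirms. `BASE_URL` is a placeholder and `CHUIZI_API_KEY` is a hypothetical environment variable name. The code only builds the request and does not send it.

```python
import json
import os
import urllib.request

# Placeholder host; replace with the real API base URL from your account.
BASE_URL = "https://api.example.com"


def build_image_request(
    model: str, prompt: str, size: str = "1024x1024"
) -> urllib.request.Request:
    """Build (but do not send) a POST request for one generated image."""
    # Field names follow the OpenAI Images convention (assumed, see above).
    payload = {"model": model, "prompt": prompt, "n": 1, "size": size}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Hypothetical environment variable name for the API key.
            "Authorization": f"Bearer {os.environ.get('CHUIZI_API_KEY', '')}",
        },
        method="POST",
    )


req = build_image_request("google/imagen-4.0", "A lighthouse at dusk")
print(req.get_method(), req.full_url)
```

Once the base URL and key are real, the request can be sent with `urllib.request.urlopen(req)`.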

Embeddings

Model	Dimensions	Use Case
`openai/text-embedding-3-large`	3072	Highest accuracy
`openai/text-embedding-3-small`	1536	Good balance
`google/gemini-embedding-001`	3072	Long text
`cohere/embed-v4`	1024	Multilingual
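
Dimension count drives storage cost, which is worth weighing against accuracy. The sketch below estimates raw vector storage for each model in the table, assuming vectors are stored as 32-bit floats with no index overhead or compression; `raw_storage_mb` is a hypothetical helper name.

```python
# Dimensions copied from the table above.
EMBEDDING_DIMS = {
    "openai/text-embedding-3-large": 3072,
    "openai/text-embedding-3-small": 1536,
    "google/gemini-embedding-001": 3072,
    "cohere/embed-v4": 1024,
}

BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as float32


def raw_storage_mb(model: str, num_vectors: int) -> float:
    """Raw vector storage in megabytes (10**6 bytes), excluding index overhead."""
    return EMBEDDING_DIMS[model] * BYTES_PER_FLOAT32 * num_vectors / 1_000_000


for name in EMBEDDING_DIMS:
    print(f"{name}: {raw_storage_mb(name, 1_000_000):,.0f} MB per 1M vectors")
```

Under these assumptions, one million 3072-dimension vectors take about 12.3 GB, versus about 4.1 GB at 1024 dimensions.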

Auto Routing

Set `model` to `auto` and Chuizi.AI selects a model for each request based on its characteristics: input length, language, task type, and budget.

example.py

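# Assumes `client` is an OpenAI-compatible SDK client that is already
# configured with your Chuizi.AI base URL and API key.
response = client.chat.completions.create(
    model="auto",  # let Chuizi.AI choose the model for this request
    messages=[{"role": "user", "content": "Explain quantum computing"}],
)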
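
If you would rather pin the model yourself, a client-side lookup built from the use-case tables is one option. This is only a sketch: the `RECOMMENDED` mapping, the use-case keys, and `pick_model` are hypothetical, the picks are simply one row taken from each table, and none of it reflects how auto routing works internally.

```python
# One "default" and one lower-cost pick per use case, taken from the tables
# in this guide. The keys and structure are illustrative only.
RECOMMENDED = {
    "chat": ("anthropic/claude-sonnet-4-6", "deepseek/deepseek-chat"),
    "reasoning": ("anthropic/claude-opus-4-6", "openai/o4-mini"),
    "fast": ("anthropic/claude-haiku-4-5", "openai/gpt-4.1-nano"),
    "coding": ("anthropic/claude-sonnet-4-6", "deepseek/deepseek-chat"),
}


def pick_model(use_case: str, low_cost: bool = False) -> str:
    """Return a pinned model name, or "auto" for an unknown use case."""
    if use_case not in RECOMMENDED:
        return "auto"  # defer to server-side routing
    default, cheaper = RECOMMENDED[use_case]
    return cheaper if low_cost else default


print(pick_model("coding"))
print(pick_model("reasoning", low_cost=True))
print(pick_model("translation"))
```

Pinning gives reproducible behavior and predictable cost; `auto` trades that for per-request selection.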

Cost Comparison

All prices are the upstream provider's price × 1.05, that is, a 5% markup. Current input and output prices per million tokens for every model are listed at chuizi.ai/models.
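
The pricing rule can be turned into a quick estimate. In the sketch below, `estimate_cost_usd` is a hypothetical helper and the upstream prices in the usage line are made-up numbers for illustration, not real rates; look up actual prices on the models page.

```python
MARKUP = 1.05  # upstream cost x 1.05, as stated above


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    upstream_input_per_m: float,
    upstream_output_per_m: float,
) -> float:
    """Estimate the billed cost of one request in USD.

    Prices are upstream USD per 1M tokens, before the markup.
    """
    upstream = (
        input_tokens / 1_000_000 * upstream_input_per_m
        + output_tokens / 1_000_000 * upstream_output_per_m
    )
    return upstream * MARKUP


# Illustrative prices only: $2.00 input and $8.00 output per 1M tokens.
cost = estimate_cost_usd(12_000, 800, 2.00, 8.00)
print(f"${cost:.4f}")
```

With these illustrative prices, 12,000 input tokens and 800 output tokens cost $0.0304 upstream, or about $0.0319 after the markup.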

As a rough guide:

Tier	Input price (USD per 1M tokens)	Example models
Budget	< $0.50	GPT-4.1-nano, Gemini Flash, Haiku, DeepSeek
Mid-range	$0.50 - $5	Sonnet, GPT-4.1, Gemini Pro
Premium	> $5	Opus, o3
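
The tier boundaries above can be applied to any price taken from the models page. In this sketch `price_tier` is a hypothetical helper and the sample prices are illustrative, not real rates.

```python
def price_tier(input_price_per_m: float) -> str:
    """Classify an input price (USD per 1M tokens) using the table's bounds."""
    if input_price_per_m < 0.50:
        return "Budget"
    if input_price_per_m <= 5.00:
        return "Mid-range"
    return "Premium"


# Illustrative prices only.
for price in (0.10, 0.50, 3.00, 15.00):
    print(f"${price:.2f} -> {price_tier(price)}")
```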

Full Model List

Browse all available models with live pricing, context window sizes, and capability tags at chuizi.ai/models.

Next Steps

Migration Guide — switch from your current provider
Cost Optimization — reduce costs with caching and model selection
Prompt Caching — save up to 90% on repeated prompts