Question 1

How do I fix 429 Too Many Requests?

Accepted Answer

A 429 means you hit the rate limit. Chuizi.AI uses sliding window rate limiting with these defaults:

Global — 60 RPM (60 requests per minute)
Per model — 30 RPM (30 requests per minute)

How to fix it:

Check the Retry-After header in the response and wait the specified number of seconds
Implement an exponential backoff strategy
If you need higher limits, adjust rpm_limit in the API Key settings in your Dashboard
Spread requests across time windows to avoid bursts
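
The first two steps above can be combined in one retry helper. This is a minimal sketch, not an official client: `send` is a hypothetical callable returning `(status, headers, body)`, and the `base` and `cap` values are illustrative assumptions. It honors Retry-After when the header is a plain number of seconds and otherwise falls back to capped exponential backoff with jitter.

```python
import random
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Return the seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Prefer the server's hint when it is a plain number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # e.g. an HTTP-date value; fall back to computed backoff
    # Exponential growth, capped, with full jitter to de-synchronize clients.
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body).
    `headers` is a plain dict here; real HTTP libraries usually expose a
    case-insensitive mapping.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    return status, body
```

Passing `sleep` and `rng` as parameters keeps the helper easy to test without real waiting.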

Question 2

What should I do about request timeouts?

Accepted Answer

Timeouts typically have a few causes:

Slow model processing — large models (o3, Opus) can take significant time on complex inputs. Set a longer timeout value, or use stream: true for incremental output
Network issues — check your connectivity to api.chuizi.ai
Upstream provider congestion — upstream responses may slow down during peak hours. Chuizi.AI automatically tries backup channels

Recommended settings:

```
timeout: 120000  // In milliseconds (120000 ms = 2 minutes); set 2+ minutes for reasoning models
stream: true     // Streaming gives you the first token faster
```

Question 3

Why am I getting empty responses?

Accepted Answer

Common causes of empty responses:

max_tokens set too low — the model does not have enough space to generate content. Increase max_tokens (at least 1024 recommended)
stop parameter triggers too early — check if your stop sequences conflict with expected output
Content safety filter — the model determined the output might violate safety policies and chose not to generate. Modify your prompt
Try stream: true — streaming mode lets you see if the model produced any partial output
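
To tell these causes apart, it can help to inspect the finish reason on the response. The sketch below assumes an OpenAI-style chat completion body (`choices[0].message.content` and `choices[0].finish_reason`); whether the gateway returns these fields unchanged for every model is an assumption, and `diagnose_empty` is a hypothetical helper name.

```python
def diagnose_empty(response):
    """Map an OpenAI-style chat completion dict to a likely cause of empty output."""
    choice = response["choices"][0]
    content = (choice.get("message") or {}).get("content") or ""
    if content.strip():
        return "ok"
    reason = choice.get("finish_reason")
    if reason == "length":
        return "max_tokens too low"
    if reason == "content_filter":
        return "blocked by content filter"
    if reason == "stop":
        return "stopped immediately; check stop sequences"
    return "unknown (finish_reason=%r)" % reason
```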

Question 4

How do I fix 401 Unauthorized?

Accepted Answer

Troubleshooting steps for authentication failures:

Check API key format — it must start with ck-
Check header format — OpenAI protocol uses Authorization: Bearer ck-xxx, Anthropic protocol uses x-api-key: ck-xxx
Check key status — log in to the Dashboard and verify the key is Active
Check expiration — if an expiry date was set, verify it has not passed
Check for copy errors — watch for extra spaces or newline characters
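
The format checks above can be done in code before a request is sent. This is a small sketch; `build_auth_headers` is a hypothetical helper, and it only covers the key prefix, stray whitespace, and the two header formats listed above. Key status and expiration still have to be checked in the Dashboard.

```python
def build_auth_headers(api_key, protocol="openai"):
    """Validate the key format and return the auth header for the given protocol."""
    key = api_key.strip()  # drop spaces or newlines picked up when copying
    if not key.startswith("ck-"):
        raise ValueError("API key must start with 'ck-'")
    if any(ch.isspace() for ch in key):
        raise ValueError("API key contains whitespace")
    if protocol == "openai":
        return {"Authorization": "Bearer " + key}
    if protocol == "anthropic":
        return {"x-api-key": key}
    raise ValueError("unknown protocol: %r" % protocol)
```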

Question 5

How do I fix 402 Payment Required?

Accepted Answer

A 402 means insufficient balance. Possible causes:

Balance depleted — log in to the Dashboard to check your current balance, then go to the Billing page to top up
Pre-deduction holding balance — in-flight requests are holding part of your balance. Wait for reconciliation to complete and the balance will be released
Daily limit reached — the API key has a daily_limit set. Wait until the UTC midnight reset, or increase the limit

Question 6

How do I fix 403 Forbidden?

Accepted Answer

A 403 typically means your API key does not have permission for the requested model. Check the following:

allowed_models — in the Dashboard, check the key's allowed models list. If a whitelist is configured, only listed models are accessible
IP whitelist — if the key has an IP whitelist, make sure your request originates from a listed IP
Model exists — verify the model name is spelled correctly. Use GET /v1/models to see all available models

Question 7

What do 502/503 upstream errors mean?

Accepted Answer

502 and 503 indicate the upstream provider is temporarily experiencing issues. Chuizi.AI handles these automatically:

Automatic failover — if the model has multiple provider channels configured, the gateway switches to a backup channel
Circuit breaker — channels with consecutive failures are temporarily disabled to avoid sending more requests to failing nodes
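
For readers unfamiliar with these two mechanisms, here is a simplified illustration of how failover and a circuit breaker can work together. It is not Chuizi.AI's implementation: the class, the `threshold` and `cooldown` values, and the channel names are all hypothetical.

```python
import time


class CircuitBreaker:
    """Disable a channel after `threshold` consecutive failures for `cooldown` seconds."""

    def __init__(self, threshold=3, cooldown=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the channel is usable

    def available(self):
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown:
            # Cooldown elapsed: reset and let traffic through again.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record(self, ok):
        if ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()


def pick_channel(channels, breakers):
    """Return the first channel, in priority order, whose breaker is not open."""
    for name in channels:
        if breakers[name].available():
            return name
    return None
```

Production breakers usually add a half-open state that admits a single trial request after the cooldown; this sketch simply resets.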

What you should do:

Wait a few seconds and retry (upstream outages typically resolve within minutes)
Check the Chuizi.AI status page for any announcements
Try the same model through a different provider prefix, if GET /v1/models lists one (for example, a provider other than openai/ for gpt-4o)

Question 8

Why is the model output truncated?

Accepted Answer

Truncated output is usually caused by the max_tokens parameter limit. Different models have different default max_tokens values, and some are conservative.

How to fix it:

```json
{
  "model": "anthropic/claude-sonnet-4-6",
  "max_tokens": 4096,
  "messages": [...]
}
```

Note that `max_tokens` cannot exceed the model's maximum output limit. For example, Claude models support up to 64K output tokens.
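
To confirm that truncation was caused by the token limit rather than something else, check the stop reason in the response. The sketch below assumes the standard field names of the two protocols (`stop_reason: "max_tokens"` in Anthropic-style responses, `finish_reason: "length"` in OpenAI-style responses); `is_truncated` is a hypothetical helper name.

```python
def is_truncated(response):
    """Return True if the response dict indicates the output hit the token limit."""
    # Anthropic-style message response
    if response.get("stop_reason") == "max_tokens":
        return True
    # OpenAI-style chat completion response
    choices = response.get("choices") or []
    return any(c.get("finish_reason") == "length" for c in choices)
```

If this returns True, raise `max_tokens` and resend the request.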

Question 9

What should I do if streaming output is interrupted?

Accepted Answer

Common causes and solutions for streaming interruptions:

Unstable network — use a more stable connection, or increase the client connection timeout
Proxy/firewall interference — some corporate proxies buffer or terminate SSE connections. Verify your proxy configuration allows long-lived connections
Client not parsing SSE correctly — make sure your code properly handles the text/event-stream format
Implement reconnection — add automatic reconnection logic to your client that detects disconnects and re-sends the request
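
For the SSE parsing point, a minimal parser might look like the sketch below. It takes any iterable of text lines, so it is independent of the HTTP library. The chunk shape (`choices[].delta.content`) and the `[DONE]` sentinel follow the OpenAI streaming convention and are assumptions about what the gateway sends; both function names are hypothetical.

```python
import json


def iter_sse_data(lines):
    """Yield the data payload of each complete SSE event from an iterable of lines."""
    buf = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buf:
                yield "\n".join(buf)
                buf = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            buf.append(value[1:] if value.startswith(" ") else value)
    # An event cut off before its blank line is dropped, as the SSE format specifies.


def collect_text(lines):
    """Concatenate streamed content deltas until the [DONE] sentinel."""
    parts = []
    for data in iter_sse_data(lines):
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```

If the stream ends without the sentinel, the connection was probably interrupted, which is the point at which a client would re-send the request.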

Question 10

What if image generation fails?

Accepted Answer

Common causes of image generation failure:

Prompt triggers safety filter — modify your prompt to remove content that may be flagged
Invalid parameters — check that size, quality, n, and other parameters are within the model's supported range
Unsupported operation — not all image models support inpainting or variations. Verify the endpoint and model match your intended operation
Insufficient balance — image generation is billed per request and individual costs can be higher. Make sure your balance is sufficient

Troubleshooting FAQ