Troubleshooting FAQ
How do I fix 429 Too Many Requests?
A 429 means you hit the rate limit. Chuizi.AI uses sliding window rate limiting with these defaults:
- Global — 60 RPM (60 requests per minute)
- Per model — 30 RPM (30 requests per minute)
How to fix it:
- Check the
Retry-Afterheader in the response and wait the specified number of seconds - Implement an exponential backoff strategy
- If you need higher limits, adjust
rpm_limitin the API Key settings in your Dashboard - Spread requests across time windows to avoid bursts
What should I do about request timeouts?
Timeouts typically have a few causes:
- Slow model processing — large models (o3, Opus) can take significant time on complex inputs. Set a longer
timeoutvalue, or usestream: truefor incremental output - Network issues — check your connectivity to
api.chuizi.ai - Upstream provider congestion — upstream responses may slow down during peak hours. Chuizi.AI automatically tries backup channels
Recommended settings:
timeout: 120000 // Set 2+ minutes for reasoning models stream: true // Streaming gives you the first token faster
Why am I getting empty responses?
Common causes of empty responses:
max_tokensset too low — the model does not have enough space to generate content. Increasemax_tokens(at least 1024 recommended)stopparameter triggers too early — check if your stop sequences conflict with expected output- Content safety filter — the model determined the output might violate safety policies and chose not to generate. Modify your prompt
- Try
stream: true— streaming mode lets you see if the model produced any partial output
How do I fix 401 Unauthorized?
Troubleshooting steps for authentication failures:
- Check API key format — it must start with
ck- - Check header format — OpenAI protocol uses
Authorization: Bearer ck-xxx, Anthropic protocol usesx-api-key: ck-xxx - Check key status — log in to the Dashboard and verify the key is Active
- Check expiration — if an expiry date was set, verify it has not passed
- Check for copy errors — watch for extra spaces or newline characters
How do I fix 402 Payment Required?
A 402 means insufficient balance. Possible causes:
- Balance depleted — log in to the Dashboard to check your current balance, then go to the Billing page to top up
- Pre-deduction holding balance — in-flight requests are holding part of your balance. Wait for reconciliation to complete and the balance will be released
- Daily limit reached — the API key has a
daily_limitset. Wait until the UTC midnight reset, or increase the limit
How do I fix 403 Forbidden?
A 403 typically means your API key does not have permission for the requested model. Check the following:
- allowed_models — in the Dashboard, check the key's allowed models list. If a whitelist is configured, only listed models are accessible
- IP whitelist — if the key has an IP whitelist, make sure your request originates from a listed IP
- Model exists — verify the model name is spelled correctly. Use
GET /v1/modelsto see all available models
What do 502/503 upstream errors mean?
502 and 503 indicate the upstream provider is temporarily experiencing issues. Chuizi.AI handles these automatically:
- Automatic failover — if the model has multiple provider channels configured, the gateway switches to a backup channel
- Circuit breaker — channels with consecutive failures are temporarily disabled to avoid sending more requests to failing nodes
What you should do:
- Wait a few seconds and retry (upstream outages typically resolve within minutes)
- Check the Chuizi.AI status page for any announcements
- Try an alternate provider version of the same model (e.g.,
openai/gpt-4oinstead ofopenai/gpt-4o)
Why is the model output truncated?
Truncated output is usually caused by the max_tokens parameter limit. Different models have different default max_tokens values, and some are conservative.
How to fix it:
{ "model": "anthropic/claude-sonnet-4-6", "max_tokens": 4096, "messages": [...] }
Note that max_tokens cannot exceed the model's maximum output limit. For example, Claude models support up to 64K output tokens.
What should I do if streaming output is interrupted?
Common causes and solutions for streaming interruptions:
- Unstable network — use a more stable connection, or increase the client connection timeout
- Proxy/firewall interference — some corporate proxies buffer or terminate SSE connections. Verify your proxy configuration allows long-lived connections
- Client not parsing SSE correctly — make sure your code properly handles the
text/event-streamformat - Implement reconnection — add automatic reconnection logic to your client that detects disconnects and re-sends the request
What if image generation fails?
Common causes of image generation failure:
- Prompt triggers safety filter — modify your prompt to remove content that may be flagged
- Invalid parameters — check that
size,quality,n, and other parameters are within the model's supported range - Unsupported operation — not all image models support inpainting or variations. Verify the endpoint and model match your intended operation
- Insufficient balance — image generation is billed per request and individual costs can be higher. Make sure your balance is sufficient
Next Steps
- Error Codes — Full error code reference with detailed troubleshooting
- Rate Limits — Understand the three-tier rate limiting system
- Error Handling Guide — Implement retries and backoff in your code