Troubleshooting FAQ

How do I fix 429 Too Many Requests?

A 429 means you hit the rate limit. Chuizi.AI uses sliding window rate limiting with these defaults:

  • Global — 60 RPM (60 requests per minute)
  • Per model — 30 RPM (30 requests per minute)

How to fix it:

  1. Check the Retry-After header in the response and wait the specified number of seconds
  2. Implement an exponential backoff strategy
  3. If you need higher limits, adjust rpm_limit in the API Key settings in your Dashboard
  4. Spread requests across time windows to avoid bursts
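
Steps 1 and 2 can be combined in a small retry helper. The following is a minimal sketch, not an official client: `send` is a hypothetical callable that performs one request and returns `(status_code, headers, body)`, and the `base` and `cap` values are illustrative assumptions. It waits for the Retry-After value when the header holds a number of seconds and otherwise falls back to exponential backoff with jitter.

```python
import random
import time


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retrying after failed attempt number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Retry-After given as a number of seconds: follow the server's hint.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Not a plain number; fall back to backoff below.
    # Exponential backoff with full jitter: a random wait in [0, min(cap, base * 2**attempt)].
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers_dict, body).
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(attempt, headers.get("Retry-After")))
    return status, body
```

The jitter keeps several clients that were throttled at the same moment from retrying in lockstep, which also helps with step 4. The header lookup assumes a plain dict keyed as `Retry-After`; adapt it if your HTTP client normalizes header names differently.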

What should I do about request timeouts?

Timeouts typically have a few causes:

  • Slow model processing — large models (o3, Opus) can take significant time on complex inputs. Set a longer timeout value, or use stream: true for incremental output
  • Network issues — check your connectivity to api.chuizi.ai
  • Upstream provider congestion — upstream responses may slow down during peak hours. Chuizi.AI automatically tries backup channels

Recommended settings:

timeout: 120000  // Milliseconds; allow 2 minutes or more for reasoning models
stream: true     // Streaming delivers the first token sooner
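
As an illustration, the same two settings might look like this with Python's standard library. This is a sketch under assumptions: the `/v1/chat/completions` path is assumed from the OpenAI-compatible protocol and is not stated on this page, and `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed endpoint: an OpenAI-compatible chat path on the host named above.
CHAT_URL = "https://api.chuizi.ai/v1/chat/completions"

# urllib expects seconds, so 120000 ms becomes 120.
TIMEOUT_SECONDS = 120


def build_chat_request(api_key, model, messages, stream=True):
    """Build (but do not send) a chat request with streaming enabled by default."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(req, timeout=TIMEOUT_SECONDS)` and iterate over the response to read lines as they arrive. Note that urllib applies the timeout to each blocking socket operation rather than to the request as a whole, so a streaming response that keeps producing data will not be cut off at the two-minute mark.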

Why am I getting empty responses?

Common causes of empty responses:

  1. max_tokens set too low — the model does not have enough space to generate content. Increase max_tokens (at least 1024 recommended)
  2. stop parameter triggers too early — check if your stop sequences conflict with expected output
  3. Content safety filter — the model determined the output might violate safety policies and chose not to generate. Modify your prompt
  4. Cause still unclear — set stream: true. Streaming mode shows whether the model produced any partial output before stopping
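
The causes above can often be told apart by inspecting the response itself. The sketch below assumes an OpenAI-format chat completion body (a `choices` list whose entries carry `finish_reason` and `message.content`); responses in other protocols use different field names, and `diagnose_empty_response` is a hypothetical helper.

```python
def diagnose_empty_response(response):
    """Return a short hint for why an OpenAI-style chat response has no text."""
    choices = response.get("choices") or []
    if not choices:
        return "no choices returned; inspect the raw response body"
    choice = choices[0]
    content = (choice.get("message") or {}).get("content") or ""
    if content.strip():
        return "response is not empty"
    reason = choice.get("finish_reason")
    if reason == "length":
        return "max_tokens exhausted before any text was produced; raise max_tokens"
    if reason == "content_filter":
        return "output withheld by a content safety filter; revise the prompt"
    if reason == "stop":
        return "generation stopped immediately; check your stop sequences"
    return f"unrecognized finish_reason {reason!r}; retry with stream: true"
```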

How do I fix 401 Unauthorized?

Troubleshooting steps for authentication failures:

  1. Check API key format — it must start with ck-
  2. Check header format — OpenAI protocol uses Authorization: Bearer ck-xxx, Anthropic protocol uses x-api-key: ck-xxx
  3. Check key status — log in to the Dashboard and verify the key is Active
  4. Check expiration — if an expiry date was set, verify it has not passed
  5. Check for copy errors — watch for extra spaces or newline characters
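
Checks 1, 2, and 5 are easy to automate before a request is sent. A minimal sketch follows; `check_api_key` and `auth_headers` are hypothetical helper names, and only the `ck-` prefix and the two header formats come from the steps above.

```python
def check_api_key(raw_key):
    """Return a list of problems found in the key string (empty if none)."""
    problems = []
    if raw_key != raw_key.strip():
        problems.append("leading or trailing whitespace or newline")
    key = raw_key.strip()
    if any(ch.isspace() for ch in key):
        problems.append("whitespace inside the key")
    if not key.startswith("ck-"):
        problems.append("key does not start with ck-")
    return problems


def auth_headers(key, protocol="openai"):
    """Build the authentication header for the chosen protocol."""
    if protocol == "openai":
        return {"Authorization": f"Bearer {key}"}
    if protocol == "anthropic":
        return {"x-api-key": key}
    raise ValueError(f"unknown protocol: {protocol}")
```

Key status and expiration (checks 3 and 4) cannot be verified locally and still need a look at the Dashboard.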

How do I fix 402 Payment Required?

A 402 means insufficient balance. Possible causes:

  • Balance depleted — log in to the Dashboard to check your current balance, then go to the Billing page to top up
  • Balance held by pre-deduction — in-flight requests reserve part of your balance. The held amount is released once reconciliation completes
  • Daily limit reached — the API key has a daily_limit set. Wait until the UTC midnight reset, or increase the limit

How do I fix 403 Forbidden?

A 403 typically means your API key is not permitted to make this request, most often because it lacks access to the requested model. Check the following:

  1. allowed_models — in the Dashboard, check the key's allowed models list. If a whitelist is configured, only listed models are accessible
  2. IP whitelist — if the key has an IP whitelist, make sure your request originates from a listed IP
  3. Model exists — verify the model name is spelled correctly. Use GET /v1/models to see all available models
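
For check 3, a misspelled model name can be caught by comparing it with the list returned by GET /v1/models. The sketch below assumes that endpoint returns the OpenAI-style shape `{"data": [{"id": ...}, ...]}`, which this page does not confirm; `check_model_name` is a hypothetical helper.

```python
import difflib


def check_model_name(requested, models_response):
    """Compare a model name against an OpenAI-style model list.

    Returns (is_listed, suggestions), where suggestions are close spellings.
    """
    ids = [m["id"] for m in models_response.get("data", [])]
    if requested in ids:
        return True, []
    return False, difflib.get_close_matches(requested, ids, n=3, cutoff=0.6)
```

A name that appears in the list can still return 403 if the key's allowed_models list excludes it, so this only rules out spelling mistakes.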

What do 502/503 upstream errors mean?

502 and 503 indicate the upstream provider is temporarily experiencing issues. Chuizi.AI handles these automatically:

  • Automatic failover — if the model has multiple provider channels configured, the gateway switches to a backup channel
  • Circuit breaker — channels with consecutive failures are temporarily disabled to avoid sending more requests to failing nodes

What you should do:

  1. Wait a few seconds and retry (upstream outages typically resolve within minutes)
  2. Check the Chuizi.AI status page for any announcements
  3. Try an alternate provider version of the same model, if one exists (e.g., a different provider route for gpt-4o than openai/gpt-4o). Use GET /v1/models to see the available names
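
Steps 1 and 3 can be sketched as a client-side loop that retries briefly and then moves to the next model name. Everything here is illustrative: `send` is a hypothetical callable that performs one request for a given model and returns `(status_code, body)`, and the retry count and wait time are assumptions rather than recommended values.

```python
import time


def call_with_fallback(send, models, retries_per_model=2, wait_seconds=2.0, sleep=time.sleep):
    """Try each model name in order, retrying briefly on 502/503.

    `send(model)` is a hypothetical callable returning (status_code, body).
    Returns (model_used, status_code, body); model_used is None if every attempt failed.
    """
    status, body = None, None
    for model in models:
        for attempt in range(retries_per_model):
            status, body = send(model)
            if status not in (502, 503):
                return model, status, body
            if attempt < retries_per_model - 1:
                sleep(wait_seconds)  # Give the upstream a moment before retrying.
    return None, status, body
```

Because the gateway already performs its own failover, a short client-side loop like this is only a second line of defense; keep the retry counts small to avoid adding load during an outage.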

Why is the model output truncated?

Truncated output is usually caused by the max_tokens limit. Default max_tokens values differ by model, and some defaults are low.

How to fix it: set max_tokens explicitly in the request body.

{
  "model": "anthropic/claude-sonnet-4-6",
  "max_tokens": 4096,
  "messages": [...]
}

Note that max_tokens cannot exceed the model's maximum output limit. For example, some Claude models support up to 64K output tokens; check the limit for the specific model you call.
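
Truncation can also be detected programmatically and answered with a larger limit. This sketch assumes an OpenAI-format response, where a cut-off completion reports `finish_reason` equal to `"length"`; the Anthropic protocol reports this through a different field. `next_max_tokens` is a hypothetical helper, and `model_output_limit` must be supplied by the caller from the model's documentation.

```python
def next_max_tokens(response, current_max_tokens, model_output_limit):
    """Return a larger max_tokens if the response was cut off, else None.

    None means either the output was not truncated or the limit is already reached.
    """
    choices = response.get("choices") or []
    truncated = bool(choices) and choices[0].get("finish_reason") == "length"
    if not truncated or current_max_tokens >= model_output_limit:
        return None
    # Double the budget, but never exceed the model's maximum output limit.
    return min(current_max_tokens * 2, model_output_limit)
```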

What should I do if streaming output is interrupted?

Common causes and solutions for streaming interruptions:

  1. Unstable network — use a more stable connection, or increase the client connection timeout
  2. Proxy/firewall interference — some corporate proxies buffer or terminate SSE connections. Verify your proxy configuration allows long-lived connections
  3. Client not parsing SSE correctly — make sure your code properly handles the text/event-stream format
  4. Implement reconnection — add automatic reconnection logic to your client that detects disconnects and re-sends the request
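
For items 3 and 4, the sketch below parses text/event-stream lines and reports whether the stream finished cleanly. The SSE framing (events separated by blank lines, `data:` fields, `:` comment lines) follows the standard format; the chunk shape (`choices[].delta.content`) and the final `[DONE]` marker are assumed from the OpenAI streaming convention and are not confirmed on this page.

```python
import json


def iter_sse_data(lines):
    """Yield the data payload of each complete event from an iterable of text lines."""
    buffer = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # Comment or keep-alive line.
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # An event without its closing blank line is incomplete and is dropped.


def collect_text(lines):
    """Join text deltas from OpenAI-style chunks; also report whether [DONE] arrived."""
    parts, finished = [], False
    for data in iter_sse_data(lines):
        if data == "[DONE]":
            finished = True
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            parts.append((choice.get("delta") or {}).get("content") or "")
    return "".join(parts), finished
```

If `finished` comes back false, the connection ended before the stream completed, which is the signal for the reconnection logic in item 4 to re-send the request.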

What if image generation fails?

Common causes of image generation failure:

  1. Prompt triggers safety filter — modify your prompt to remove content that may be flagged
  2. Invalid parameters — check that size, quality, n, and other parameters are within the model's supported range
  3. Unsupported operation — not all image models support inpainting or variations. Verify the endpoint and model match your intended operation
  4. Insufficient balance — image generation is billed per request and individual costs can be higher. Make sure your balance is sufficient
