Rate Limits

Chuizi.AI uses 60-second sliding windows to protect gateway infrastructure and upstream model services.

Limit Layers

Layer            Description
Key-level RPM    Uses the API key's rpm_limit first, then account or system defaults
User-level RPM   Aggregates all keys for the same user to prevent bypassing limits with extra keys
Model-level RPM  Adds independent protection for a user's traffic to a single model
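
To make the layering concrete, here is a minimal client-side sketch of a 60-second sliding window checked across all three layers. It is an illustration only and not the gateway's actual implementation; the class name, bucket keys, and the rule that a request is recorded only when every layer has room are assumptions.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # window length stated above


class SlidingWindowLimiter:
    """Keeps request timestamps per bucket and counts those inside the window."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.hits = defaultdict(deque)

    def _count(self, bucket, now):
        q = self.hits[bucket]
        # Drop timestamps that have slid out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        return len(q)

    def allow(self, checks, now=None):
        """checks is a list of (bucket, limit) pairs, one per layer."""
        now = time.monotonic() if now is None else now
        if any(self._count(bucket, now) >= limit for bucket, limit in checks):
            return False  # at least one layer is full; nothing is recorded
        for bucket, _ in checks:
            self.hits[bucket].append(now)
        return True


def layers_for(key_id, user_id, model, key_rpm, user_rpm, model_rpm):
    """Build the key-level, user-level, and model-level checks (hypothetical helper)."""
    return [
        (("key", key_id), key_rpm),
        (("user", user_id), user_rpm),
        (("model", user_id, model), model_rpm),
    ]
```

Because the user-level bucket is shared by every key of the same user, switching to a second key does not reset the user-level count in this sketch.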

Response Headers

Responses may include:

Header                 Description
X-RateLimit-Limit      Current window limit
X-RateLimit-Remaining  Remaining requests in the current window
X-RateLimit-Reset      Window reset time when available
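
Since these headers are optional, a client should tolerate their absence. The sketch below reads them from any mapping of header names to values; the function name is hypothetical, and the reset value is passed through as a raw string because its format is not specified here.

```python
def parse_rate_limit_headers(headers):
    """Return limit/remaining as ints (or None) and reset as a raw string (or None)."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name):
        try:
            return int(lowered.get(name))
        except (TypeError, ValueError):
            return None  # header missing or not an integer

    return {
        "limit": as_int("x-ratelimit-limit"),
        "remaining": as_int("x-ratelimit-remaining"),
        "reset": lowered.get("x-ratelimit-reset"),
    }
```

A caller might slow down proactively when "remaining" gets close to zero instead of waiting for a 429.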

Retry-After is not part of the current stable public contract. Clients should implement exponential backoff with jitter even when that header is absent.
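
One common way to do this is "full jitter": wait a random time between zero and an exponentially growing ceiling. The following is a sketch under assumptions, not a prescribed client: the 1-second base, 60-second cap, attempt count, and the send callable (which returns a status code and body) are all illustrative choices.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a caller-supplied function returning (status, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))  # no sleep after the final attempt
    return status, body
```

The randomness spreads retries from many clients over time, so they do not all hit the gateway again at the same instant.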

429 Example

{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
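
A client can recognize this response by its 429 status and read the error fields for logging. The helper below is a hypothetical sketch that assumes the body has the shape shown above and falls back to a generic message when it does not.

```python
import json


def describe_rate_limit(status, body_text):
    """Return a log-friendly message for a 429 response, or None for other statuses."""
    if status != 429:
        return None
    try:
        payload = json.loads(body_text)
    except (TypeError, ValueError):
        return "rate limited (unparseable body)"
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return "rate limited (no error object)"
    # Field names follow the example body above.
    code = error.get("code", "unknown")
    message = error.get("message", "no message")
    return f"{code}: {message}"
```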

Recommendations

  • Implement exponential backoff with jitter.
  • Do not use multiple keys to bypass user-level limits.
  • Contact support for higher production limits.
  • Use GET /v1/key/info to inspect key configuration.
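
As a starting point for the last item, the sketch below builds (without sending) a GET /v1/key/info request using the standard library. The base URL is a placeholder and the Bearer authorization scheme is an assumption; neither is specified on this page, so substitute the values from your account documentation.

```python
import urllib.request

# Hypothetical placeholder: replace with the real gateway base URL.
BASE_URL = "https://gateway.example"


def build_key_info_request(api_key, base_url=BASE_URL):
    """Build a GET /v1/key/info request object; the caller decides when to send it."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/key/info",
        headers={"Authorization": "Bearer " + api_key},  # assumed auth scheme
        method="GET",
    )
```

Passing the result to urllib.request.urlopen would send it; the response format is not described here, so inspect the returned body before relying on specific fields.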