Rate Limits

Chuizi.AI uses 60-second sliding windows to protect gateway infrastructure and upstream model services.
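A client can mirror the 60-second window locally to avoid sending requests that are likely to be rejected. The sketch below is a client-side sliding-log throttle. It is not the gateway's implementation, which this page does not describe, and the class name `SlidingWindowThrottle` and its methods are hypothetical. The server's view of the window may differ from the client's, so treat it as a courtesy limit rather than a guarantee.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Client-side throttle: at most `limit` requests per `window` seconds.

    Hypothetical helper; the gateway's own algorithm may differ.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def _evict(self, now):
        # Drop timestamps that have slid out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()

    def try_acquire(self):
        """Record a request and return True if it fits in the window."""
        now = self.clock()
        self._evict(now)
        if len(self._sent) < self.limit:
            self._sent.append(now)
            return True
        return False

    def wait_time(self):
        """Seconds until the next request would be allowed (0.0 if now)."""
        now = self.clock()
        self._evict(now)
        if len(self._sent) < self.limit:
            return 0.0
        return self.window - (now - self._sent[0])
```

Call `try_acquire()` before each request and sleep for `wait_time()` seconds when it returns `False`.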

Limit Layers

Layer	Description
Key-level RPM	Uses the API key's `rpm_limit` when set; otherwise falls back to account or system defaults
User-level RPM	Aggregates all keys for the same user to prevent bypassing limits with extra keys
Model-level RPM	Adds independent protection for a user's traffic to a single model
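The key-level fallback in the table can be pictured as a "first configured value wins" lookup. The sketch below is illustrative only: the function name and parameters are hypothetical, and checking the account default before the system default is an assumption, since the table does not state their relative order.

```python
def effective_key_rpm(key_rpm_limit, account_default, system_default):
    """Return the first configured RPM limit.

    Assumed order: key `rpm_limit`, then account default, then system
    default. The account-before-system order is an assumption.
    """
    for candidate in (key_rpm_limit, account_default, system_default):
        if candidate is not None:
            return candidate
    raise ValueError("no RPM limit configured at any level")
```

For example, `effective_key_rpm(None, 60, 120)` returns the account default because the key has no `rpm_limit` of its own.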

Response Headers

Responses may include:

Header	Description
`X-RateLimit-Limit`	Current window limit
`X-RateLimit-Remaining`	Remaining requests in the current window
`X-RateLimit-Reset`	Window reset time when available
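Because these headers are optional, client code should tolerate any of them being missing. The following sketch reads them from a plain header mapping. The names `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical, and `X-RateLimit-Reset` is kept as raw text because its format is not specified on this page.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]
    remaining: Optional[int]
    reset: Optional[str]  # raw header text; format is not specified here


def _to_int(value):
    # Missing or malformed values become None instead of raising.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        limit=_to_int(lowered.get("x-ratelimit-limit")),
        remaining=_to_int(lowered.get("x-ratelimit-remaining")),
        reset=lowered.get("x-ratelimit-reset"),
    )
```

A caller might slow down when `remaining` is small, and carry on unchanged when it is `None`.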

`Retry-After` is not part of the current stable public contract, so clients should not depend on it. Implement exponential backoff with jitter whether or not the header is present.
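One common way to do this is the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling. The sketch below is a minimal version under assumed defaults (1-second base, 60-second cap, 5 attempts), none of which come from this page. The `send` and `is_rate_limited` callables are hypothetical placeholders for your HTTP client and your 429 check.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Delay before retry number `attempt` (0-based), using full jitter."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def call_with_retries(send, is_rate_limited, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it is not rate limited or attempts run out.

    Returns the last response either way, so the caller can inspect it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if not is_rate_limited(response):
            return response
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return response
```

The jitter spreads retries from many clients over time, so they do not all hit the gateway again at the same moment.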

429 Example

{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
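A client can recognize this response by its status code and, as a secondary signal, by the `type` field in the body. The sketch below assumes the body shape shown above. The helper names are hypothetical, and treating a non-429 response with `type` set to `rate_limit_error` as rate limited is a defensive assumption rather than documented behavior.

```python
import json


def extract_error(body_text):
    """Return the `error` object from a response body, or {} if absent."""
    try:
        payload = json.loads(body_text)
    except (TypeError, ValueError):
        return {}  # missing or non-JSON body
    error = payload.get("error") if isinstance(payload, dict) else None
    return error if isinstance(error, dict) else {}


def is_rate_limit_error(status_code, body_text):
    # The 429 status is the primary signal; the body is only a fallback.
    if status_code == 429:
        return True
    return extract_error(body_text).get("type") == "rate_limit_error"
```

The `message` and `code` fields returned by `extract_error` are useful for logging when a retry loop eventually gives up.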

Recommendations

Implement exponential backoff with jitter.
Do not use multiple keys to bypass user-level limits.
Contact support for higher production limits.
Use `GET /v1/key/info` to inspect a key's configuration.