Rate Limits
Chuizi.AI uses 60-second sliding windows to protect gateway infrastructure and upstream model services.
Limit Layers
| Layer | Description |
|---|---|
| Key-level RPM | Uses the API key's rpm_limit when set; otherwise falls back to account or system defaults |
| User-level RPM | Aggregates all keys for the same user to prevent bypassing limits with extra keys |
| Model-level RPM | Adds independent protection for a user's traffic to a single model |
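
To make the layering concrete, here is a minimal sketch of a 60-second sliding-window counter in which a request is accepted only if every layer still has room. It illustrates the idea and is not the gateway's actual implementation; the class name, bucket names, and limits are all hypothetical.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # window length stated in this document


class SlidingWindowLimiter:
    """Illustrative sliding-window request counter (not the gateway's code)."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        # bucket name -> timestamps of accepted requests, oldest first
        self.hits = defaultdict(deque)

    def _count(self, bucket, now):
        q = self.hits[bucket]
        # Drop timestamps that have slid out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        return len(q)

    def allow(self, limits, now=None):
        """limits maps bucket name -> RPM limit; every layer must have room."""
        now = time.monotonic() if now is None else now
        if any(self._count(b, now) >= rpm for b, rpm in limits.items()):
            return False  # at least one layer is full; record nothing
        for b in limits:
            self.hits[b].append(now)
        return True


limiter = SlidingWindowLimiter()
# Hypothetical buckets for one request: key, user, and user-plus-model layers.
layers = {"key:k1": 2, "user:u1": 3, "model:u1:m1": 5}
results = [limiter.allow(layers, now=t) for t in (0, 1, 2)]
later = limiter.allow(layers, now=61)
```

In this run the third request is rejected because the key-level bucket is full, even though the user-level and model-level buckets still have capacity. The request at second 61 is accepted again because both earlier timestamps have left the window.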
Response Headers
Responses may include:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Current window limit |
| X-RateLimit-Remaining | Remaining requests in the current window |
| X-RateLimit-Reset | Window reset time when available |
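
Because these headers are optional, client code should tolerate their absence. The sketch below reads them from a plain mapping of header names to values. The function name is hypothetical, and since the format of X-RateLimit-Reset is not specified here, that value is returned as a raw string rather than interpreted.

```python
def parse_rate_limit_headers(headers):
    """Return the rate limit headers that are present; missing ones become None."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}

    def as_int(name):
        try:
            return int(lowered.get(name))
        except (TypeError, ValueError):
            return None  # header absent or not an integer

    return {
        "limit": as_int("x-ratelimit-limit"),
        "remaining": as_int("x-ratelimit-remaining"),
        # Format is not documented here, so keep the raw value.
        "reset": lowered.get("x-ratelimit-reset"),
    }


info = parse_rate_limit_headers(
    {"X-RateLimit-Limit": "60", "x-ratelimit-remaining": "12"}
)
```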
Retry-After is not part of the current stable public contract. Clients should implement exponential backoff with jitter even when that header is absent.
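
One way to implement this is "full jitter" backoff, sketched below. The `RateLimited` exception, the `send` callable, and the base and cap values are assumptions for illustration; adapt them to whatever HTTP client you use.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical: raised by the caller's request function on HTTP 429."""


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            sleep(backoff_delay(attempt))
```

The 60-second cap is a choice made for this sketch to match the window length, not a documented value. Randomizing the whole delay, rather than adding a small offset, helps keep many clients from retrying at the same moment.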
429 Example
{ "error": { "message": "Rate limit exceeded", "type": "rate_limit_error", "code": "rate_limit_exceeded" } }
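
A client can recognize this response by checking the status code and then reading `error.code` from the body. The helper below is a small sketch with a hypothetical function name. It treats any 429 as rate limiting even if the body cannot be parsed.

```python
import json


def rate_limit_code(status, body):
    """Return the error code for a 429 response, or None for other statuses."""
    if status != 429:
        return None
    try:
        return json.loads(body)["error"]["code"]
    except (ValueError, KeyError, TypeError):
        # Body missing, malformed, or shaped differently than the example.
        return "unknown"


example = (
    '{ "error": { "message": "Rate limit exceeded", '
    '"type": "rate_limit_error", "code": "rate_limit_exceeded" } }'
)
code = rate_limit_code(429, example)
```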
Recommendations
- Implement exponential backoff with jitter.
- Do not use multiple keys to bypass user-level limits.
- Contact support for higher production limits.
- Use GET /v1/key/info to inspect key configuration.
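
As a rough sketch, the key-inspection request could be built with the Python standard library as shown below. The base URL is a placeholder, and Bearer authentication is an assumption rather than something stated on this page; check the authentication documentation for the actual scheme.

```python
import urllib.request


def build_key_info_request(base_url, api_key):
    """Build (but do not send) a GET request for /v1/key/info."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/key/info",
        # Assumption: the API accepts a Bearer token in the Authorization header.
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# Placeholder host and key, for illustration only.
req = build_key_info_request("https://api.example.com/", "sk-placeholder")
```

Passing `req` to `urllib.request.urlopen` would send it. The response schema is not described on this page, so inspect the returned JSON before relying on specific fields.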