Error Handling

Error Response Formats

OpenAI Protocol (`/v1/*`)

All errors follow the OpenAI error format:

json

{
  "error": {
    "message": "Insufficient balance. Current: $0.12, Required: $0.50",
    "type": "insufficient_quota",
    "code": "402"
  }
}

Anthropic Protocol (`/anthropic/*`)

Errors from the Anthropic protocol use the Anthropic error format:

json

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key"
  }
}
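
Because the two protocols shape their error bodies differently, client code that talks to both may want a single normalized view. The following is a minimal sketch based only on the two example bodies above; the function name `parse_gateway_error` and the returned keys are hypothetical, not part of any SDK.

```python
import json


def parse_gateway_error(body: str) -> dict:
    """Normalize an OpenAI-style or Anthropic-style error body into one dict."""
    payload = json.loads(body)
    error = payload.get("error", {})
    return {
        # The Anthropic-style body has a top-level "type": "error" marker.
        "protocol": "anthropic" if payload.get("type") == "error" else "openai",
        "type": error.get("type"),
        "message": error.get("message"),
        # Only the OpenAI-style example carries "code"; otherwise this is None.
        "code": error.get("code"),
    }
```

Callers can then branch on `type` without caring which endpoint produced the error.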

Error Code Reference

HTTP Status	Error Type	Description	Retryable
400	`invalid_request_error`	Malformed request, missing required fields, invalid parameter values	No
401	`authentication_error`	Invalid or missing API key	No
402	`insufficient_quota`	Account balance too low for the estimated request cost	No (top up first)
403	`permission_error`	API key does not have permission for this model or action	No
404	`not_found`	Model does not exist or is not available	No
429	`rate_limit_error`	Too many requests. Check `Retry-After` header	Yes (after delay)
500	`server_error`	Internal gateway error	Yes
502	`upstream_error`	Upstream provider returned an invalid response	Yes
503	`service_unavailable`	Gateway is overloaded or upstream provider is down	Yes
504	`timeout`	Request timed out waiting for upstream provider	Yes
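
The table above can be encoded as a lookup so that retry decisions live in one place. This is a sketch: `ERROR_TABLE` and `is_retryable` are hypothetical names, and treating statuses that are not in the table as non-retryable is an assumption.

```python
# HTTP status -> (error type, retryable), copied from the table above.
ERROR_TABLE = {
    400: ("invalid_request_error", False),
    401: ("authentication_error", False),
    402: ("insufficient_quota", False),
    403: ("permission_error", False),
    404: ("not_found", False),
    429: ("rate_limit_error", True),
    500: ("server_error", True),
    502: ("upstream_error", True),
    503: ("service_unavailable", True),
    504: ("timeout", True),
}


def is_retryable(status: int) -> bool:
    """Return True if the table marks this status as retryable.

    Statuses missing from the table are treated as non-retryable (assumption).
    """
    _, retryable = ERROR_TABLE.get(status, (None, False))
    return retryable
```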

Common Errors and Solutions

401: Authentication Error

json

{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": "401"
  }
}

Causes:

API key is missing from the request.
API key does not start with `ck-`.
API key has been revoked or deactivated.

Fix: Check that your `Authorization: Bearer ck-...` header is present and that the key is active in the dashboard.
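
Two of the causes listed above (a missing key, a key without the `ck-` prefix) can be caught locally before any request is sent. A minimal sketch follows; `build_auth_headers` is a hypothetical helper. It cannot detect a revoked or deactivated key, which only the gateway knows about.

```python
def build_auth_headers(api_key: str) -> dict:
    """Validate the key's shape and return the Authorization header."""
    key = (api_key or "").strip()
    if not key:
        raise ValueError("API key is missing")
    if not key.startswith("ck-"):
        raise ValueError("API key must start with 'ck-'")
    return {"Authorization": f"Bearer {key}"}
```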

402: Insufficient Balance

json

{
  "error": {
    "message": "Insufficient balance. Current: $0.12",
    "type": "insufficient_quota",
    "code": "402"
  }
}

Fix: Top up your account in the dashboard under Billing. The gateway pre-deducts estimated cost before sending the request upstream. Ensure your balance covers the estimated cost.

429: Rate Limited

json

{
  "error": {
    "message": "Rate limit exceeded. Retry after 2 seconds.",
    "type": "rate_limit_error",
    "code": "429"
  }
}

Fix: Wait for the duration specified in the `Retry-After` response header, then retry. The default rate limit is 60 requests per minute (RPM) per API key.
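
The example message suggests the delay is given in seconds, but HTTP also allows `Retry-After` to carry a date. The sketch below accepts both forms; the helper name `retry_after_seconds`, the 1-second fallback, and the 60-second cap are assumptions rather than documented gateway behavior.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None, default=1.0, cap=60.0):
    """Convert a Retry-After header value into a wait time in seconds."""
    if not value:
        return default
    try:
        # Most common form: a plain number of seconds.
        return max(0.0, min(float(value), cap))
    except ValueError:
        pass
    try:
        # Alternative form: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, min((when - now).total_seconds(), cap))
```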

504: Timeout

json

{
  "error": {
    "message": "Request timed out after 120 seconds",
    "type": "timeout",
    "code": "504"
  }
}

Fix: Large models (Opus 4, GPT-5, o3) with long outputs may take over 60 seconds. Set your client timeout to at least 120 seconds. Consider using streaming to get partial results faster.
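
The two suggestions above can be combined in one call: a longer per-request timeout plus streaming. This is a sketch, not a definitive implementation. `stream_completion` is a hypothetical helper that takes an already configured OpenAI Python SDK client, such as the one created in the retry example in the next section.

```python
def stream_completion(client, messages, model="openai/gpt-4.1", timeout=120.0):
    """Stream a chat completion and return the concatenated text."""
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,
        timeout=timeout,  # Per-request timeout in seconds
    )
    parts = []
    for chunk in stream:
        # Some chunks have no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)
```

In a real application you would usually forward each piece to the user as it arrives instead of collecting everything first.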

Retry Strategy: Exponential Backoff

For retryable errors (429, 500, 502, 503, 504), use exponential backoff with jitter:

example.py

python

import random
import time

from openai import OpenAI, APIConnectionError, APIStatusError

client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
    max_retries=0,  # Disable the SDK's built-in retries; we handle them ourselves
)

RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def chat_with_retry(messages, model="openai/gpt-4.1", max_retries=5):
    for attempt in range(max_retries):
        # Default wait: exponential backoff with jitter
        wait = (2 ** attempt) + random.random()
        try:
            return client.chat.completions.create(
                model=model,
                messages=messages,
                max_tokens=1024,
            )
        except APIStatusError as e:
            # The gateway returned an HTTP error status
            if e.status_code not in RETRYABLE_STATUS or attempt == max_retries - 1:
                raise  # Non-retryable error, or no attempts left
            # Use Retry-After if it is present and holds a number of seconds
            retry_after = e.response.headers.get("retry-after")
            try:
                wait = float(retry_after) if retry_after else wait
            except ValueError:
                pass  # Not a number (e.g. an HTTP date): keep the backoff value
            reason = f"Error {e.status_code}"
        except APIConnectionError:
            # Network failure or client-side timeout (APITimeoutError is a subclass)
            if attempt == max_retries - 1:
                raise
            reason = "Connection error or timeout"
        print(f"{reason}. Retrying in {wait:.1f}s...")
        time.sleep(wait)
    raise RuntimeError("max_retries must be at least 1")


response = chat_with_retry([{"role": "user", "content": "Hello"}])
print(response.choices[0].message.content)
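
The wait in the loop above doubles on every attempt with no upper bound, so a higher `max_retries` can lead to very long sleeps. One common variation is to cap the delay. The helper below is a sketch; `backoff_delay` and its defaults (1-second base, 30-second cap) are assumptions, not gateway recommendations.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Exponential backoff plus up to 1s of jitter, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt) + rng())
```

Passing `rng` explicitly makes the schedule deterministic in tests. With `rng=lambda: 0.5`, attempts 0 through 3 wait 1.5, 2.5, 4.5, and 8.5 seconds.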

index.mjs

javascript

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.chuizi.ai/v1',
  apiKey: 'ck-your-key-here',
  maxRetries: 0, // We handle retries ourselves
});

async function chatWithRetry(messages, model = 'openai/gpt-4.1', maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.chat.completions.create({
        model,
        messages,
        max_tokens: 1024,
      });
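    } catch (error) {
      const status = error.status;
      // Connection errors and client-side timeouts carry no HTTP status
      const retryable =
        [429, 500, 502, 503, 504].includes(status) ||
        error instanceof OpenAI.APIConnectionError;

      if (!retryable || attempt === maxRetries - 1) {
        throw error;
      }

      // Use Retry-After if it is a number of seconds, else exponential backoff.
      // Depending on the SDK version, error.headers is a plain object or a Headers instance.
      const headers = error.headers;
      const retryAfter =
        typeof headers?.get === 'function' ? headers.get('retry-after') : headers?.['retry-after'];
      const retryAfterSeconds = Number.parseFloat(retryAfter);
      const wait = Number.isFinite(retryAfterSeconds)
        ? retryAfterSeconds * 1000
        : Math.pow(2, attempt) * 1000 + Math.random() * 1000;

      console.log(`Error ${status ?? 'connection'}. Retrying in ${(wait / 1000).toFixed(1)}s...`);
      await new Promise((resolve) => setTimeout(resolve, wait));
    }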
  }
}

const response = await chatWithRetry([{ role: 'user', content: 'Hello' }]);
console.log(response.choices[0].message.content);

Idempotency and Safe Retries

All Chuizi.AI POST endpoints are safe to retry because:

Pre-deduction is idempotent. If a request fails after the balance freeze, the frozen amount is automatically released.
Generation IDs are unique. Each request creates a new `gen-xxx` ID. Retrying creates a new generation, not a duplicate charge on the same one.
No side effects upstream. Chat completion requests are stateless -- the same request sent twice produces two independent completions.

Retries are not free, however: every attempt that reaches the upstream provider consumes tokens and is charged. The retry strategies above retry only on errors that typically occur before the upstream provider is contacted (rate limits) or after it has failed (timeouts, server errors).

Tips

Log the `x-chuizi-generation-id` header. Every response includes a generation ID. When reporting issues, include this ID for faster debugging.
Use the OpenAI SDK's built-in retry. Both the Python and Node.js OpenAI SDKs have built-in retry logic. Set `max_retries` (Python) or `maxRetries` (Node.js) when initializing the client.
Set client-side timeouts. The default HTTP timeout in many libraries is 30 seconds, which is too short for large model responses. Set it to 120-300 seconds.
Monitor 429 rates. If you hit rate limits frequently, contact support to increase your per-key RPM limit, or distribute requests across multiple API keys.
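
For the first tip, the OpenAI Python SDK exposes response headers through `with_raw_response`. The following sketch logs the generation ID on every call. The helper name `create_and_log` and the logger name are hypothetical, and the client is assumed to be an OpenAI SDK client pointed at the gateway.

```python
import logging

logger = logging.getLogger("chuizi")


def create_and_log(client, **params):
    """Create a chat completion and log the gateway's generation ID."""
    raw = client.chat.completions.with_raw_response.create(**params)
    generation_id = raw.headers.get("x-chuizi-generation-id")
    logger.info("generation_id=%s", generation_id)
    # parse() returns the usual ChatCompletion object.
    return raw.parse(), generation_id
```

Returning the ID alongside the completion lets callers attach it to their own error reports.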

Next Steps

Error Codes Reference — complete error code listing with descriptions
Rate Limits — understand and configure rate limiting
Production Best Practices — comprehensive production readiness guide
