Responses-Compatible Models

Chuizi.AI's public OpenAI-compatible entrypoint is currently /v1/chat/completions. Even when the gateway serves an upstream model through a Responses-style provider API internally, clients should still send a standard Chat Completions request with model and messages.

Do not send input or instructions directly to /v1/chat/completions; requests without messages are rejected by the gateway.
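If existing client code already builds payloads with instructions and input, one way to adapt it is to convert those fields into a messages array before sending. The helper below is a minimal sketch, not part of any Chuizi.AI SDK: the function name to_chat_payload is hypothetical, and it assumes that instructions maps to a system message and that a list-valued input already contains role/content items.

```python
from typing import Any


def to_chat_payload(responses_style: dict[str, Any]) -> dict[str, Any]:
    """Convert a payload using `instructions`/`input` into a `messages` payload.

    Hypothetical helper: the mapping below is an assumption, not gateway behavior.
    """
    messages: list[dict[str, Any]] = []

    # Assumption: `instructions` plays the role of a system message.
    instructions = responses_style.get("instructions")
    if instructions:
        messages.append({"role": "system", "content": instructions})

    user_input = responses_style.get("input")
    if isinstance(user_input, str):
        messages.append({"role": "user", "content": user_input})
    elif isinstance(user_input, list):
        # Assumption: list items are already {"role": ..., "content": ...} dicts.
        messages.extend(user_input)

    if not messages:
        raise ValueError("no instructions or input to convert into messages")

    # Copy every other field (model, temperature, ...) unchanged.
    payload = {
        key: value
        for key, value in responses_style.items()
        if key not in ("instructions", "input")
    }
    payload["messages"] = messages
    return payload
```

Fields other than instructions and input are passed through as-is, so any parameter whose name differs between the two request styles still has to be renamed by the caller.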

Request

POST https://api.chuizi.ai/v1/chat/completions
Authorization: Bearer ck-your-key
Content-Type: application/json

| Parameter | Required | Description |
| --- | --- | --- |
| model | Yes | Model ID shown in the model catalog, for example openai/gpt-4.1 |
| messages | Yes | OpenAI Chat Completions message array |
| stream | No | Set true for SSE streaming |
| max_tokens / max_completion_tokens | No | Maximum output tokens |
| temperature | No | Sampling temperature, 0-2 |
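
The parameter rules above can be checked on the client before a request is sent. The following is a rough sketch under stated assumptions: build_payload is a hypothetical helper, and rejecting an empty messages array or a non-positive max_tokens is a client-side choice made here, not documented gateway behavior.

```python
from typing import Any, Optional


def build_payload(
    model: str,
    messages: list[dict[str, Any]],
    *,
    stream: bool = False,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
) -> dict[str, Any]:
    """Build a Chat Completions request body from the documented parameters."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        # Assumption: treat an empty array the same as a missing one.
        raise ValueError("messages is required")

    payload: dict[str, Any] = {"model": model, "messages": messages}

    if stream:
        payload["stream"] = True
    if max_tokens is not None:
        if max_tokens <= 0:
            # Assumption: a non-positive output limit is never useful.
            raise ValueError("max_tokens must be positive")
        payload["max_tokens"] = max_tokens
    if temperature is not None:
        # Documented range for the sampling temperature is 0-2.
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        payload["temperature"] = temperature

    return payload
```

Optional parameters are omitted from the body unless set, so the gateway's own defaults apply.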

Example

curl -X POST https://api.chuizi.ai/v1/chat/completions \
  -H "Authorization: Bearer ck-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a concise assistant."},
      {"role": "user", "content": "Write a Python Fibonacci function."}
    ],
    "max_tokens": 1024
  }'
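
The same request can be issued from Python using only the standard library. This is a sketch that mirrors the curl call above; the function names build_request and chat are illustrative, and error handling (HTTP error statuses, retries) is left out.

```python
import json
import urllib.request
from typing import Any

API_URL = "https://api.chuizi.ai/v1/chat/completions"


def build_request(api_key: str, payload: dict[str, Any]) -> urllib.request.Request:
    """Prepare the POST request with the same headers as the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(api_key: str, payload: dict[str, Any], timeout: float = 60.0) -> dict[str, Any]:
    """Send the request and decode the JSON response body."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling chat("ck-your-key", {...}) with the JSON body from the curl example should return the parsed response as a dictionary; a non-2xx status raises urllib.error.HTTPError.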

Response

The response keeps the OpenAI Chat Completions shape. Use x_chuizi.generation_id to query billing details later.

{
  "id": "gen-xxxxxxxxxxxxxxxxxxxxxxxx",
  "object": "chat.completion",
  "model": "openai/gpt-4.1",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 95,
    "total_tokens": 123
  },
  "x_chuizi": {
    "generation_id": "gen-xxxxxxxxxxxxxxxxxxxxxxxx",
    "latency_ms": 3500,
    "cost": "0.00180000"
  }
}
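
To keep the billing lookup key alongside the model output, a client can pull the relevant fields out of the parsed response. The sketch below assumes the response shape shown above; summarize is a hypothetical helper, and the cost string is parsed with Decimal to avoid floating-point rounding (its currency or unit is not stated here).

```python
from decimal import Decimal
from typing import Any


def summarize(response: dict[str, Any]) -> dict[str, Any]:
    """Extract the assistant text, token usage, and x_chuizi metadata."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    meta = response.get("x_chuizi", {})
    cost = meta.get("cost")

    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
        # Keep this ID to query billing details later.
        "generation_id": meta.get("generation_id"),
        "latency_ms": meta.get("latency_ms"),
        # cost arrives as a string; Decimal keeps it exact.
        "cost": Decimal(cost) if cost is not None else None,
    }
```

The usage and x_chuizi lookups use .get so that a response missing either block yields None values instead of raising.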