Responses-Compatible Models

Chuizi.AI's public OpenAI-compatible entrypoint is currently `/v1/chat/completions`. Even when the gateway serves an upstream model through a Responses-style provider API internally, clients should still send a standard Chat Completions request with `model` and `messages`.

Do not send `input` or `instructions` directly to `/v1/chat/completions`; the gateway rejects requests without `messages`.
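
A client that already holds Responses-style fields can adapt them before sending. The sketch below maps `instructions` to a system message and `input` to a user message. It assumes `input` is a plain string, and the helper name `to_chat_request` is hypothetical rather than part of any Chuizi.AI SDK.

```python
def to_chat_request(model, input_text, instructions=None):
    """Map Responses-style fields onto a Chat Completions request body.

    Hypothetical helper: assumes `input_text` is a plain string.
    """
    messages = []
    if instructions:
        # Responses-style `instructions` becomes a leading system message.
        messages.append({"role": "system", "content": instructions})
    # Responses-style `input` becomes the user message.
    messages.append({"role": "user", "content": input_text})
    return {"model": model, "messages": messages}


body = to_chat_request(
    "openai/gpt-4.1",
    "Write a Python Fibonacci function.",
    instructions="You are a concise assistant.",
)
```

The resulting `body` carries only `model` and `messages`, so neither `input` nor `instructions` reaches the gateway.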

Request

```http
POST https://api.chuizi.ai/v1/chat/completions
Authorization: Bearer ck-your-key
Content-Type: application/json
```

| Parameter | Required | Description |
| --- | --- | --- |
| `model` | Yes | Model ID shown in the model catalog, for example `openai/gpt-4.1` |
| `messages` | Yes | OpenAI Chat Completions message array |
| `stream` | No | Set `true` for SSE streaming |
| `max_tokens` / `max_completion_tokens` | No | Maximum output tokens |
| `temperature` | No | Sampling temperature, 0-2 |
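
The parameter rules above can be checked on the client before a request leaves the process. This is a minimal sketch, not the gateway's own validation logic: `build_payload` is a hypothetical helper, and it only covers `max_tokens`, not `max_completion_tokens`.

```python
def build_payload(model, messages, stream=False, max_tokens=None, temperature=None):
    """Assemble a request body, enforcing the documented constraints locally."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required and must be non-empty")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload = {"model": model, "messages": messages}
    # Optional fields are only included when the caller sets them.
    if stream:
        payload["stream"] = True
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if temperature is not None:
        payload["temperature"] = temperature
    return payload
```

Calling `build_payload("openai/gpt-4.1", [])` raises `ValueError` locally instead of waiting for the gateway to reject the request.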

Example

```bash
curl -X POST https://api.chuizi.ai/v1/chat/completions \
  -H "Authorization: Bearer ck-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a concise assistant."},
      {"role": "user", "content": "Write a Python Fibonacci function."}
    ],
    "max_tokens": 1024
  }'
```
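
The same request can be issued from Python using only the standard library. This is a sketch that mirrors the curl call; the function names `make_request` and `send` are illustrative, and the network call is left commented out so the snippet can run without a valid key.

```python
import json
import urllib.request

API_URL = "https://api.chuizi.ai/v1/chat/completions"


def make_request(api_key, payload):
    """Build the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


payload = {
    "model": "openai/gpt-4.1",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Write a Python Fibonacci function."},
    ],
    "max_tokens": 1024,
}
request = make_request("ck-your-key", payload)
# response = send(request)  # uncomment to perform the network call
```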

Response

The response keeps the OpenAI Chat Completions shape and adds an `x_chuizi` object. Use `x_chuizi.generation_id` to query billing details later.

```json
{
  "id": "gen-xxxxxxxxxxxxxxxxxxxxxxxx",
  "object": "chat.completion",
  "model": "openai/gpt-4.1",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 95,
    "total_tokens": 123
  },
  "x_chuizi": {
    "generation_id": "gen-xxxxxxxxxxxxxxxxxxxxxxxx",
    "latency_ms": 3500,
    "cost": "0.00180000"
  }
}
```
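
A client will typically pull the assistant text and the billing identifier out of this structure. The sketch below is one way to do that, assuming the field layout of the sample response; `summarize` is a hypothetical helper. Because `cost` arrives as a string, it is parsed with `Decimal` rather than `float` to avoid rounding drift.

```python
from decimal import Decimal


def summarize(response):
    """Extract the fields most clients need from a chat completion."""
    meta = response.get("x_chuizi", {})
    return {
        "text": response["choices"][0]["message"]["content"],
        "total_tokens": response["usage"]["total_tokens"],
        # Keep generation_id so billing details can be looked up later.
        "generation_id": meta.get("generation_id"),
        "cost": Decimal(meta["cost"]) if "cost" in meta else None,
    }


sample = {
    "id": "gen-xxxxxxxxxxxxxxxxxxxxxxxx",
    "object": "chat.completion",
    "model": "openai/gpt-4.1",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 28, "completion_tokens": 95, "total_tokens": 123},
    "x_chuizi": {
        "generation_id": "gen-xxxxxxxxxxxxxxxxxxxxxxxx",
        "latency_ms": 3500,
        "cost": "0.00180000",
    },
}
summary = summarize(sample)
```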