Streaming

Enabling Streaming

Set "stream": true in your request body:

config.json
json
{
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [{"role": "user", "content": "Write a haiku about APIs."}],
  "stream": true,
  "max_tokens": 256
}

The response arrives as a series of Server-Sent Events (SSE) over a single HTTP connection.

SSE Event Format

Each event is a single line prefixed with "data: ", followed by a JSON object. Events are separated by a blank line (two newlines), and the stream ends with "data: [DONE]".

data: {"id":"gen-xxx","object":"chat.completion.chunk",...}

data: {"id":"gen-xxx","object":"chat.completion.chunk",...}

data: [DONE]
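The parsing loop this implies can be sketched in a few lines of Python. This is a minimal illustration, not a full SSE implementation (it ignores "event:", "id:", and comment lines, which the OpenAI-style protocol does not use):

```python
import json

def parse_sse(raw: str):
    """Collect the JSON payload of each "data:" event, stopping at [DONE]."""
    chunks = []
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue  # skip the blank separator lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # sentinel marking the end of the stream
        chunks.append(json.loads(payload))
    return chunks

# Illustrative sample stream matching the shape above
raw = (
    'data: {"id":"gen-xxx","object":"chat.completion.chunk"}\n'
    '\n'
    'data: [DONE]\n'
)
events = parse_sse(raw)  # one parsed chunk; [DONE] is not included
```

In a real client the same loop runs over lines as they arrive on the open connection rather than over a complete string.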

Protocol Differences

OpenAI Protocol (/v1/chat/completions)

This is the default protocol. Each chunk contains a delta object with partial content:

data: {"id":"gen-abc123","object":"chat.completion.chunk","created":1712000000,"model":"openai/gpt-4.1","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"gen-abc123","object":"chat.completion.chunk","created":1712000000,"model":"openai/gpt-4.1","choices":[{"index":0,"delta":{"content":"In"},"finish_reason":null}]}

data: {"id":"gen-abc123","object":"chat.completion.chunk","created":1712000000,"model":"openai/gpt-4.1","choices":[{"index":0,"delta":{"content":" the"},"finish_reason":null}]}

data: {"id":"gen-abc123","object":"chat.completion.chunk","created":1712000000,"model":"openai/gpt-4.1","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":14,"completion_tokens":17,"total_tokens":31}}

data: [DONE]
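Reassembling the message from these chunks means concatenating delta.content and picking up finish_reason and usage from the final chunks. A sketch, using parsed versions of the chunks shown above:

```python
# Parsed chunks mirroring the OpenAI-protocol stream above: a role
# chunk, two content deltas, then a final chunk with usage.
chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": "In"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": " the"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
     "usage": {"prompt_tokens": 14, "completion_tokens": 17, "total_tokens": 31}},
]

text, finish_reason, usage = [], None, None
for chunk in chunks:
    choice = chunk["choices"][0]
    # Not every delta carries content (the first has only the role)
    text.append(choice["delta"].get("content", ""))
    finish_reason = choice.get("finish_reason") or finish_reason
    usage = chunk.get("usage") or usage

message = "".join(text)  # "In the"
```

Note that usage arrives only on the final chunk, so a client must keep reading until finish_reason is set rather than stopping at the first complete sentence.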

Anthropic Protocol (/anthropic/v1/messages)

The Anthropic protocol uses typed events: each event: line names the event type, and the following data: line carries its JSON payload:

event: message_start
data: {"type":"message_start","message":{"id":"msg_abc123","type":"message","role":"assistant","model":"claude-sonnet-4-6","usage":{"input_tokens":25,"output_tokens":1}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"In"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" the"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":17}}

event: message_stop
data: {"type":"message_stop"}
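A consumer of this protocol dispatches on the "type" field: text accumulates from content_block_delta events, while the stop reason and final token count arrive in message_delta. A sketch over parsed versions of the events above:

```python
# Parsed events mirroring the Anthropic-protocol stream above.
events = [
    {"type": "message_start", "message": {"role": "assistant"}},
    {"type": "content_block_start", "index": 0,
     "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": "In"}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": " the"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "message_delta", "delta": {"stop_reason": "end_turn"},
     "usage": {"output_tokens": 17}},
    {"type": "message_stop"},
]

text, stop_reason, output_tokens = [], None, None
for ev in events:
    if ev["type"] == "content_block_delta" and ev["delta"]["type"] == "text_delta":
        text.append(ev["delta"]["text"])
    elif ev["type"] == "message_delta":
        stop_reason = ev["delta"].get("stop_reason")
        output_tokens = ev.get("usage", {}).get("output_tokens")

message = "".join(text)  # "In the"
```

Unknown event types should be ignored rather than treated as errors, so that new event types can be introduced without breaking existing clients.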

Gemini Protocol (/gemini/v1beta/models/*/streamGenerateContent)

Gemini streaming uses its native format. Each chunk contains partial candidates:

response
json
{"candidates":[{"content":{"parts":[{"text":"In"}],"role":"model"}}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":1}}
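Extracting text from a Gemini chunk means walking the candidates[].content.parts[] nesting. A minimal sketch against the chunk shape shown above:

```python
# A parsed Gemini streaming chunk, as in the example above.
chunk = {
    "candidates": [{"content": {"parts": [{"text": "In"}], "role": "model"}}],
    "usageMetadata": {"promptTokenCount": 10, "candidatesTokenCount": 1},
}

def chunk_text(chunk):
    """Join the text of every part in the first candidate."""
    parts = chunk["candidates"][0]["content"]["parts"]
    # Non-text parts (e.g. function calls) have no "text" key
    return "".join(p.get("text", "") for p in parts)

piece = chunk_text(chunk)  # "In"
```

As with the other protocols, the full message is the concatenation of the text pieces across all chunks in the stream.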

Code Examples

terminal
bash
curl -X POST https://api.chuizi.ai/v1/chat/completions \
  -H "Authorization: Bearer ck-your-key-here" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Write a haiku about APIs."}],
    "stream": true,
    "stream_options": {"include_usage": true}
  }'

Next Steps