Anthropic Messages API
Chuizi.AI proxies the Anthropic Messages API as a native passthrough. Your request body goes directly to Anthropic with zero format conversion, and automatic failover ensures high availability. The response comes back in Anthropic's own format, including streaming SSE event types.
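This matters if you use Claude Code, Cursor, Cline, or OpenCode. These tools expect the Anthropic Messages API, not OpenAI's format. With Chuizi.AI, you point the tool at the Chuizi.AI base URL and supply your ck- key (two environment variables for Claude Code, two settings fields for Cursor and Cline), and everything works.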
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /anthropic/v1/messages | Create a message (streaming and non-streaming) |
| GET | /anthropic/v1/models | List available Anthropic models |
Authentication
Chuizi.AI accepts your ck- API key through two headers. Use whichever your tool expects:
| Header | Format | Convention |
|---|---|---|
| x-api-key | ck-your-key-here | Anthropic SDK default |
| Authorization | Bearer ck-your-key-here | OpenAI convention |
Both resolve to the same user account, balance, and rate limits.
Required Headers
| Header | Value | Notes |
|---|---|---|
| anthropic-version | 2023-06-01 | Required. Passed through to upstream. |
| content-type | application/json | Required for POST requests. |
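As a rough illustration of how the authentication and required headers fit together, the sketch below assembles (but does not send) a request using only the Python standard library. The helper names `build_headers` and `build_request` are hypothetical and exist only for this example; the URL and header values come from the tables above.

```python
import json
import urllib.request

# Base URL as used in the examples on this page.
BASE_URL = "https://api.chuizi.ai/anthropic"


def build_headers(api_key: str, style: str = "x-api-key") -> dict:
    """Return headers for a POST to /v1/messages (hypothetical helper).

    style="x-api-key" follows the Anthropic SDK convention;
    style="bearer" follows the OpenAI-style Authorization header.
    """
    headers = {
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    if style == "x-api-key":
        headers["x-api-key"] = api_key
    elif style == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return headers


def build_request(api_key: str, body: dict, style: str = "x-api-key") -> urllib.request.Request:
    """Assemble a POST request for the messages endpoint without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers=build_headers(api_key, style),
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access or a real key.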
How It Differs from OpenAI /v1/chat/completions
If you are used to the OpenAI Chat Completions format, these are the key structural differences in Anthropic's Messages API:
| Concept | OpenAI Chat Completions | Anthropic Messages |
|---|---|---|
| System prompt | messages array entry with role: "system" | Top-level system field (string or array of blocks) |
| Content format | String or array of content parts | String or array of content blocks in requests; responses always return an array of content blocks ([{type: "text", text: "..."}]) |
| Stop indicator | finish_reason: "stop" | stop_reason: "end_turn" |
| Max tokens | Optional (max_tokens or max_completion_tokens) | Required (max_tokens) |
| Token usage | usage.prompt_tokens, usage.completion_tokens | usage.input_tokens, usage.output_tokens |
| Streaming events | data: {"choices": [...]} chunks | Typed events: message_start, content_block_delta, message_delta |
| Model prefix | anthropic/claude-sonnet-4-6 | claude-sonnet-4-6 (bare name) or anthropic/claude-sonnet-4-6 |
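To make the structural differences concrete, here is a minimal sketch that rewrites a simple OpenAI-style request body into the Anthropic Messages shape. It is an illustration only, not something Chuizi.AI does on this endpoint (the passthrough performs no conversion). The function name `openai_to_anthropic` and the `default_max_tokens` fallback are assumptions for this example, and the sketch only handles string content with system, user, and assistant roles.

```python
def openai_to_anthropic(request: dict, default_max_tokens: int = 1024) -> dict:
    """Convert a simple Chat Completions body to a Messages body (illustrative)."""
    system_parts = []
    messages = []
    for msg in request.get("messages", []):
        if msg["role"] == "system":
            # Anthropic takes the system prompt as a top-level field.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    body = {
        "model": request["model"],
        # max_tokens is optional in OpenAI's format but required by Anthropic,
        # so fall back to a default chosen for this sketch.
        "max_tokens": (
            request.get("max_completion_tokens")
            or request.get("max_tokens")
            or default_max_tokens
        ),
        "messages": messages,
    }
    if system_parts:
        body["system"] = "\n\n".join(system_parts)
    return body


openai_body = {
    "model": "claude-sonnet-4-6",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain how TCP handshakes work."},
    ],
}
anthropic_body = openai_to_anthropic(openai_body)
```

Tool calls, multimodal content parts, and streaming options are deliberately out of scope; those are the areas where hand-written converters tend to diverge.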
Request Format
{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "system": "You are a helpful assistant.",
  "messages": [
    {
      "role": "user",
      "content": "Explain how TCP handshakes work."
    }
  ]
}
Supported Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name. Bare names (claude-sonnet-4-6) and prefixed names (anthropic/claude-sonnet-4-6) both work. |
| max_tokens | integer | Yes | Maximum tokens to generate. |
| messages | array | Yes | Conversation messages. |
| system | string or array | No | System prompt. Can be a string or an array of content blocks (useful for caching). |
| stream | boolean | No | Enable SSE streaming. Default: false. |
| temperature | number | No | Sampling temperature, 0-1. |
| top_p | number | No | Nucleus sampling threshold. |
| top_k | integer | No | Top-K sampling. |
| stop_sequences | array | No | Custom stop sequences. |
| tools | array | No | Tool definitions for function calling. |
| tool_choice | object | No | Tool selection strategy. |
| metadata | object | No | Request metadata (e.g., user_id for abuse tracking). |
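A client-side pre-check can catch the most common mistakes from the table above before a request is sent. The sketch below is an assumption-laden illustration: `validate_request` is a hypothetical helper, its messages are worded for this example, and the server remains the authority on what is valid.

```python
# Required fields and their expected Python types, taken from the table above.
REQUIRED_FIELDS = {"model": str, "max_tokens": int, "messages": list}


def validate_request(body: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in body:
            problems.append(f"{name} is required")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(body[name], bool) or not isinstance(body[name], expected_type):
            problems.append(f"{name} must be of type {expected_type.__name__}")

    temperature = body.get("temperature")
    if temperature is not None:
        is_number = isinstance(temperature, (int, float)) and not isinstance(temperature, bool)
        if not is_number or not 0 <= temperature <= 1:
            problems.append("temperature must be between 0 and 1")
    return problems
```

For example, `validate_request({"model": "claude-sonnet-4-6", "messages": []})` reports that `max_tokens` is required, which mirrors the error the API itself returns for that omission.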
Response Format
Non-streaming
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "TCP uses a three-way handshake to establish a connection..."
    }
  ],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150
  }
}
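Because `content` in the response is an array of blocks rather than a single string, reading the reply usually means joining the text blocks. A minimal sketch, using the sample response above as input (the helper names `extract_text` and `total_tokens` are hypothetical):

```python
import json

# The sample non-streaming response shown above.
RESPONSE_JSON = """
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "TCP uses a three-way handshake to establish a connection..."}
  ],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 25, "output_tokens": 150}
}
"""


def extract_text(message: dict) -> str:
    """Concatenate the text of all text blocks, skipping other block types."""
    return "".join(
        block["text"] for block in message["content"] if block["type"] == "text"
    )


def total_tokens(message: dict) -> int:
    """Sum input and output tokens from the usage object."""
    usage = message["usage"]
    return usage["input_tokens"] + usage["output_tokens"]


message = json.loads(RESPONSE_JSON)
print(extract_text(message))
print(total_tokens(message))  # 175
```

Filtering on `type` matters once tools are involved, since the array can then contain blocks that have no `text` field.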
Streaming
When stream: true, the response uses Anthropic's SSE event types:
event: message_start
data: {"type":"message_start","message":{"id":"msg_01...","type":"message","role":"assistant","model":"claude-sonnet-4-6","usage":{"input_tokens":25,"output_tokens":0}}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"TCP uses"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" a three-way"}}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":150}}
event: message_stop
data: {"type":"message_stop"}
Prompt Caching
Anthropic supports explicit prompt caching via cache_control blocks. You can mark system prompts, messages, and tool definitions as cacheable. Cached content costs significantly less on subsequent requests (up to 90% savings).
Add cache_control: {"type": "ephemeral"} to any content block you want cached:
{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "You are a senior code reviewer. Here are the project conventions: ...(long text)...",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {"role": "user", "content": "Review this pull request."}
  ]
}
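If your existing requests pass `system` as a plain string, it has to become an array of blocks before `cache_control` can be attached. The helper below is a small sketch of that transformation; `with_cached_system` is a hypothetical name, and the choice to mark only the last system block is an assumption made for this example.

```python
import copy


def with_cached_system(body: dict) -> dict:
    """Return a copy of body whose system prompt carries a cache_control marker."""
    result = copy.deepcopy(body)  # leave the caller's dict untouched
    system = result.get("system")
    if not system:
        return result  # nothing to cache
    if isinstance(system, str):
        # Promote the string form to the block-array form.
        system = [{"type": "text", "text": system}]
    # Assumption for this sketch: mark the final system block as cacheable.
    system[-1]["cache_control"] = {"type": "ephemeral"}
    result["system"] = system
    return result


plain = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "system": "You are a senior code reviewer.",
    "messages": [{"role": "user", "content": "Review this pull request."}],
}
cached = with_cached_system(plain)
```

The original `plain` dict is left unchanged, so the same base request can be reused with or without caching.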
You can also cache tool definitions and conversation history:
{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "tools": [
    {
      "name": "search_codebase",
      "description": "Search the codebase for relevant files",
      "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}}
      },
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Here is the full file content: ...(long text)...",
          "cache_control": {"type": "ephemeral"}
        }
      ]
    }
  ]
}
When caching takes effect, the response usage includes additional fields:
| Field | Description |
|---|---|
| cache_creation_input_tokens | Tokens written to cache (charged at 1.25x input price) |
| cache_read_input_tokens | Tokens read from cache (charged at 0.1x input price) |
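The multipliers in the table translate into input cost roughly as follows. This is a worked sketch under two stated assumptions: the price of 3.00 per million input tokens is a made-up figure for illustration (not a quoted rate), and `input_tokens` is treated as counting only the tokens that were neither written to nor read from the cache.

```python
def input_cost(usage: dict, price_per_million: float) -> float:
    """Estimate input-side cost from a usage object (illustrative only)."""
    regular = usage.get("input_tokens") or 0
    written = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    # Cache writes cost 1.25x and cache reads 0.1x the normal input price.
    weighted_tokens = regular + 1.25 * written + 0.1 * read
    return weighted_tokens * price_per_million / 1_000_000


PRICE = 3.00  # hypothetical price per million input tokens

# First request: a 10,000-token prefix is written to the cache.
first = {"input_tokens": 25, "cache_creation_input_tokens": 10_000, "cache_read_input_tokens": 0}
# Follow-up request: the same prefix is read back from the cache.
second = {"input_tokens": 25, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 10_000}

print(f"{input_cost(first, PRICE):.6f}")   # 0.037575
print(f"{input_cost(second, PRICE):.6f}")  # 0.003075
```

Under these assumptions the cached prefix costs a tenth of its normal price on the follow-up request, which is where the "up to 90%" figure comes from; the first request pays a 25% premium to populate the cache.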
Code Examples
cURL

curl -X POST https://api.chuizi.ai/anthropic/v1/messages \
  -H "x-api-key: ck-your-key-here" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

cURL (streaming)

curl -X POST https://api.chuizi.ai/anthropic/v1/messages \
  -H "x-api-key: ck-your-key-here" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku about programming."}
    ]
  }'

Python

import anthropic

# Point the official SDK at Chuizi.AI; the SDK appends /v1/messages itself.
client = anthropic.Anthropic(
    base_url="https://api.chuizi.ai/anthropic",
    api_key="ck-your-key-here",
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain recursion in one paragraph."}
    ],
)
print(message.content[0].text)

Python (streaming)

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.chuizi.ai/anthropic",
    api_key="ck-your-key-here",
)

# text_stream yields text deltas as they arrive.
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a short story."}
    ],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

TypeScript (Node.js)

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "https://api.chuizi.ai/anthropic",
  apiKey: "ck-your-key-here",
});

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "What is WebAssembly?" },
  ],
});

// Content blocks are a union type; narrow to a text block before reading .text.
const first = message.content[0];
if (first.type === "text") {
  console.log(first.text);
}
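Clients that read the stream without an SDK need to fold the typed SSE events into text themselves. The sketch below shows one way to do that over an iterable of lines, using a shortened copy of the sample stream from the Streaming section. The function names are hypothetical, and the parser assumes each event carries a single-line JSON `data:` payload, as in that sample.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE lines."""
    event_name = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Assumption: one single-line JSON payload per event.
            yield event_name, json.loads(line[len("data:"):].strip())
            event_name = None


def collect_text(lines):
    """Accumulate text deltas and return (full_text, stop_reason)."""
    parts = []
    stop_reason = None
    for _name, payload in iter_sse_events(lines):
        if payload["type"] == "content_block_delta":
            delta = payload["delta"]
            if delta["type"] == "text_delta":
                parts.append(delta["text"])
        elif payload["type"] == "message_delta":
            stop_reason = payload["delta"].get("stop_reason")
    return "".join(parts), stop_reason


SAMPLE_STREAM = [
    'event: content_block_start',
    'data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}',
    '',
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"TCP uses"}}',
    '',
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" a three-way"}}',
    '',
    'event: message_delta',
    'data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":150}}',
    '',
    'event: message_stop',
    'data: {"type":"message_stop"}',
]

text, stop_reason = collect_text(SAMPLE_STREAM)
print(text)         # TCP uses a three-way
print(stop_reason)  # end_turn
```

In a real client the `lines` argument would be the decoded lines of the HTTP response body; the SDK examples above handle this for you.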
Claude Code / Cursor / Cline Setup
Claude Code
Add to your shell profile (~/.zshrc or ~/.bashrc):
export ANTHROPIC_BASE_URL=https://api.chuizi.ai/anthropic
export ANTHROPIC_API_KEY=ck-your-key-here
Restart your terminal. Claude Code works immediately with no other changes.
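If Claude Code does not pick up the new settings, a quick sanity check of the two variables can help. The script below is a hypothetical helper, not part of Claude Code or Chuizi.AI; it only inspects the environment and makes no network calls.

```python
import os

EXPECTED_BASE_URL = "https://api.chuizi.ai/anthropic"


def check_env(env=None) -> list:
    """Return a list of problems with the two variables; empty means they look right."""
    env = os.environ if env is None else env
    problems = []

    base_url = env.get("ANTHROPIC_BASE_URL", "")
    if base_url.rstrip("/") != EXPECTED_BASE_URL:
        problems.append(
            f"ANTHROPIC_BASE_URL is {base_url!r}, expected {EXPECTED_BASE_URL!r}"
        )

    api_key = env.get("ANTHROPIC_API_KEY", "")
    if not api_key.startswith("ck-"):
        problems.append("ANTHROPIC_API_KEY is missing or does not start with 'ck-'")

    return problems


if __name__ == "__main__":
    issues = check_env()
    for issue in issues:
        print(issue)
    if not issues:
        print("Environment looks correct.")
```

Run it in the same terminal session you launch Claude Code from, since variables exported in a shell profile only apply to shells started afterwards.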
Cursor
- Open Cursor Settings.
- Go to Models > Anthropic.
- Set API Base URL to https://api.chuizi.ai/anthropic.
- Set API Key to your ck- key.
Cline (VS Code Extension)
- Open Cline settings in VS Code.
- Select Anthropic as the provider.
- Set Base URL to https://api.chuizi.ai/anthropic.
- Set API Key to your ck- key.
Why Native Passthrough Matters
Most API gateways (including OpenRouter) only support the OpenAI Chat Completions format. When you send a request for a Claude model, they convert your OpenAI-format request into Anthropic format behind the scenes, then convert the response back. This translation layer:
- Loses Anthropic-specific features (prompt caching, extended thinking, citation blocks)
- Adds latency from format conversion
- Can introduce subtle bugs in edge cases (tool calling, multi-turn conversations)
Chuizi.AI's /anthropic endpoint does none of this. Your request goes to Anthropic exactly as you wrote it. The response comes back exactly as Anthropic sent it. The only thing Chuizi.AI touches is the authentication key and billing extraction.
Error Format
Errors follow Anthropic's error format:
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens is required"
  }
}
| Error Type | HTTP Status | Description |
|---|---|---|
| invalid_request_error | 400 | Malformed request or missing required fields |
| authentication_error | 401 | Invalid or missing API key |
| permission_error | 403 | API key does not have access to the requested model |
| not_found_error | 404 | Model not found |
| rate_limit_error | 429 | Rate limit exceeded |
| api_error | 500 | Upstream provider error |
For full error code details, see Error Codes. To learn about prompt caching savings, see Cache Discount Pricing.
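One way a client might act on the error format above is to parse the body and decide whether a retry is worthwhile. This is a sketch, not a prescribed policy: `parse_error` is a hypothetical helper, and treating `rate_limit_error` and `api_error` as retryable is an assumption made for this example.

```python
import json

# Assumption for this sketch: only these error types are worth retrying.
RETRYABLE_TYPES = {"rate_limit_error", "api_error"}


def parse_error(status: int, body_text: str) -> dict:
    """Extract type and message from an error body and flag retryable cases."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        payload = {}

    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}

    error_type = error.get("type", "unknown")
    return {
        "status": status,
        "type": error_type,
        "message": error.get("message", body_text),
        "retryable": error_type in RETRYABLE_TYPES,
    }


sample = '{"type": "error", "error": {"type": "invalid_request_error", "message": "max_tokens is required"}}'
info = parse_error(400, sample)
print(info["type"], info["retryable"])  # invalid_request_error False
```

Falling back to the raw body as the message keeps the helper useful when a non-JSON response (for example, from an intermediate proxy) comes back.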
Next Steps
- Choosing a Protocol — Compare OpenAI, Anthropic, and Gemini protocols side by side
- Claude Code Integration — Detailed setup guide for Claude Code with Chuizi.AI
- Cache Discount Pricing — Save up to 90% on input costs with Anthropic prompt caching