Anthropic Messages API

Chuizi.AI proxies the Anthropic Messages API as a native passthrough. Your request body goes directly to Anthropic with zero format conversion, and automatic failover ensures high availability. The response comes back in Anthropic's own format, including streaming SSE event types.

This matters if you use Claude Code, Cursor, Cline, or OpenCode. These tools expect the Anthropic Messages API, not OpenAI's format. With Chuizi.AI, you change only the base URL and API key (two environment variables for Claude Code) and the tool works without further changes.

Endpoints

Method	Path	Description
`POST`	`/anthropic/v1/messages`	Create a message (streaming and non-streaming)
`GET`	`/anthropic/v1/models`	List available Anthropic models

Authentication

Chuizi.AI accepts your `ck-` API key through either of two headers. Use whichever your tool expects:

Header	Format	Convention
`x-api-key`	`ck-your-key-here`	Anthropic SDK default
`Authorization`	`Bearer ck-your-key-here`	OpenAI convention

Both resolve to the same user account, balance, and rate limits.

Required Headers

Header	Value	Notes
`anthropic-version`	`2023-06-01`	Required. Passed through to upstream.
`content-type`	`application/json`	Required for POST requests.
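
As a minimal sketch of the two tables above, the helper below assembles the headers for a Messages request using either authentication style. The function name `build_headers` and its `style` argument are illustrative only and not part of any SDK.

```python
def build_headers(api_key: str, style: str = "x-api-key") -> dict:
    """Return headers for POST /anthropic/v1/messages.

    `style` picks which of the two accepted auth headers to send:
    "x-api-key" (Anthropic SDK default) or "bearer" (OpenAI convention).
    """
    headers = {
        "anthropic-version": "2023-06-01",   # required, passed upstream
        "content-type": "application/json",  # required for POST
    }
    if style == "x-api-key":
        headers["x-api-key"] = api_key
    elif style == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return headers
```

Either result can be passed to any HTTP client; both styles resolve to the same account.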

How It Differs from OpenAI `/v1/chat/completions`

If you are used to the OpenAI Chat Completions format, these are the key structural differences in Anthropic's Messages API:

Concept	OpenAI Chat Completions	Anthropic Messages
System prompt	`messages` array entry with `role: "system"`	Top-level `system` field (string or array of blocks)
Content format	String or array of content parts	String or array of content blocks in requests; always an array of content blocks in responses (`[{type: "text", text: "..."}]`)
Stop indicator	`finish_reason: "stop"`	`stop_reason: "end_turn"`
Max tokens	Optional (`max_tokens` or `max_completion_tokens`)	Required (`max_tokens`)
Token usage	`usage.prompt_tokens`, `usage.completion_tokens`	`usage.input_tokens`, `usage.output_tokens`
Streaming events	`data: {"choices": [...]}` chunks	Typed events: `message_start`, `content_block_delta`, `message_delta`
Model prefix	`anthropic/claude-sonnet-4-6`	`claude-sonnet-4-6` (bare name) or `anthropic/claude-sonnet-4-6`
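
If you are migrating existing code, the request-side rows of this table can be sketched as a small conversion function. This is an illustration under simplifying assumptions, not something the gateway runs: `openai_to_anthropic` is a hypothetical helper, it handles only plain-string message content, and the fallback of 1024 for `max_tokens` is an arbitrary choice.

```python
def openai_to_anthropic(payload: dict, default_max_tokens: int = 1024) -> dict:
    """Convert a Chat Completions-style request body to Messages format.

    Tools and multimodal content parts are out of scope for this sketch.
    """
    system_parts = []
    messages = []
    for msg in payload["messages"]:
        if msg["role"] == "system":
            # Anthropic takes the system prompt as a top-level field.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    body = {
        "model": payload["model"],
        # Optional for OpenAI, required by Anthropic.
        "max_tokens": payload.get("max_completion_tokens")
        or payload.get("max_tokens")
        or default_max_tokens,
        "messages": messages,
    }
    if system_parts:
        body["system"] = "\n\n".join(system_parts)
    return body
```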

Request Format

config.json

json

{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "system": "You are a helpful assistant.",
  "messages": [
    {
      "role": "user",
      "content": "Explain how TCP handshakes work."
    }
  ]
}

Supported Parameters

Parameter	Type	Required	Description
`model`	string	Yes	Model name. Bare names (`claude-sonnet-4-6`) and prefixed names (`anthropic/claude-sonnet-4-6`) both work.
`max_tokens`	integer	Yes	Maximum tokens to generate.
`messages`	array	Yes	Conversation messages.
`system`	string or array	No	System prompt. Can be a string or an array of content blocks (useful for caching).
`stream`	boolean	No	Enable SSE streaming. Default: `false`.
`temperature`	number	No	Sampling temperature, 0-1.
`top_p`	number	No	Nucleus sampling threshold.
`top_k`	integer	No	Top-K sampling.
`stop_sequences`	array	No	Custom stop sequences.
`tools`	array	No	Tool definitions for function calling.
`tool_choice`	object	No	Tool selection strategy.
`metadata`	object	No	Request metadata (e.g., `user_id` for abuse tracking).
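
A client-side check of the required parameters can catch mistakes before a request is sent. The sketch below is one possible shape; `build_message_request` is a hypothetical helper, and the validation it performs covers only the constraints listed in the table above.

```python
def build_message_request(model: str, max_tokens: int, messages: list, **optional) -> dict:
    """Build a Messages request body, checking the three required fields.

    Optional parameters (system, stream, temperature, ...) are passed as
    keyword arguments; any that are None are left out of the body.
    """
    if not model:
        raise ValueError("model is required")
    if not isinstance(max_tokens, int) or max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")
    if not messages:
        raise ValueError("messages must contain at least one entry")

    temperature = optional.get("temperature")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    body = {"model": model, "max_tokens": max_tokens, "messages": messages}
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

The returned dictionary can be serialized with `json.dumps` and sent as the request body.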

Response Format

Non-streaming

config.json

json

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "TCP uses a three-way handshake to establish a connection..."
    }
  ],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150
  }
}
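
Because `content` is an array of blocks, reading the reply means filtering for `text` blocks rather than indexing a single string. A minimal sketch, with `extract_text` and `total_tokens` as illustrative helper names operating on the parsed JSON response:

```python
def extract_text(message: dict) -> str:
    """Concatenate the text of every `text` block in a Messages response."""
    return "".join(
        block["text"] for block in message["content"] if block["type"] == "text"
    )


def total_tokens(message: dict) -> int:
    """Sum the input and output token counts reported in `usage`."""
    usage = message["usage"]
    return usage["input_tokens"] + usage["output_tokens"]
```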

Streaming

When `stream` is `true`, the response uses Anthropic's SSE event types:

event: message_start
data: {"type":"message_start","message":{"id":"msg_01...","type":"message","role":"assistant","model":"claude-sonnet-4-6","usage":{"input_tokens":25,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"TCP uses"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" a three-way"}}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":150}}

event: message_stop
data: {"type":"message_stop"}
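
If you consume the stream without an SDK, the events above can be reassembled roughly as follows. This is a simplified sketch: it assumes the whole stream is already available as one string with `\n` line endings and one `data:` line per event, and the function names are illustrative.

```python
import json


def iter_sse_events(raw: str):
    """Yield (event_name, data_dict) pairs from an SSE text stream."""
    for chunk in raw.strip().split("\n\n"):
        event, data = None, None
        for line in chunk.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data = json.loads(line[len("data:"):].strip())
        if event and data is not None:
            yield event, data


def collect_stream(raw: str):
    """Return (full_text, stop_reason) accumulated from the events."""
    text_parts, stop_reason = [], None
    for event, data in iter_sse_events(raw):
        if event == "content_block_delta" and data["delta"]["type"] == "text_delta":
            text_parts.append(data["delta"]["text"])
        elif event == "message_delta":
            stop_reason = data["delta"].get("stop_reason")
    return "".join(text_parts), stop_reason


SAMPLE_STREAM = """\
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"TCP uses"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" a three-way"}}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":150}}

event: message_stop
data: {"type":"message_stop"}
"""

text, stop_reason = collect_stream(SAMPLE_STREAM)
print(text, stop_reason)
```

In a real client you would feed lines to the parser incrementally as they arrive instead of buffering the whole response.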

Prompt Caching

Anthropic supports explicit prompt caching via `cache_control` fields on content blocks. You can mark system prompts, messages, and tool definitions as cacheable. Tokens read from the cache are billed at 0.1x the input price, so cached content costs up to 90% less on subsequent requests.

Add `cache_control: {"type": "ephemeral"}` to any content block you want cached:

config.json

json

{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "You are a senior code reviewer. Here are the project conventions: ...(long text)...",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {"role": "user", "content": "Review this pull request."}
  ]
}

You can also cache tool definitions and conversation history:

config.json

json

{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "tools": [
    {
      "name": "search_codebase",
      "description": "Search the codebase for relevant files",
      "input_schema": {"type": "object", "properties": {"query": {"type": "string"}}},
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Here is the full file content: ...(long text)...",
          "cache_control": {"type": "ephemeral"}
        }
      ]
    }
  ]
}
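
When request bodies are built in code, the marker can be added with a small helper instead of being written out by hand. A sketch, with `mark_cacheable` and `cached_system_prompt` as hypothetical names:

```python
def mark_cacheable(block: dict) -> dict:
    """Return a copy of a content block or tool definition with cache_control set."""
    return {**block, "cache_control": {"type": "ephemeral"}}


def cached_system_prompt(text: str) -> list:
    """Wrap a long system prompt as a single cacheable text block."""
    return [mark_cacheable({"type": "text", "text": text})]
```

The result of `cached_system_prompt(...)` goes in the top-level `system` field, and `mark_cacheable(...)` can wrap an entry in `tools` or a block in a message's `content` array.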

When caching takes effect, the response usage includes additional fields:

Field	Description
`cache_creation_input_tokens`	Tokens written to cache (charged at 1.25x input price)
`cache_read_input_tokens`	Tokens read from cache (charged at 0.1x input price)
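
The multipliers above can be turned into a rough input-cost estimate. The sketch below assumes that `input_tokens` counts only uncached tokens (so the three fields add up to the total input) and takes the base input price as a parameter; the price used in the example call is a placeholder, not a quoted rate.

```python
CACHE_WRITE_MULTIPLIER = 1.25  # cache_creation_input_tokens
CACHE_READ_MULTIPLIER = 0.1    # cache_read_input_tokens


def estimate_input_cost(usage: dict, price_per_million: float) -> float:
    """Estimate input-side cost from a response `usage` object.

    `price_per_million` is the base input price per one million tokens.
    """
    weighted_tokens = (
        usage.get("input_tokens", 0)
        + usage.get("cache_creation_input_tokens", 0) * CACHE_WRITE_MULTIPLIER
        + usage.get("cache_read_input_tokens", 0) * CACHE_READ_MULTIPLIER
    )
    return weighted_tokens * price_per_million / 1_000_000


# Placeholder price of 3.0 per million input tokens, for illustration only.
usage = {"input_tokens": 100, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 10_000}
print(estimate_input_cost(usage, price_per_million=3.0))
```

Here 10,000 cached tokens are billed like 1,000 regular ones, which is where the 90% saving comes from.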

Code Examples

terminal

bash

curl -X POST https://api.chuizi.ai/anthropic/v1/messages \
  -H "x-api-key: ck-your-key-here" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

terminal

bash

curl -X POST https://api.chuizi.ai/anthropic/v1/messages \
  -H "x-api-key: ck-your-key-here" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku about programming."}
    ]
  }'

example.py

python

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.chuizi.ai/anthropic",
    api_key="ck-your-key-here",
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain recursion in one paragraph."}
    ],
)
print(message.content[0].text)

example.py

python

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.chuizi.ai/anthropic",
    api_key="ck-your-key-here",
)

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a short story."}
    ],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

index.mjs

javascript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "https://api.chuizi.ai/anthropic",
  apiKey: "ck-your-key-here",
});

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "What is WebAssembly?" },
  ],
});
console.log(message.content[0].text);

Claude Code / Cursor / Cline Setup

Claude Code

Add to your shell profile (~/.zshrc or ~/.bashrc):

terminal

bash

export ANTHROPIC_BASE_URL=https://api.chuizi.ai/anthropic
export ANTHROPIC_API_KEY=ck-your-key-here

Restart your terminal. Claude Code works immediately with no other changes.
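
If a tool still fails to connect, a quick check of the two variables can rule out a typo. The following is an illustrative sanity check only; `load_gateway_config` is a hypothetical helper and not something Claude Code itself runs.

```python
import os


def load_gateway_config(env=None):
    """Validate the two variables and return (messages_url, api_key)."""
    env = os.environ if env is None else env
    base_url = env.get("ANTHROPIC_BASE_URL", "").rstrip("/")
    api_key = env.get("ANTHROPIC_API_KEY", "")

    if not base_url.endswith("/anthropic"):
        raise ValueError("ANTHROPIC_BASE_URL should end with /anthropic")
    if not api_key.startswith("ck-"):
        raise ValueError("ANTHROPIC_API_KEY should be a ck- key")

    # The Messages endpoint lives under the base URL.
    return f"{base_url}/v1/messages", api_key
```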

Cursor

Open Cursor Settings.
Go to Models > Anthropic.
Set API Base URL to https://api.chuizi.ai/anthropic.
Set API Key to your ck- key.

Cline (VS Code Extension)

Open Cline settings in VS Code.
Select Anthropic as the provider.
Set Base URL to https://api.chuizi.ai/anthropic.
Set API Key to your ck- key.

Why Native Passthrough Matters

Most API gateways (including OpenRouter) only support the OpenAI Chat Completions format. When you send a request for a Claude model, they convert your OpenAI-format request into Anthropic format behind the scenes, then convert the response back. This translation layer:

Loses Anthropic-specific features (prompt caching, extended thinking, citation blocks)
Adds latency from format conversion
Can introduce subtle bugs in edge cases (tool calling, multi-turn conversations)

Chuizi.AI's `/anthropic` endpoint does none of this. Your request goes to Anthropic exactly as you wrote it, and the response comes back exactly as Anthropic sent it. Chuizi.AI only handles authentication and reads the usage data for billing.

Error Format

Errors follow Anthropic's error format:

config.json

json

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens is required"
  }
}

Error Type	HTTP Status	Description
`invalid_request_error`	400	Malformed request or missing required fields
`authentication_error`	401	Invalid or missing API key
`permission_error`	403	API key does not have access to the requested model
`not_found_error`	404	Model not found
`rate_limit_error`	429	Rate limit exceeded
`api_error`	500	Upstream provider error
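
A client can read the error type from the body and decide whether to retry. In the sketch below, treating 429 and 500 as retryable is an assumption made for illustration rather than documented gateway behavior, and `parse_error` is a hypothetical helper.

```python
import json

# Assumption: rate limits and upstream errors are worth retrying with backoff.
RETRYABLE_STATUSES = {429, 500}


def parse_error(status: int, body: str) -> dict:
    """Extract the error type and message from an Anthropic-format error body."""
    payload = json.loads(body)
    error = payload.get("error", {})
    return {
        "status": status,
        "type": error.get("type"),
        "message": error.get("message"),
        "retryable": status in RETRYABLE_STATUSES,
    }
```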

For full error code details, see Error Codes. To learn about prompt caching savings, see Cache Discount Pricing.

Next Steps

Choosing a Protocol — Compare OpenAI, Anthropic, and Gemini protocols side by side
Claude Code Integration — Detailed setup guide for Claude Code with Chuizi.AI
Cache Discount Pricing — Save up to 90% on input costs with Anthropic prompt caching
