Gemini API

Chuizi.AI proxies the Gemini API as a native passthrough. Requests use Google's generateContent / streamGenerateContent format and are forwarded to Google unchanged, with automatic failover for high availability. No format conversion happens in either direction.

Use this protocol when you work with the Google AI SDK or any tool that expects the Gemini API format.

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| `POST` | `/gemini/v1beta/models/{model}:generateContent` | Generate content (non-streaming) |
| `POST` | `/gemini/v1beta/models/{model}:streamGenerateContent` | Generate content (streaming SSE) |
| `GET` | `/gemini/v1beta/models` | List available Gemini models |

The {model} parameter is the bare model name (e.g., gemini-2.5-flash, gemini-2.5-pro). The provider/model prefix format (google/gemini-2.5-flash) also works.
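As a rough sketch, the two generation paths can be assembled from a model name like this. The host `https://api.chuizi.ai` is taken from the curl examples on this page; the helper name `build_url` is hypothetical and not part of any SDK.

```python
BASE_URL = "https://api.chuizi.ai"  # host as used in the curl examples on this page


def build_url(model: str, stream: bool = False) -> str:
    """Return the generateContent or streamGenerateContent URL for a model.

    `model` may be a bare name ("gemini-2.5-flash") or the prefixed form
    ("google/gemini-2.5-flash"); it is inserted into the path unchanged.
    """
    action = "streamGenerateContent" if stream else "generateContent"
    return f"{BASE_URL}/gemini/v1beta/models/{model}:{action}"


print(build_url("gemini-2.5-flash"))
print(build_url("gemini-2.5-pro", stream=True))
```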

Authentication

Chuizi.AI accepts your ck- API key through these headers:

| Header | Format | Notes |
| --- | --- | --- |
| `x-goog-api-key` | `ck-your-key-here` | Google AI SDK default |
| `Authorization` | `Bearer ck-your-key-here` | General convention |

Both resolve to the same user account, balance, and rate limits.
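For clients that build requests by hand, the two header styles might be produced as follows. This is a minimal sketch; the function name `auth_headers` and its `style` argument are illustrative assumptions.

```python
def auth_headers(api_key: str, style: str = "google") -> dict:
    """Build request headers for either supported authentication style.

    style="google" -> x-goog-api-key: <key>
    style="bearer" -> Authorization: Bearer <key>
    """
    if style == "google":
        headers = {"x-goog-api-key": api_key}
    elif style == "bearer":
        headers = {"Authorization": f"Bearer {api_key}"}
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    headers["Content-Type"] = "application/json"
    return headers


print(auth_headers("ck-your-key-here"))
print(auth_headers("ck-your-key-here", style="bearer"))
```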

Request Format

The Gemini API uses a contents array where each entry has a role (user or model) and parts (an array of content pieces).

json

{
  "contents": [
    {
      "role": "user",
      "parts": [
        {"text": "Explain how garbage collection works in Go."}
      ]
    }
  ],
  "generationConfig": {
    "temperature": 0.7,
    "maxOutputTokens": 1024,
    "topP": 0.9,
    "topK": 40
  }
}

Supported Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `contents` | array | Yes | Conversation turns. Each entry has `role` and `parts`. |
| `generationConfig` | object | No | Controls generation behavior. |
| `systemInstruction` | object | No | System-level instruction. Same format as a content entry. |
| `tools` | array | No | Tool/function definitions. |
| `toolConfig` | object | No | Tool calling configuration. |
| `safetySettings` | array | No | Content safety thresholds. |

generationConfig Options

| Field | Type | Description |
| --- | --- | --- |
| `temperature` | number | Sampling temperature, 0-2. |
| `topP` | number | Nucleus sampling threshold. |
| `topK` | integer | Top-K sampling. |
| `maxOutputTokens` | integer | Maximum tokens to generate. |
| `stopSequences` | array | Up to 5 custom stop sequences. |
| `responseMimeType` | string | `text/plain` or `application/json` for JSON mode. |
| `responseSchema` | object | JSON schema for structured output (when `responseMimeType` is `application/json`). |
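One way to assemble a request body from these fields is sketched below. The helper `build_request` and its keyword names are hypothetical. The range checks mirror the limits listed in the table and run client-side only; the API performs its own validation.

```python
import json


def build_request(prompt, *, temperature=None, max_output_tokens=None,
                  stop_sequences=None, json_mode=False):
    """Build a generateContent body with a single user turn."""
    config = {}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        config["temperature"] = temperature
    if max_output_tokens is not None:
        config["maxOutputTokens"] = max_output_tokens
    if stop_sequences:
        if len(stop_sequences) > 5:
            raise ValueError("at most 5 stop sequences are allowed")
        config["stopSequences"] = list(stop_sequences)
    if json_mode:
        config["responseMimeType"] = "application/json"

    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    if config:  # omit generationConfig entirely when nothing was set
        body["generationConfig"] = config
    return body


print(json.dumps(build_request("Explain how garbage collection works in Go.",
                               temperature=0.7, max_output_tokens=1024), indent=2))
```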

Multi-turn Conversation

Alternate user and model roles for multi-turn conversations:

json

{
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "What is Rust?"}]
    },
    {
      "role": "model",
      "parts": [{"text": "Rust is a systems programming language focused on safety and performance."}]
    },
    {
      "role": "user",
      "parts": [{"text": "How does its ownership model work?"}]
    }
  ]
}
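A small helper can keep the turn history in the shape shown above. The `Conversation` class is a hypothetical sketch, and rejecting two consecutive turns with the same role is a choice made for this example rather than documented API behavior.

```python
class Conversation:
    """Accumulates alternating user/model turns as a Gemini `contents` array."""

    def __init__(self):
        self.contents = []

    def add(self, role: str, text: str) -> "Conversation":
        if role not in ("user", "model"):
            raise ValueError("role must be 'user' or 'model'")
        if self.contents and self.contents[-1]["role"] == role:
            raise ValueError("roles must alternate between user and model")
        self.contents.append({"role": role, "parts": [{"text": text}]})
        return self  # allow chained calls

    def to_request(self) -> dict:
        return {"contents": list(self.contents)}


chat = Conversation()
chat.add("user", "What is Rust?")
chat.add("model", "Rust is a systems programming language focused on safety and performance.")
chat.add("user", "How does its ownership model work?")
print([turn["role"] for turn in chat.to_request()["contents"]])
```

After each reply, append the model's text with `add("model", ...)` before adding the next user turn, then send `to_request()` again.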

System Instruction

Pass system-level guidance in systemInstruction, a top-level field alongside contents:

json

{
  "systemInstruction": {
    "parts": [
      {"text": "You are a database expert. Answer questions about SQL optimization."}
    ]
  },
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "How do I optimize a slow JOIN query?"}]
    }
  ]
}

Response Format

Non-streaming

json

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {"text": "Go uses a concurrent, tri-color mark-and-sweep garbage collector..."}
        ]
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 12,
    "candidatesTokenCount": 180,
    "totalTokenCount": 192
  },
  "modelVersion": "gemini-2.5-flash"
}
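Reading the reply out of this structure could look like the sketch below. The helper names `extract_text` and `token_usage` are hypothetical, and the sample dict is a shortened copy of the response shown above.

```python
def extract_text(response: dict) -> str:
    """Concatenate the text parts of the first candidate ('' if there is none)."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


def token_usage(response: dict) -> dict:
    """Return prompt, output, and total token counts, defaulting to 0."""
    meta = response.get("usageMetadata", {})
    return {
        "prompt": meta.get("promptTokenCount", 0),
        "output": meta.get("candidatesTokenCount", 0),
        "total": meta.get("totalTokenCount", 0),
    }


sample = {
    "candidates": [{
        "content": {"role": "model", "parts": [{"text": "Go uses a concurrent GC..."}]},
        "finishReason": "STOP",
        "index": 0,
    }],
    "usageMetadata": {"promptTokenCount": 12, "candidatesTokenCount": 180, "totalTokenCount": 192},
}
print(extract_text(sample))
print(token_usage(sample))
```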

Streaming

When using streamGenerateContent, the response is a series of SSE events. Each event carries a partial candidate; the final event also includes finishReason and usageMetadata:

data: {"candidates":[{"content":{"role":"model","parts":[{"text":"Go uses"}]},"index":0}]}

data: {"candidates":[{"content":{"role":"model","parts":[{"text":" a concurrent"}]},"index":0}]}

data: {"candidates":[{"content":{"role":"model","parts":[{"text":" garbage collector..."}]},"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":12,"candidatesTokenCount":180,"totalTokenCount":192}}
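The events above can be reassembled client-side roughly as follows. This sketch assumes each event arrives as a single `data:` line, as in the sample, and that only one candidate is returned. The function names are hypothetical.

```python
import json


def iter_sse_json(lines):
    """Yield the decoded JSON payload of every `data:` line, skipping other lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def collect_stream(lines):
    """Join streamed text and capture the last finishReason and usageMetadata seen."""
    pieces, finish_reason, usage = [], None, None
    for event in iter_sse_json(lines):
        for candidate in event.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                pieces.append(part.get("text", ""))
            finish_reason = candidate.get("finishReason", finish_reason)
        usage = event.get("usageMetadata", usage)
    return "".join(pieces), finish_reason, usage


sample_lines = [
    'data: {"candidates":[{"content":{"role":"model","parts":[{"text":"Go uses"}]},"index":0}]}',
    '',
    'data: {"candidates":[{"content":{"role":"model","parts":[{"text":" a concurrent"}]},"index":0}]}',
    '',
    'data: {"candidates":[{"content":{"role":"model","parts":[{"text":" garbage collector..."}]},'
    '"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":12,'
    '"candidatesTokenCount":180,"totalTokenCount":192}}',
]
text, reason, usage = collect_stream(sample_lines)
print(text, reason, usage["totalTokenCount"])
```

In a real client, `lines` would be the decoded lines of the HTTP response body, read incrementally.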

Code Examples

terminal

bash

curl -X POST "https://api.chuizi.ai/gemini/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "x-goog-api-key: ck-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What is WebAssembly?"}]
      }
    ],
    "generationConfig": {
      "maxOutputTokens": 512
    }
  }'

terminal

bash

# -N turns off curl's output buffering so events print as they arrive.
curl -N -X POST "https://api.chuizi.ai/gemini/v1beta/models/gemini-2.5-flash:streamGenerateContent" \
  -H "x-goog-api-key: ck-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Write a short poem about code."}]
      }
    ]
  }'

example.py

python

import google.generativeai as genai

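# Send SDK traffic to Chuizi.AI instead of Google's default endpoint.
genai.configure(
    api_key="ck-your-key-here",  # your Chuizi.AI key, not a Google key
    transport="rest",
    client_options={"api_endpoint": "api.chuizi.ai/gemini"},
)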

model = genai.GenerativeModel("gemini-2.5-flash")
response = model.generate_content("Explain how DNS works.")
print(response.text)

example.py

python

import google.generativeai as genai

genai.configure(
    api_key="ck-your-key-here",
    transport="rest",
    client_options={"api_endpoint": "api.chuizi.ai/gemini"},
)

model = genai.GenerativeModel("gemini-2.5-flash")
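# stream=True returns an iterator of partial responses instead of one final response.
response = model.generate_content("Write a tutorial outline.", stream=True)

for chunk in response:
    print(chunk.text, end="", flush=True)
print()  # end the streamed output with a newline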

index.mjs

javascript

import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI("ck-your-key-here");
// Point to Chuizi.AI
const model = genAI.getGenerativeModel(
  { model: "gemini-2.5-flash" },
  { baseUrl: "https://api.chuizi.ai/gemini" },
);

const result = await model.generateContent("What is gRPC?");
console.log(result.response.text());

Key Differences from OpenAI Format

| Concept | OpenAI Chat Completions | Gemini API |
| --- | --- | --- |
| Message container | `messages` array | `contents` array |
| Role names | `system`, `user`, `assistant` | `user`, `model` (system uses `systemInstruction`) |
| Content format | `content` (string or parts) | `parts` array (`[{"text": "..."}]`) |
| Max tokens | `max_tokens` | `generationConfig.maxOutputTokens` |
| Temperature | Top-level `temperature` | `generationConfig.temperature` |
| Response text | `choices[0].message.content` | `candidates[0].content.parts[0].text` |
| Token usage | `usage.prompt_tokens` | `usageMetadata.promptTokenCount` |
| Model selection | Body parameter `model` | URL path segment `/models/{model}:generateContent` |
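To make the mapping concrete, here is a minimal sketch that converts an OpenAI-style chat payload into a Gemini request body. It handles plain string `content` only, with no tool calls or multi-part content, and the function name `openai_to_gemini` is hypothetical.

```python
def openai_to_gemini(payload: dict):
    """Return (model, body): the model goes in the URL path, the body is posted."""
    system_texts, contents = [], []
    for message in payload.get("messages", []):
        role, text = message["role"], message["content"]
        if role == "system":
            # Gemini has no system role; these move to systemInstruction.
            system_texts.append(text)
        else:
            gemini_role = "model" if role == "assistant" else "user"
            contents.append({"role": gemini_role, "parts": [{"text": text}]})

    body = {"contents": contents}
    if system_texts:
        body["systemInstruction"] = {"parts": [{"text": "\n".join(system_texts)}]}

    config = {}
    if "max_tokens" in payload:
        config["maxOutputTokens"] = payload["max_tokens"]
    if "temperature" in payload:
        config["temperature"] = payload["temperature"]
    if config:
        body["generationConfig"] = config
    return payload.get("model"), body


model, body = openai_to_gemini({
    "model": "gemini-2.5-flash",
    "messages": [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "What is gRPC?"},
    ],
    "max_tokens": 256,
})
print(model, body)
```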

Error Format

Errors follow Google's error format:

json

{
  "error": {
    "code": 400,
    "message": "Invalid request: contents is required and must contain at least one item",
    "status": "INVALID_ARGUMENT"
  }
}

| Status | Code | Description |
| --- | --- | --- |
| `INVALID_ARGUMENT` | 400 | Malformed request or missing required fields |
| `UNAUTHENTICATED` | 401 | Invalid or missing API key |
| `PERMISSION_DENIED` | 403 | API key does not have access to the requested model |
| `NOT_FOUND` | 404 | Model not found |
| `RESOURCE_EXHAUSTED` | 429 | Rate limit exceeded |
| `INTERNAL` | 500 | Upstream provider error |
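A client might turn this error body into an exception along these lines. The class `GeminiAPIError` is hypothetical. Treating `RESOURCE_EXHAUSTED` and `INTERNAL` as retryable is an assumption made for this sketch, not documented Chuizi.AI behavior.

```python
import json

# Assumption: rate limits and upstream failures are transient and worth retrying.
RETRYABLE_STATUSES = {"RESOURCE_EXHAUSTED", "INTERNAL"}


class GeminiAPIError(Exception):
    """Carries the code, status, and message from a Google-style error body."""

    def __init__(self, code, status, message):
        super().__init__(f"{code} {status}: {message}")
        self.code = code
        self.status = status
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def parse_error(body: str) -> GeminiAPIError:
    """Build a GeminiAPIError from the raw JSON text of an error response."""
    error = json.loads(body).get("error", {})
    return GeminiAPIError(error.get("code"), error.get("status"), error.get("message"))


sample_error = json.dumps({
    "error": {
        "code": 429,
        "message": "Rate limit exceeded",
        "status": "RESOURCE_EXHAUSTED",
    }
})
err = parse_error(sample_error)
print(err, "| retryable:", err.retryable)
```

Callers would typically back off and retry when `retryable` is true and surface the message to the user otherwise.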

Next Steps

Choosing a Protocol — Compare all supported protocols side by side
Anthropic Messages API — Native passthrough for Claude models
Streaming Guide — Handle SSE streaming responses from Gemini