Gemini API
Chuizi.AI proxies the Gemini API as a native passthrough. Requests follow Google's generateContent / streamGenerateContent format and go directly to Google with automatic failover for high availability. No format conversion happens in either direction.
Use this protocol when you work with the Google AI SDK or any tool that expects the Gemini API format.
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /gemini/v1beta/models/{model}:generateContent | Generate content (non-streaming) |
| POST | /gemini/v1beta/models/{model}:streamGenerateContent | Generate content (streaming SSE) |
| GET | /gemini/v1beta/models | List available Gemini models |
The {model} parameter is the bare model name (e.g., gemini-2.5-flash, gemini-2.5-pro). The provider/model prefix format (google/gemini-2.5-flash) also works.
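As a rough illustration of how the endpoint paths are assembled, the sketch below builds the two generation URLs from a model name. The helper name `build_url` and the `BASE_URL` constant are hypothetical; the host is taken from the curl examples in this page. It also assumes that a `provider/model` name is placed in the path unchanged, which this page does not spell out.

```python
# Hypothetical helper: assembles Chuizi.AI Gemini endpoint URLs.
# BASE_URL is taken from the curl examples in this document.
BASE_URL = "https://api.chuizi.ai/gemini/v1beta"


def build_url(model: str, stream: bool = False) -> str:
    """Return the generateContent or streamGenerateContent URL for a model.

    Assumption: a prefixed name such as "google/gemini-2.5-flash" is
    inserted into the path as-is, without URL-encoding the slash.
    """
    action = "streamGenerateContent" if stream else "generateContent"
    return f"{BASE_URL}/models/{model}:{action}"


if __name__ == "__main__":
    print(build_url("gemini-2.5-flash"))
    print(build_url("gemini-2.5-pro", stream=True))
```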
Authentication
Chuizi.AI accepts your ck- API key through these headers:
| Header | Format | Notes |
|---|---|---|
| x-goog-api-key | ck-your-key-here | Google AI SDK default |
| Authorization | Bearer ck-your-key-here | General convention |
Both resolve to the same user account, balance, and rate limits.
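The two header styles can be produced by a small helper like the one below. This is only a sketch: the function name `auth_headers` and the `style` argument are hypothetical, and the `Content-Type` header is added because the curl examples in this page send it.

```python
def auth_headers(api_key: str, style: str = "google") -> dict:
    """Build request headers for either supported authentication style.

    style="google" -> x-goog-api-key header (Google AI SDK default)
    style="bearer" -> Authorization: Bearer header (general convention)
    """
    if style == "google":
        headers = {"x-goog-api-key": api_key}
    elif style == "bearer":
        headers = {"Authorization": f"Bearer {api_key}"}
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    # The request body is JSON in both cases.
    headers["Content-Type"] = "application/json"
    return headers


if __name__ == "__main__":
    print(auth_headers("ck-your-key-here"))
    print(auth_headers("ck-your-key-here", style="bearer"))
```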
Request Format
The Gemini API uses a contents array where each entry has a role (user or model) and parts (an array of content pieces).
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {"text": "Explain how garbage collection works in Go."}
      ]
    }
  ],
  "generationConfig": {
    "temperature": 0.7,
    "maxOutputTokens": 1024,
    "topP": 0.9,
    "topK": 40
  }
}
Supported Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| contents | array | Yes | Conversation turns. Each entry has role and parts. |
| generationConfig | object | No | Controls generation behavior. |
| systemInstruction | object | No | System-level instruction. Same format as a content entry. |
| tools | array | No | Tool/function definitions. |
| toolConfig | object | No | Tool calling configuration. |
| safetySettings | array | No | Content safety thresholds. |
generationConfig Options
| Field | Type | Description |
|---|---|---|
| temperature | number | Sampling temperature, 0-2. |
| topP | number | Nucleus sampling threshold. |
| topK | integer | Top-K sampling. |
| maxOutputTokens | integer | Maximum tokens to generate. |
| stopSequences | array | Up to 5 custom stop sequences. |
| responseMimeType | string | text/plain or application/json for JSON mode. |
| responseSchema | object | JSON schema for structured output (when responseMimeType is application/json). |
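To show how responseMimeType and responseSchema fit together, here is a minimal sketch that builds a JSON-mode request body. The function name `json_mode_request` and the example schema are illustrative assumptions, not part of the API. The uppercase type names follow the style used in Google's schema examples.

```python
import json

# Illustrative schema (an assumption): an object with a name and a year.
EXAMPLE_SCHEMA = {
    "type": "OBJECT",
    "properties": {
        "name": {"type": "STRING"},
        "year": {"type": "INTEGER"},
    },
    "required": ["name", "year"],
}


def json_mode_request(prompt: str, schema: dict, max_output_tokens: int = 512) -> dict:
    """Build a request body that asks for JSON output matching `schema`."""
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "maxOutputTokens": max_output_tokens,
            # responseSchema only applies when the MIME type is application/json.
            "responseMimeType": "application/json",
            "responseSchema": schema,
        },
    }


if __name__ == "__main__":
    body = json_mode_request("Name one programming language and its release year.", EXAMPLE_SCHEMA)
    print(json.dumps(body, indent=2))
```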
Multi-turn Conversation
Alternate user and model roles for multi-turn conversations:
{
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "What is Rust?"}]
    },
    {
      "role": "model",
      "parts": [{"text": "Rust is a systems programming language focused on safety and performance."}]
    },
    {
      "role": "user",
      "parts": [{"text": "How does its ownership model work?"}]
    }
  ]
}
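One way to keep a conversation history in the alternating user/model shape is sketched below. The helper name `add_turn` is hypothetical, and it handles text parts only.

```python
def add_turn(contents: list, role: str, text: str) -> list:
    """Append one text turn to a Gemini-style contents list and return it."""
    if role not in ("user", "model"):
        # The Gemini format has no "assistant" or "system" role in contents.
        raise ValueError(f"role must be 'user' or 'model', got {role!r}")
    contents.append({"role": role, "parts": [{"text": text}]})
    return contents


if __name__ == "__main__":
    history = []
    add_turn(history, "user", "What is Rust?")
    add_turn(history, "model", "Rust is a systems programming language focused on safety and performance.")
    add_turn(history, "user", "How does its ownership model work?")
    request_body = {"contents": history}
    print([turn["role"] for turn in request_body["contents"]])
```

After each response, appending the model's reply with `role="model"` before the next user turn keeps the history valid for the following request.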
System Instruction
{
  "systemInstruction": {
    "parts": [
      {"text": "You are a database expert. Answer questions about SQL optimization."}
    ]
  },
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "How do I optimize a slow JOIN query?"}]
    }
  ]
}
Response Format
Non-streaming
{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {"text": "Go uses a concurrent, tri-color mark-and-sweep garbage collector..."}
        ]
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 12,
    "candidatesTokenCount": 180,
    "totalTokenCount": 192
  },
  "modelVersion": "gemini-2.5-flash"
}
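Reading the generated text and token counts out of a non-streaming response can be sketched as follows. The helper names `extract_text` and `extract_usage` are hypothetical; the field names are the ones in the response shown in this section. The sketch takes the first candidate only and joins its text parts.

```python
def extract_text(response: dict) -> str:
    """Join the text parts of the first candidate; return "" if there is none."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


def extract_usage(response: dict) -> tuple:
    """Return (prompt, candidates, total) token counts, defaulting to 0."""
    usage = response.get("usageMetadata", {})
    return (
        usage.get("promptTokenCount", 0),
        usage.get("candidatesTokenCount", 0),
        usage.get("totalTokenCount", 0),
    )


if __name__ == "__main__":
    sample = {
        "candidates": [
            {
                "content": {"role": "model", "parts": [{"text": "Go uses a concurrent GC."}]},
                "finishReason": "STOP",
                "index": 0,
            }
        ],
        "usageMetadata": {"promptTokenCount": 12, "candidatesTokenCount": 180, "totalTokenCount": 192},
    }
    print(extract_text(sample))
    print(extract_usage(sample))
```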
Streaming
When using streamGenerateContent, the response is a series of SSE events. Each event contains a partial candidate:
data: {"candidates":[{"content":{"role":"model","parts":[{"text":"Go uses"}]},"index":0}]}
data: {"candidates":[{"content":{"role":"model","parts":[{"text":" a concurrent"}]},"index":0}]}
data: {"candidates":[{"content":{"role":"model","parts":[{"text":" garbage collector..."}]},"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":12,"candidatesTokenCount":180,"totalTokenCount":192}}
Code Examples
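For clients that do not use an SDK, the SSE events described in the Streaming section can be consumed with a small parser. The sketch below is a minimal version under a few assumptions: each event arrives as a single `data:` line holding one JSON object, and only text parts are of interest. The function name `iter_sse_text` is hypothetical.

```python
import json


def iter_sse_text(lines):
    """Yield text fragments from an iterable of SSE lines.

    Lines that do not start with "data:" (blank separators, comments)
    are skipped. Each data payload is parsed as one JSON object.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        event = json.loads(payload)
        for candidate in event.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                if "text" in part:
                    yield part["text"]


if __name__ == "__main__":
    stream = [
        'data: {"candidates":[{"content":{"role":"model","parts":[{"text":"Go uses"}]},"index":0}]}',
        "",
        'data: {"candidates":[{"content":{"role":"model","parts":[{"text":" a concurrent"}]},"index":0}]}',
    ]
    print("".join(iter_sse_text(stream)))
```

In a real client, `lines` would come from the HTTP response body read line by line; the final event also carries `finishReason` and `usageMetadata`, which this sketch ignores.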
curl -X POST "https://api.chuizi.ai/gemini/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "x-goog-api-key: ck-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What is WebAssembly?"}]
      }
    ],
    "generationConfig": {
      "maxOutputTokens": 512
    }
  }'
curl -X POST "https://api.chuizi.ai/gemini/v1beta/models/gemini-2.5-flash:streamGenerateContent" \
  -H "x-goog-api-key: ck-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Write a short poem about code."}]
      }
    ]
  }'
import google.generativeai as genai

# Route SDK traffic through Chuizi.AI instead of Google's default endpoint.
genai.configure(
    api_key="ck-your-key-here",
    transport="rest",
    client_options={"api_endpoint": "api.chuizi.ai/gemini"},
)

# Non-streaming: the full response is returned in one call.
model = genai.GenerativeModel("gemini-2.5-flash")
response = model.generate_content("Explain how DNS works.")
print(response.text)
import google.generativeai as genai

# Route SDK traffic through Chuizi.AI instead of Google's default endpoint.
genai.configure(
    api_key="ck-your-key-here",
    transport="rest",
    client_options={"api_endpoint": "api.chuizi.ai/gemini"},
)

# Streaming: iterate over chunks and print each one as it arrives.
model = genai.GenerativeModel("gemini-2.5-flash")
response = model.generate_content("Write a tutorial outline.", stream=True)
for chunk in response:
    print(chunk.text, end="", flush=True)
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI("ck-your-key-here");

// Point to Chuizi.AI
const model = genAI.getGenerativeModel(
  { model: "gemini-2.5-flash" },
  { baseUrl: "https://api.chuizi.ai/gemini" },
);

const result = await model.generateContent("What is gRPC?");
console.log(result.response.text());
Key Differences from OpenAI Format
| Concept | OpenAI Chat Completions | Gemini API |
|---|---|---|
| Message container | messages array | contents array |
| Role names | system, user, assistant | user, model (system uses systemInstruction) |
| Content format | content (string or parts) | parts array ([{text: "..."}]) |
| Max tokens | max_tokens | generationConfig.maxOutputTokens |
| Temperature | Top-level temperature | generationConfig.temperature |
| Response text | choices[0].message.content | candidates[0].content.parts[0].text |
| Token usage | usage.prompt_tokens | usageMetadata.promptTokenCount |
| Model in URL | Body parameter model | URL path segment /models/{model}:generateContent |
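The mapping in the table above can be made concrete with a small converter from OpenAI-style messages to a Gemini request body. This is a sketch, not a complete translation layer: the function name `openai_to_gemini` is hypothetical, it handles string content only, and it merges all system messages into a single systemInstruction. The model name is not part of the body, since it belongs in the URL path.

```python
def openai_to_gemini(messages: list, max_tokens=None, temperature=None) -> dict:
    """Convert OpenAI chat messages into a Gemini generateContent body."""
    system_parts = []
    contents = []
    for message in messages:
        role, text = message["role"], message["content"]
        if role == "system":
            # System prompts move out of the message list entirely.
            system_parts.append({"text": text})
        elif role in ("user", "assistant"):
            gemini_role = "model" if role == "assistant" else "user"
            contents.append({"role": gemini_role, "parts": [{"text": text}]})
        else:
            raise ValueError(f"unsupported role: {role!r}")

    body = {"contents": contents}
    if system_parts:
        body["systemInstruction"] = {"parts": system_parts}

    # Sampling options move from the top level into generationConfig.
    config = {}
    if max_tokens is not None:
        config["maxOutputTokens"] = max_tokens
    if temperature is not None:
        config["temperature"] = temperature
    if config:
        body["generationConfig"] = config
    return body


if __name__ == "__main__":
    print(openai_to_gemini(
        [
            {"role": "system", "content": "You are a database expert."},
            {"role": "user", "content": "How do I optimize a slow JOIN query?"},
        ],
        max_tokens=512,
    ))
```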
Error Format
Errors follow Google's error format:
{
  "error": {
    "code": 400,
    "message": "Invalid request: contents is required and must contain at least one item",
    "status": "INVALID_ARGUMENT"
  }
}
| Status | Code | Description |
|---|---|---|
| INVALID_ARGUMENT | 400 | Malformed request or missing required fields |
| UNAUTHENTICATED | 401 | Invalid or missing API key |
| PERMISSION_DENIED | 403 | API key does not have access to the requested model |
| NOT_FOUND | 404 | Model not found |
| RESOURCE_EXHAUSTED | 429 | Rate limit exceeded |
| INTERNAL | 500 | Upstream provider error |
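A client can turn the error body into an exception and decide whether a retry is worthwhile. The sketch below is one possible approach: the class `GeminiAPIError` and the function `raise_for_error` are hypothetical names, and treating RESOURCE_EXHAUSTED and INTERNAL as retryable is an assumption about sensible client behavior, not a documented Chuizi.AI policy.

```python
# Assumption: rate limits and upstream errors may succeed on a later attempt;
# the other statuses in the table indicate a problem with the request itself.
RETRYABLE_STATUSES = {"RESOURCE_EXHAUSTED", "INTERNAL"}


class GeminiAPIError(Exception):
    """Error parsed from a Google-style error response body."""

    def __init__(self, code: int, status: str, message: str):
        super().__init__(f"{code} {status}: {message}")
        self.code = code
        self.status = status
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def raise_for_error(body: dict) -> None:
    """Raise GeminiAPIError if the parsed JSON body contains an error object."""
    error = body.get("error")
    if error is None:
        return
    raise GeminiAPIError(
        code=error.get("code", 0),
        status=error.get("status", "UNKNOWN"),
        message=error.get("message", ""),
    )


if __name__ == "__main__":
    try:
        raise_for_error({"error": {"code": 429, "message": "Rate limit exceeded", "status": "RESOURCE_EXHAUSTED"}})
    except GeminiAPIError as exc:
        print(exc, "| retryable:", exc.retryable)
```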
Next Steps
- Choosing a Protocol — Compare all supported protocols side by side
- Anthropic Messages API — Native passthrough for Claude models
- Streaming Guide — Handle SSE streaming responses from Gemini