Gemini 3.5 Flash

Google
google/gemini-3.5-flash

Stable frontier-speed model, 1M context

Context Window

1,048,576 tokens (about 1M)

Max Output

65,536 tokens (about 66K)

About this model

Gemini 3.5 Flash is Google's stable, frontier-speed model, served via Vertex.

This model supports a 1,048,576-token input context and up to 65,536 output tokens. It accepts text, images, video, audio, and PDFs, and supports tool calling, structured outputs, code execution, search grounding, and thinking. It is suited for sustained agentic workflows, complex coding loops, and long-horizon tasks at scale.

Access it through Chuizi.AI with a single ck- API key; no separate Google account is needed.
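
The multimodal input and the 65,536-token output cap described above can be illustrated with a small request builder. This is a minimal sketch, not a documented Chuizi.AI recipe: it assumes the endpoint accepts OpenAI-style `image_url` content parts and the `max_tokens` parameter, and the helper name `build_image_request` is hypothetical.

```python
MAX_OUTPUT_TOKENS = 65_536  # output cap stated on this page


def build_image_request(prompt: str, image_url: str, max_tokens: int = 1024) -> dict:
    """Build keyword arguments for client.chat.completions.create().

    Assumption: the endpoint passes OpenAI-style image content parts
    through to the model unchanged.
    """
    if not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(f"max_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")
    return {
        "model": "google/gemini-3.5-flash",
        "max_tokens": max_tokens,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
```

With an OpenAI-compatible client such as the one in Quick Start, the result could be passed as `client.chat.completions.create(**build_image_request("Describe this chart.", "https://example.com/chart.png"))`. Validating `max_tokens` locally avoids sending a request that exceeds the stated output limit.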

Highlights

1M context window
66K max output
Stable release
Multimodal input
Tool calling and thinking

Best For

Complex coding loops
Agent workflows
Long-context analysis
Multimodal understanding

2026-05-19

Capabilities

Chat
Vision
Reasoning
Code
PDF
Tools
Cache

Pricing (per 1M tokens)

Input: $1.58 / 1M tokens
Output: $9.45 / 1M tokens
Cache Read: $0.158 / 1M tokens

Final prices shown
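
To see how the rates above translate into a per-request cost, the following sketch multiplies token counts by the listed prices. It assumes that cached input tokens are billed at the Cache Read rate instead of the Input rate and that no other fees apply; the function name `estimate_cost` is hypothetical and the billing rules are not confirmed by this page.

```python
from decimal import Decimal

# Prices per 1M tokens, copied from the pricing list above.
PRICE_PER_MILLION = {
    "input": Decimal("1.58"),
    "output": Decimal("9.45"),
    "cache_read": Decimal("0.158"),
}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the cost of one request in the page's currency.

    Assumption: cached_tokens is the part of input_tokens served from
    cache and billed at the cache-read rate instead of the input rate.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / Decimal(1_000_000)
```

Under these assumptions, a request with 200,000 input tokens (150,000 of them cached) and 4,000 output tokens costs 0.079 + 0.0237 + 0.0378 = 0.1405. `Decimal` is used so that small per-token amounts do not pick up floating-point rounding error.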

Quick Start

from openai import OpenAI

client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

response = client.chat.completions.create(
    model="google/gemini-3.5-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
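
Since the model is listed as supporting tool calling, the Quick Start request can be extended with a tool definition. The sketch below follows the OpenAI chat-completions tool format and assumes Chuizi.AI forwards the `tools` parameter unchanged; `get_weather`, `run_tool_call`, and `ask_with_tools` are hypothetical names used only for illustration, and the weather data is a stub.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; returns a canned forecast as JSON."""
    return json.dumps({"city": city, "forecast": "sunny"})


# Tool schema in the OpenAI chat-completions format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]


def run_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch one tool call requested by the model to a local function."""
    registry = {"get_weather": get_weather}
    if name not in registry:
        raise KeyError(f"unknown tool: {name}")
    return registry[name](**json.loads(arguments_json))


def ask_with_tools(client, question: str) -> str:
    """Send a question, execute any requested tool calls, return the final answer."""
    messages = [{"role": "user", "content": question}]
    first = client.chat.completions.create(
        model="google/gemini-3.5-flash", messages=messages, tools=TOOLS
    )
    message = first.choices[0].message
    if not message.tool_calls:
        return message.content
    messages.append(message)  # keep the assistant turn that requested the tools
    for call in message.tool_calls:
        result = run_tool_call(call.function.name, call.function.arguments)
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    second = client.chat.completions.create(
        model="google/gemini-3.5-flash", messages=messages, tools=TOOLS
    )
    return second.choices[0].message.content
```

Calling `ask_with_tools(client, "What is the weather in Paris?")` with the client created above would make at most two requests: one that may return a tool call and one that returns the final text. This handles a single round of tool calls; a longer agent loop would repeat the dispatch step until the model stops requesting tools.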
