Gemini 3.5 Flash
Google
google/gemini-3.5-flash
Stable frontier-speed model, 1M context
Context Window
1.0M
1,048,576 tokens
Max Output
66K
65,536 tokens
About this model
Gemini 3.5 Flash is a stable, frontier-speed model served via Vertex.
This model supports a 1,048,576-token input context and up to 65,536 output tokens. It accepts text, images, video, audio, and PDFs, and supports tool calling, structured outputs, code execution, search grounding, and thinking. It is suited for sustained agentic workflows, complex coding loops, and long-horizon tasks at scale.
Access it through Chuizi.AI with a single ck- API key — no separate Google account needed.
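The limits above can be turned into a quick pre-flight check before sending a request. This is a minimal sketch: the two constants come from the figures on this page, while `check_budget` and its arguments are hypothetical names introduced for illustration. It assumes you already have token counts from a tokenizer of your choice, and it treats the input and output limits as independent caps, which this page does not explicitly confirm.

```python
# Limits as stated on this page.
MAX_INPUT_TOKENS = 1_048_576
MAX_OUTPUT_TOKENS = 65_536


def check_budget(prompt_tokens: int, requested_output_tokens: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means the request fits.

    Hypothetical helper: token counts must be supplied by the caller.
    """
    problems = []
    if prompt_tokens > MAX_INPUT_TOKENS:
        over = prompt_tokens - MAX_INPUT_TOKENS
        problems.append(f"prompt exceeds the input context by {over} tokens")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        over = requested_output_tokens - MAX_OUTPUT_TOKENS
        problems.append(f"requested output exceeds the output limit by {over} tokens")
    return problems


if __name__ == "__main__":
    # A 900K-token prompt with an 8K-token reply fits; a 70K-token reply does not.
    print(check_budget(900_000, 8_000))
    print(check_budget(900_000, 70_000))
```

Returning a list of problems rather than a boolean makes it easy to surface every violated limit at once.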
Highlights
1M context window
66K max output
Stable release
Multimodal input
Tool calling and thinking
Best For
Complex coding loops
Agent workflows
Long-context analysis
Multimodal understanding
2026-05-19
Capabilities
Chat
Vision
Reasoning
Code
PDF
Tools
Cache
Pricing (per 1M tokens)
| Token type | Price (USD per 1M tokens) |
|---|---|
| Input | $1.58 |
| Output | $9.45 |
| Cache Read | $0.158 |
Final prices shown
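To see how the per-million rates translate into the cost of a single request, here is a rough estimator. The three rates are taken from the table above. The function name `estimate_cost` and its parameters are hypothetical, and the sketch assumes that cached input tokens are billed at the Cache Read rate instead of the Input rate. Actual billing may differ.

```python
from decimal import Decimal

# Rates from the pricing table, in USD per 1M tokens.
INPUT_RATE = Decimal("1.58")
OUTPUT_RATE = Decimal("9.45")
CACHE_READ_RATE = Decimal("0.158")
PER_MILLION = Decimal(1_000_000)


def estimate_cost(uncached_input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: cached input tokens are charged at the cache-read rate
    and are not also charged at the input rate.
    """
    return (
        INPUT_RATE * uncached_input_tokens
        + OUTPUT_RATE * output_tokens
        + CACHE_READ_RATE * cached_input_tokens
    ) / PER_MILLION


if __name__ == "__main__":
    # 1M uncached input tokens plus 100K output tokens: 1.58 + 0.945 = 2.525 USD.
    print(estimate_cost(1_000_000, 100_000))
```

`Decimal` is used so that small per-token amounts do not pick up binary floating-point rounding error.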
Quick Start
main.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

response = client.chat.completions.create(
    model="google/gemini-3.5-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)
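Since the model accepts images as well as text, the quick start can plausibly be extended to send an image. The sketch below builds a message using the `image_url` content-part shape from the OpenAI Chat Completions format. Whether the Chuizi.AI endpoint accepts this shape for this model is an assumption, not something this page confirms. `build_image_message` and `describe_image` are hypothetical helper names.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes,
                        mime_type: str = "image/png") -> list[dict]:
    """Build a `messages` list with one text part and one inline image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ]


def describe_image(path: str) -> str:
    """Send a local image to the model and return its text reply.

    Assumes the gateway forwards OpenAI-style image parts to the model.
    """
    from openai import OpenAI  # imported here so the builder works without the SDK

    client = OpenAI(
        base_url="https://api.chuizi.ai/v1",
        api_key="ck-your-key-here",
    )
    with open(path, "rb") as f:
        messages = build_image_message("Describe this image.", f.read())
    response = client.chat.completions.create(
        model="google/gemini-3.5-flash",
        messages=messages,
    )
    return response.choices[0].message.content
```

Calling `describe_image("photo.png")` would send the file inline as a base64 data URL. For large files, a hosted image URL may be a better fit if the endpoint supports it.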