Gemini 2.5 Flash

Google
google/gemini-2.5-flash

Lightning fast, 1M context

Context Window

1.0M

1,000,000 tokens

Max Output

66K

65,536 tokens

About this model

Gemini 2.5 Flash is Google's high-speed reasoning model, offering strong reasoning at low cost. It has a 1M-token context window, and input is priced at $0.32 per 1M tokens.

It is a faster, cheaper alternative to Gemini 2.5 Pro. It supports multimodal input, including text, images, and code, and implicit context caching is applied automatically.

Ideal for high-throughput production environments.

Highlights

1M context
$0.32/M ultra-low price
Strong reasoning
Auto caching

Best For

High-throughput APIs
Code generation
Document summarization
Data processing pipelines

2025-04-17
MoE Transformer
Proprietary

Capabilities

Chat
Vision
Reasoning
Code
PDF
Tools
Cache

Aliases

gemini-2.5-flash
gemini-flash

Pricing (per 1M tokens)

Input / 1M: $0.32
Output / 1M: $2.63
Cache Read: $0.03

Final prices shown
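
To make the rates above concrete, here is a minimal cost-estimation sketch. The three prices are copied from the pricing list; the function name `estimate_cost` is hypothetical, and the billing rule it encodes (cached tokens are a subset of input tokens and are charged at the Cache Read rate instead of the Input rate) is an assumption, not something this page confirms.

```python
# Prices in USD per 1M tokens, copied from the pricing list above.
INPUT_PER_M = 0.32
OUTPUT_PER_M = 2.63
CACHE_READ_PER_M = 0.03


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of input_tokens served from the
    cache, billed at the cache-read rate instead of the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# A 200,000-token prompt with a 2,000-token reply, with and without caching.
no_cache = estimate_cost(200_000, 2_000)
with_cache = estimate_cost(200_000, 2_000, cached_tokens=150_000)
print(f"no cache:   ${no_cache:.5f}")
print(f"with cache: ${with_cache:.5f}")
```

Under these assumptions, output tokens dominate the bill for short prompts, while cache hits matter most for long, repeated contexts.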

Quick Start

main.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

response = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
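
Since the model is listed with vision support, an image request might look like the sketch below. It uses the OpenAI Chat Completions content-part format (`text` plus `image_url` with a base64 data URL). Whether this endpoint accepts that format for this model is an assumption, and the helper names `build_image_message` and `describe_image` are hypothetical.

```python
import base64
import mimetypes


def build_image_message(prompt, image_path):
    """Build one user message containing a text part and an inline image.

    Assumption: the endpoint accepts OpenAI-style image_url content parts.
    """
    mime, _ = mimetypes.guess_type(image_path)
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    data_url = f"data:{mime or 'image/png'};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


def describe_image(client, image_path, prompt="Describe this image."):
    """Send the image to the model using an already configured client."""
    response = client.chat.completions.create(
        model="google/gemini-2.5-flash",
        messages=[build_image_message(prompt, image_path)],
    )
    return response.choices[0].message.content
```

With the `client` from the quick start, a call would be `print(describe_image(client, "photo.png"))`, where `photo.png` is a placeholder path.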

