Gemini 2.5 Flash

Google

google/gemini-2.5-flash

Lightning fast, 1M context

Context Window

1.0M

1,000,000 tokens

Max Output

66K

65,536 tokens

About this model

Gemini 2.5 Flash is Google's high-speed reasoning model, built for strong reasoning at low cost. It has a 1M-token context window, and input is priced at about $0.32 per 1M tokens.

It is a faster, cheaper alternative to Gemini 2.5 Pro. It accepts multimodal input, including text, images, and code, and Implicit Context Caching is applied automatically.
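Because caching is implicit, the only visible sign of a cache hit is usually the usage data returned with a response. The sketch below is one way to read it, assuming the endpoint fills in the `usage.prompt_tokens_details.cached_tokens` field defined by the OpenAI Python SDK; this page does not confirm that it does. `cached_prompt_tokens` is a hypothetical helper, and `example_usage` is a stand-in object rather than a real response.

```python
from types import SimpleNamespace


def cached_prompt_tokens(usage) -> int:
    """Return the cached prompt tokens reported in a usage object, or 0 if absent.

    Assumption: the endpoint reports cache hits in
    usage.prompt_tokens_details.cached_tokens (the OpenAI SDK field).
    """
    details = getattr(usage, "prompt_tokens_details", None)
    cached = getattr(details, "cached_tokens", None)
    return cached or 0


# Stand-in for response.usage so the helper can be tried without a network call.
example_usage = SimpleNamespace(
    prompt_tokens=12_000,
    prompt_tokens_details=SimpleNamespace(cached_tokens=10_000),
)
print(cached_prompt_tokens(example_usage))
```

With a real response, the same helper would be called as `cached_prompt_tokens(response.usage)`. A result of 0 can mean either a cache miss or that the field is not reported.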

Ideal for high-throughput production environments.

Highlights

1M context

$0.32/M ultra-low price

Strong reasoning

Auto caching

Best For

High-throughput APIs

Code generation

Document summarization

Data processing pipelines

2025-04-17

MoE Transformer

Proprietary

Capabilities

Chat

Vision

Reasoning

Code

PDF

Tools

Cache

Pricing (per 1M tokens)

Item	Price per 1M tokens
Input	$0.315
Output	$2.63
Cache Read	$0.032

Final prices shown
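To make the listed rates concrete, here is a minimal cost estimate for a single request. It assumes that tokens served from cache are billed at the Cache Read rate instead of the Input rate, which this page does not state explicitly. `estimate_cost` and its parameter names are hypothetical.

```python
from decimal import Decimal

# Dollar prices per 1M tokens, taken from the pricing list on this page.
INPUT_PER_M = Decimal("0.315")
OUTPUT_PER_M = Decimal("2.63")
CACHE_READ_PER_M = Decimal("0.032")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the dollar cost of one request.

    Assumption: cached_tokens is the part of input_tokens served from cache,
    billed at the Cache Read rate instead of the Input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / MILLION


# 100,000 input tokens (80,000 of them cached) and 2,000 output tokens.
print(estimate_cost(100_000, 2_000, cached_tokens=80_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many requests.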

Quick Start

main.py

from openai import OpenAI

client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

response = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
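Since Vision is listed among the capabilities, an image request might look like the sketch below. It uses the content-parts message format from the OpenAI Chat Completions API and assumes this endpoint accepts `image_url` parts with base64 data URLs, which the page does not confirm. `build_image_message` is a hypothetical helper.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build one user message that carries a text prompt and an inline image.

    Assumption: the endpoint accepts OpenAI-style image_url content parts.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }
```

The returned dict can be passed as `messages=[build_image_message("Describe this image.", image_bytes)]` in the same `client.chat.completions.create` call shown in the Quick Start.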

FAQ

How do I get an API Key?

How does billing work?

What payment methods are supported?

Are there rate limits?

Is Google Search grounding supported?

What about Gemini's 1M/2M context window?

Related Models

Gemini 2.5 Pro

Gemini 3.5 Flash

Gemini 3 Flash Preview

Gemini 3.1 Pro Preview

Gemini 3.1 Flash Image Preview

Gemini 3.1 Flash Lite Preview