DeepSeek V4 Flash

DeepSeek

deepseek/deepseek-v4-flash

V4 Flash, 1M context

Context Window

1.0M

1,048,576 tokens

Max Output

384K

384,000 tokens

About this model

DeepSeek V4 Flash is the fast, cost-efficient member of the V4 family and keeps the family's 1M-token context window. It is a strong fit for daily coding, batch analysis, draft generation, and cost-sensitive agent workflows.

Served through Bailian, it is well suited as a default fast tier or lightweight smart-routing option.

Highlights

1M context

V4 fast tier

Low cost

Batch friendly

Best For

Daily coding

Batch analysis

Low-cost reasoning

Long-doc summaries

2026-04-24

MoE

MoE Transformer

Proprietary

Capabilities

Chat

Reasoning

Code

Cache

Pricing (per 1M tokens)

Item	Price / 1M tokens
Input	$0.146
Output	$0.292
Cache Read	$0.029
Cache Write	$0.182

Final prices shown
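To make the per-million rates concrete, here is a minimal sketch of a cost estimate. The rates are copied from the pricing table above. The function name `estimate_cost` and the assumption that every token is billed in exactly one of the four categories are illustrative, not a description of how billing is actually computed.

```python
# Listed prices per 1M tokens, copied from the pricing table.
PRICES_PER_MILLION = {
    "input": 0.146,
    "output": 0.292,
    "cache_read": 0.029,
    "cache_write": 0.182,
}


def estimate_cost(input_tokens=0, output_tokens=0,
                  cache_read_tokens=0, cache_write_tokens=0):
    """Return an estimated request cost in the listed currency.

    Assumption: each token falls into exactly one billing category.
    """
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    total = sum(counts[name] * PRICES_PER_MILLION[name] for name in counts)
    return total / 1_000_000


# 200,000 input tokens and 10,000 output tokens, no caching:
# 0.2 * 0.146 + 0.01 * 0.292, roughly $0.032
print(round(estimate_cost(input_tokens=200_000, output_tokens=10_000), 5))
```

With an OpenAI-compatible response, the token counts would typically come from `response.usage.prompt_tokens` and `response.usage.completion_tokens`. Whether cached tokens are reported separately by this endpoint is not stated here.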

Quick Start

main.py

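from openai import OpenAI

# Point the OpenAI SDK at the OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",  # replace with your own API key
)

# Send a single-turn chat request to DeepSeek V4 Flash.
response = client.chat.completions.create(
    model="deepseek/deepseek-v4-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)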

FAQ

How do I get an API Key?

How does billing work?

What payment methods are supported?

Are there rate limits?

Does DeepSeek-R1 reasoning work?

Why is DeepSeek so cheap?

Related Models

DeepSeek Chat

DeepSeek Reasoner

DeepSeek V4 Pro

DeepSeek V3.2

DeepSeek V3.1

DeepSeek V3