DeepSeek V4 Flash
DeepSeek
deepseek/deepseek-v4-flash
V4 Flash, 1M context
Context Window
1.0M
1,048,576 tokens
Max Output
384K
384,000 tokens
About this model
DeepSeek V4 Flash is the fast, cost-efficient member of the V4 family and keeps the family's 1M-token context window. It is a strong fit for daily coding, batch analysis, draft generation, and cost-sensitive agent workflows.
Served through Bailian, it is well suited as a default fast tier or lightweight smart-routing option.
Highlights
1M context
V4 fast tier
Low cost
Batch friendly
Best For
Daily coding
Batch analysis
Low-cost reasoning
Long-doc summaries
2026-04-24
MoE
MoE Transformer
Proprietary
Capabilities
Chat
Reasoning
Code
Cache
Pricing (per 1M tokens)
| Token type | Price (USD per 1M tokens) |
|---|---|
| Input | $0.146 |
| Output | $0.292 |
| Cache read | $0.029 |
| Cache write | $0.182 |
Final prices shown
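To make the per-million rates easier to reason about, here is a minimal sketch that turns token counts into an estimated USD cost. The helper name `estimate_cost` and the usage categories are hypothetical, and the sketch assumes simple linear billing in which each token is counted in exactly one category. The page does not spell out how cached tokens are billed, so treat the result as a rough estimate.

```python
# Prices in USD per 1M tokens, copied from the pricing table on this page.
PRICE_PER_MILLION = {
    "input": 0.146,
    "output": 0.292,
    "cache_read": 0.029,
    "cache_write": 0.182,
}


def estimate_cost(input_tokens=0, output_tokens=0,
                  cache_read_tokens=0, cache_write_tokens=0):
    """Estimate the USD cost of a request.

    Assumption: billing is linear per token and each token falls into
    exactly one of the four categories.
    """
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    return sum(PRICE_PER_MILLION[kind] * count / 1_000_000
               for kind, count in usage.items())


if __name__ == "__main__":
    # Example: a 200K-token prompt with a 4K-token reply.
    cost = estimate_cost(input_tokens=200_000, output_tokens=4_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, a batch job can sum `estimate_cost` over its requests to get a budget figure before it runs.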
Quick Start
main.py
from openai import OpenAI

# Point the OpenAI SDK at the gateway's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-v4-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)
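When sending long documents, it can help to check the token budget before making a request. The sketch below uses the limits listed on this page (1,048,576-token context window, 384,000-token maximum output). The helper `clamp_max_tokens` is hypothetical. It also assumes that prompt and completion tokens share the context window, which the page does not state, and that you already have a token count for the prompt from a tokenizer of your choice.

```python
# Limits copied from the model card on this page.
CONTEXT_WINDOW = 1_048_576  # total tokens
MAX_OUTPUT = 384_000        # maximum completion tokens


def clamp_max_tokens(prompt_tokens, requested_output):
    """Return an output budget that respects both limits.

    Assumption: prompt and completion tokens share the context window.
    """
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt alone exceeds the context window")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return min(requested_output, MAX_OUTPUT, remaining)


if __name__ == "__main__":
    # A 1,000,000-token prompt leaves 48,576 tokens for the reply.
    print(clamp_max_tokens(1_000_000, 100_000))
```

The returned value could be passed as `max_tokens` in the `chat.completions.create` call above. Whether the gateway honors that parameter for this model is an assumption to verify.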