Qwen3.6 Flash
Qwen
qwen/qwen3.6-flash
1M multimodal fast model
Context Window
1.0M
1,048,576 tokens
Max Output
66K
65,536 tokens
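As a rough illustration of how the two limits above interact, the sketch below computes how many tokens remain for the prompt once room is reserved for the reply. It assumes that input and output share the 1,048,576-token window, which this page does not state. The names `CONTEXT_WINDOW`, `MAX_OUTPUT`, and `max_prompt_tokens` are hypothetical helpers, not part of any API.

```python
# Limits copied from the figures listed above.
CONTEXT_WINDOW = 1_048_576  # total context window, in tokens
MAX_OUTPUT = 65_536         # maximum output, in tokens


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Return the prompt budget left after reserving space for the reply.

    Assumption: prompt and completion tokens count against the same
    context window. If the provider counts them separately, the full
    window is available for the prompt.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError("reserved_output must be between 0 and MAX_OUTPUT")
    return CONTEXT_WINDOW - reserved_output
```

Under that assumption, reserving the full 65,536 output tokens leaves 983,040 tokens for the prompt.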
About this model
Qwen3.6 Flash is the fast, cost-efficient model in the Qwen3.6 family. It supports a 1M-token context window and accepts text, image, and video inputs.
It is a strong default for high-throughput apps, long-context workloads, RAG pipelines, and lightweight agent calls.
Highlights
1M context
Image/video input
Low cost
High throughput
Best For
Default chat
RAG retrieval
Batch processing
Multimodal understanding
2026-04-02
MoE Transformer
Proprietary
Capabilities
Chat
Vision
Reasoning
Code
Tools
Cache
Pricing (per 1M tokens)
| Token type | Price (USD per 1M tokens) |
|---|---|
| Input | $0.175 |
| Output | $1.05 |
| Cache Read | $0.035 |
| Cache Write | $0.219 |
Final prices shown
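To make the per-token rates concrete, here is a minimal cost estimator built from the listed prices. It assumes the four token categories are billed separately and do not overlap (for example, that cache-read tokens are charged at the cache-read rate instead of the input rate), which this page does not spell out. `PRICES_PER_1M` and `estimate_cost` are hypothetical names used only for illustration.

```python
# USD per 1M tokens, copied from the pricing table.
PRICES_PER_1M = {
    "input": 0.175,
    "output": 1.05,
    "cache_read": 0.035,
    "cache_write": 0.219,
}


def estimate_cost(input_tokens: int, output_tokens: int,
                  cache_read_tokens: int = 0,
                  cache_write_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: the four counts are disjoint, so each token is billed
    at exactly one of the listed rates.
    """
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    # Multiply each count by its per-1M rate, then scale down to per-token.
    return sum(usage[k] * PRICES_PER_1M[k] for k in usage) / 1_000_000
```

For example, a request with 200,000 input tokens and 2,000 output tokens comes to roughly $0.037 under these assumptions.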
Quick Start
main.py
from openai import OpenAI

# Point the OpenAI SDK at the OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

# Send a single-turn chat request to Qwen3.6 Flash.
response = client.chat.completions.create(
    model="qwen/qwen3.6-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)
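Since the model accepts image input, a request with an image might look like the sketch below. It uses the OpenAI Chat Completions content-part format (`text` and `image_url` parts); whether this endpoint accepts that format for this model is an assumption, not something this page confirms. `build_image_message` and `describe_image` are hypothetical helper names, and `client` is expected to be an `OpenAI` client configured as in the Quick Start.

```python
def build_image_message(prompt: str, image_url: str) -> dict:
    """Build one user message containing a text part and an image part.

    Assumption: the endpoint accepts OpenAI-style content parts.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def describe_image(client, image_url: str,
                   prompt: str = "Describe this image.") -> str:
    """Send the image and prompt to the model and return its text reply."""
    response = client.chat.completions.create(
        model="qwen/qwen3.6-flash",
        messages=[build_image_message(prompt, image_url)],
    )
    return response.choices[0].message.content
```

Keeping the message builder separate from the network call makes the payload easy to inspect or log before any request is sent.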