Qwen3.6 Flash

Qwen
qwen/qwen3.6-flash

1M multimodal fast model

Context Window

1.0M

1,048,576 tokens

Max Output

66K

65,536 tokens

About this model

Qwen 3.6 Flash is the fast, cost-efficient model in the Qwen 3.6 family. It supports a 1M token context window and accepts text, image, and video inputs.

It is a strong default for high-throughput apps, long-context workloads, RAG pipelines, and lightweight agent calls.
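For long-context workloads it can help to check how much room a prompt leaves for the response. The sketch below uses the two limits listed on this card (1,048,576 context tokens and 65,536 output tokens). It assumes input and output tokens share the same context window, which the card does not state explicitly, and `output_budget` is a hypothetical helper name, not part of any SDK.

```python
CONTEXT_WINDOW = 1_048_576  # tokens, as listed under "Context Window"
MAX_OUTPUT = 65_536         # tokens, as listed under "Max Output"


def output_budget(prompt_tokens: int, requested: int = MAX_OUTPUT) -> int:
    """Return a max_tokens value that fits both limits for a given prompt size.

    Assumption: prompt and completion tokens count against the same
    context window. Adjust if the provider documents otherwise.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        return 0  # the prompt alone already fills the window
    return min(requested, MAX_OUTPUT, remaining)


if __name__ == "__main__":
    print(output_budget(1_000))      # short prompt: capped by the output limit
    print(output_budget(1_000_000))  # long prompt: capped by remaining context
```

The returned value could be passed as `max_tokens` in a chat completion request. The prompt token count has to come from a tokenizer or from a previous response's usage data.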

Highlights

1M context
Image/video input
Low cost
High throughput

Best For

Default chat
RAG retrieval
Batch processing
Multimodal understanding

2026-04-02
MoE Transformer
Proprietary

Capabilities

Chat
Vision
Reasoning
Code
Tools
Cache

Pricing (per 1M tokens)

Input: $0.175
Output: $1.05
Cache Read: $0.035
Cache Write: $0.219

Final prices shown
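As a rough illustration of how the per-1M-token prices translate into a per-request cost, here is a minimal sketch. The four rates are taken from the pricing list above. The function name `estimate_cost` and the idea that the four token counts are billed as disjoint buckets are assumptions, since the card does not describe how cached tokens are counted against regular input.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing list on this card.
PRICES_PER_MILLION = {
    "input": Decimal("0.175"),
    "output": Decimal("1.05"),
    "cache_read": Decimal("0.035"),
    "cache_write": Decimal("0.219"),
}


def estimate_cost(input_tokens: int = 0, output_tokens: int = 0,
                  cache_read_tokens: int = 0,
                  cache_write_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: each token is billed in exactly one of the four buckets.
    """
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    total = sum((PRICES_PER_MILLION[k] * n for k, n in counts.items()),
                Decimal(0))
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # 200,000 input tokens plus 2,000 output tokens.
    print(estimate_cost(input_tokens=200_000, output_tokens=2_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over a batch.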

Quick Start

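from openai import OpenAI

# Point the OpenAI SDK at the OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",  # replace with your own API key
)

# Send a single-turn chat request to Qwen 3.6 Flash.
response = client.chat.completions.create(
    model="qwen/qwen3.6-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print the text of the first completion.
print(response.choices[0].message.content)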
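Since the model accepts image input, a request can carry an image alongside text. The sketch below only builds the message payload, using the content-parts layout of the OpenAI Chat Completions API (`text` and `image_url` parts, with the image inlined as a base64 data URL). Whether this endpoint accepts that exact layout for this model is an assumption and should be checked against the provider's documentation. `image_message` is a hypothetical helper name.

```python
import base64


def image_message(prompt: str, image_bytes: bytes,
                  mime: str = "image/png") -> dict:
    """Build a user message with one text part and one inline image part.

    Assumption: the endpoint accepts OpenAI-style content parts with a
    base64 data URL in image_url.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file's contents.
    msg = image_message("Describe this image.", b"abc")
    print(msg["content"][1]["image_url"]["url"])
```

The resulting dictionary could be passed in the `messages` list of `client.chat.completions.create(...)` in place of the plain-text message from the Quick Start.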
