Qwen3.6 Flash

Qwen

qwen/qwen3.6-flash

Fast multimodal model with a 1M-token context window

Context Window

1,048,576 tokens (1M)

Max Output

65,536 tokens (about 66K)
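
The two limits above can be turned into a small guard that picks a safe output budget before a request is sent. This is a minimal sketch, not the provider's own logic: `plan_max_tokens` is a hypothetical helper, and it assumes that prompt and completion tokens share the same context window, which the page does not state.

```python
CONTEXT_WINDOW = 1_048_576  # tokens, as listed under Context Window
MAX_OUTPUT = 65_536         # tokens, as listed under Max Output


def plan_max_tokens(prompt_tokens: int, requested_output: int) -> int:
    """Return an output budget that respects both listed limits.

    Assumption: prompt and completion tokens share the context window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    room = CONTEXT_WINDOW - prompt_tokens
    if room <= 0:
        raise ValueError("prompt alone fills the context window")
    # The tightest of the three bounds wins.
    return min(requested_output, MAX_OUTPUT, room)


if __name__ == "__main__":
    print(plan_max_tokens(900_000, 100_000))   # 65536, capped by max output
    print(plan_max_tokens(1_040_000, 20_000))  # 8576, capped by remaining room
```

The returned value could be passed as the `max_tokens` argument of a chat completion request.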

About this model

Qwen3.6 Flash is the fast, cost-efficient model in the Qwen3.6 family. It supports a 1M-token context window (1,048,576 tokens) and accepts text, image, and video inputs.

It is a strong default for high-throughput apps, long-context workloads, RAG pipelines, and lightweight agent calls.
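
Image input can be sketched with the same OpenAI-compatible client used in the Quick Start. The example below is a sketch under assumptions: it assumes the endpoint accepts OpenAI-style `image_url` content parts, and `build_image_message` and `describe_image` are hypothetical helper names. Video input is not shown because the page does not say how video is passed.

```python
def build_image_message(prompt: str, image_url: str) -> list[dict]:
    """Build a single user message that mixes text and one image.

    Assumption: the endpoint accepts OpenAI-style content parts.
    """
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


def describe_image(prompt: str, image_url: str, api_key: str) -> str:
    """Send the multimodal message and return the model's text reply."""
    from openai import OpenAI  # imported here so the builder works without the SDK

    client = OpenAI(base_url="https://api.chuizi.ai/v1", api_key=api_key)
    response = client.chat.completions.create(
        model="qwen/qwen3.6-flash",
        messages=build_image_message(prompt, image_url),
    )
    return response.choices[0].message.content
```

Keeping the message builder separate from the network call makes the payload easy to inspect or log before anything is sent.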

Highlights

1M context

Image/video input

Low cost

High throughput

Best For

Default chat

RAG retrieval

Batch processing

Multimodal understanding

2026-04-02

MoE Transformer

Proprietary

Capabilities

Chat

Vision

Reasoning

Code

Tools

Cache
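
The tools capability can be illustrated with a small function-calling round trip. This is a sketch that assumes the endpoint follows the OpenAI `tools` schema; `WEATHER_TOOL`, `get_weather`, `run_tool_call`, and `ask_with_tools` are all hypothetical names made up for the example, and the weather lookup is a stub.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def get_weather(city: str) -> str:
    """Stub implementation; a real app would call a weather service."""
    return f"(stub) weather for {city}"


HANDLERS = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch one tool call; arguments arrive as a JSON string."""
    if name not in HANDLERS:
        raise KeyError(f"unknown tool: {name}")
    return HANDLERS[name](**json.loads(arguments_json))


def ask_with_tools(client, question: str) -> list[str]:
    """Send one request with the tool attached and run any calls returned."""
    response = client.chat.completions.create(
        model="qwen/qwen3.6-flash",
        messages=[{"role": "user", "content": question}],
        tools=[WEATHER_TOOL],
    )
    calls = response.choices[0].message.tool_calls or []
    return [run_tool_call(c.function.name, c.function.arguments) for c in calls]
```

`ask_with_tools` expects an already configured `OpenAI` client, such as the one created in the Quick Start. A full agent loop would also send each tool result back to the model in a follow-up message, which is omitted here.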

Pricing (per 1M tokens)

Token type	Price per 1M tokens
Input	$0.175
Output	$1.05
Cache Read	$0.035
Cache Write	$0.219

The prices shown are final.
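
The per-million rates above translate into a per-request estimate with simple arithmetic. The helper below is a sketch: `estimate_cost` is a hypothetical name, and it simply bills each token count at its listed rate. How the provider splits a prompt between regular input and cache reads or writes is not described on this page, so the caller has to supply those counts.

```python
# Listed prices in USD per 1M tokens.
PRICES_PER_MILLION = {
    "input": 0.175,
    "output": 1.05,
    "cache_read": 0.035,
    "cache_write": 0.219,
}


def estimate_cost(
    input_tokens: int = 0,
    output_tokens: int = 0,
    cache_read_tokens: int = 0,
    cache_write_tokens: int = 0,
) -> float:
    """Estimate the USD cost of one request from its token counts."""
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    # Each category is billed independently at its own rate.
    total = sum(counts[kind] * PRICES_PER_MILLION[kind] for kind in counts)
    return total / 1_000_000


if __name__ == "__main__":
    cost = estimate_cost(input_tokens=200_000, output_tokens=4_000)
    print(f"${cost:.4f}")  # $0.0392
```

For example, a 200,000-token prompt with a 4,000-token reply comes to about $0.039: $0.035 for input plus $0.0042 for output.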

Quick Start

main.py

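from openai import OpenAI

# Point the OpenAI SDK at the OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",  # replace with your own API key
)

# Send a single-turn chat request to Qwen3.6 Flash.
response = client.chat.completions.create(
    model="qwen/qwen3.6-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print the text of the first completion.
print(response.choices[0].message.content)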
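
For interactive or high-throughput use, the same request can be streamed so text is shown as it arrives. This is a sketch that assumes the endpoint supports the OpenAI SDK's `stream=True` option, which this page does not confirm; `collect_stream` and `stream_reply` are hypothetical helper names.

```python
def collect_stream(chunks, echo: bool = True) -> str:
    """Join the text deltas from a stream of chat completion chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks carry no choices (for example usage-only chunks)
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
            if echo:
                print(delta, end="", flush=True)
    return "".join(parts)


def stream_reply(prompt: str, api_key: str) -> str:
    """Stream one reply from Qwen3.6 Flash and return the full text."""
    from openai import OpenAI  # imported here so collect_stream works without the SDK

    client = OpenAI(base_url="https://api.chuizi.ai/v1", api_key=api_key)
    stream = client.chat.completions.create(
        model="qwen/qwen3.6-flash",
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # assumption: streaming is enabled on this endpoint
    )
    return collect_stream(stream)
```

Calling `stream_reply("Hello!", "ck-your-key-here")` with a real key would print the reply incrementally and return the assembled text.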

FAQ

How do I get an API Key?

How does billing work?

What payment methods are supported?

Are there rate limits?

Related Models

Qwen Max

Qwen Plus

Qwen Turbo

Qwen2.5 Coder 32b

Qwen VL Max

Qwen3 Max