Qwen Omni Turbo

Qwen
qwen/qwen-omni-turbo

Omni understanding: text+image+audio

Context Window

32K

32,000 tokens

Max Output

8K

8,192 tokens
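
The two limits above can be combined into a rough prompt budget. The sketch below assumes that prompt and reply share the 32,000-token window, which this page does not confirm; `max_prompt_tokens` is a hypothetical helper, not part of any SDK.

```python
CONTEXT_WINDOW = 32_000  # context window listed on this page
MAX_OUTPUT = 8_192       # max output listed on this page


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Return the tokens left for the prompt after reserving room for the reply.

    Assumption: input and output tokens share the same 32,000-token window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be between 0 and {MAX_OUTPUT}")
    return CONTEXT_WINDOW - reserved_output


print(max_prompt_tokens())       # 23808
print(max_prompt_tokens(1_000))  # 31000
```

Reserving less output leaves more room for the prompt, so short-answer tasks can carry longer inputs.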

About this model

Qwen Omni Turbo is Qwen's multi-modal understanding model. It accepts text, image, and audio input in the same request.

It suits tasks that require cross-modal understanding. Access it through Chuizi.AI with an API key that starts with ck-.
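
As an illustration of mixed input, the sketch below builds a single user message that carries text, an image, and an audio clip. It assumes the endpoint accepts OpenAI-style content parts (`text`, `image_url`, `input_audio`), which this page does not confirm. `build_user_message` and the example URL are hypothetical.

```python
import base64


def build_user_message(text, image_url=None, audio_bytes=None, audio_format="wav"):
    """Build one chat message combining text with optional image and audio.

    Assumption: the endpoint accepts OpenAI-style content parts.
    """
    parts = [{"type": "text", "text": text}]
    if image_url is not None:
        parts.append({"type": "image_url", "image_url": {"url": image_url}})
    if audio_bytes is not None:
        # Audio travels inline as base64 text in the OpenAI-style format.
        encoded = base64.b64encode(audio_bytes).decode("ascii")
        parts.append(
            {"type": "input_audio",
             "input_audio": {"data": encoded, "format": audio_format}}
        )
    return {"role": "user", "content": parts}


message = build_user_message(
    "Describe the picture and the clip.",
    image_url="https://example.com/cat.png",  # placeholder URL
    audio_bytes=b"\x00\x01",                  # placeholder bytes, not real audio
)
print([part["type"] for part in message["content"]])
```

If the assumption holds, the returned dictionary can be passed as an element of `messages` in a chat completion request.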

Highlights

Multi-modal
Text+image+audio
Fused understanding
General purpose

Best For

Multi-modal analysis
Audio-visual understanding
Content moderation
Cross-modal retrieval
2025-10-01

Capabilities

Chat
Vision
Audio
Tools

Aliases

qwen-omni-turbo

Pricing (per 1M tokens)

Input: $2.10
Output: $6.30

Prices shown are final.
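
The listed rates translate into a per-request cost as sketched below. The sketch assumes every token is billed at the flat input or output rate shown here, whatever its modality, which this page does not state. `estimate_cost_usd` is a hypothetical helper.

```python
INPUT_PRICE_PER_M = 2.10   # USD per 1M input tokens, from the pricing list
OUTPUT_PRICE_PER_M = 6.30  # USD per 1M output tokens, from the pricing list


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD from its token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# 10,000 input tokens and 2,000 output tokens
print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # $0.0336
```

With the OpenAI Python SDK, the token counts for a completed request are typically available as `response.usage.prompt_tokens` and `response.usage.completion_tokens`.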

Quick Start

main.py
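from openai import OpenAI

# Point the OpenAI SDK at the Chuizi.AI endpoint.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",  # replace with your own ck- key
)

# Send a single-turn chat request to Qwen Omni Turbo.
response = client.chat.completions.create(
    model="qwen/qwen-omni-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print the text of the first reply.
print(response.choices[0].message.content)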
