Qwen Omni Turbo
Qwen
qwen/qwen-omni-turbo
Omni understanding: text+image+audio
Context Window: 32K (32,000 tokens)
Max Output: 8K (8,192 tokens)
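The two limits above can be turned into a simple per-request budget. The sketch below is only illustrative: it assumes that prompt and completion tokens share the 32,000-token context window, which this page does not state, and `plan_token_budget` is a hypothetical helper name, not part of any SDK.

```python
CONTEXT_WINDOW = 32_000  # tokens, from the figures above
MAX_OUTPUT = 8_192       # tokens, from the figures above


def plan_token_budget(requested_output: int) -> tuple[int, int]:
    """Return (max_tokens, prompt_budget) for a single request.

    Assumption (not confirmed by the model card): prompt and completion
    tokens are counted against the same context window.
    """
    if requested_output < 1:
        raise ValueError("requested_output must be positive")
    # Never ask for more completion tokens than the model can emit.
    max_tokens = min(requested_output, MAX_OUTPUT)
    # Whatever is left of the window is available for the prompt.
    prompt_budget = CONTEXT_WINDOW - max_tokens
    return max_tokens, prompt_budget


print(plan_token_budget(10_000))  # (8192, 23808)
```

The first value could be passed as `max_tokens` in a chat completion request; the second is a rough ceiling for the prompt under the shared-window assumption.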
About this model
Qwen Omni Turbo is Qwen's multi-modal understanding model. It accepts text, image, and audio input in the same request.
It is suited to scenarios that require cross-modal understanding. Access it through Chuizi.AI with an API key that starts with ck-.
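As a rough illustration of mixing modalities in one request, the sketch below builds a single user message with text, an image, and audio parts. The part shapes (`image_url`, `input_audio`) follow the OpenAI Chat Completions convention; whether the Chuizi.AI endpoint accepts exactly these shapes for this model is an assumption, and `build_omni_message` is a hypothetical helper.

```python
import base64


def build_omni_message(text, image_url=None, audio_bytes=None, audio_format="wav"):
    """Build one user message that mixes text, an image URL, and inline audio.

    Assumption: the gateway accepts OpenAI-style content parts for this model.
    """
    parts = [{"type": "text", "text": text}]
    if image_url is not None:
        # Image passed by URL, OpenAI-style.
        parts.append({"type": "image_url", "image_url": {"url": image_url}})
    if audio_bytes is not None:
        # Audio passed inline as base64, OpenAI-style.
        encoded = base64.b64encode(audio_bytes).decode("ascii")
        parts.append(
            {
                "type": "input_audio",
                "input_audio": {"data": encoded, "format": audio_format},
            }
        )
    return {"role": "user", "content": parts}


message = build_omni_message(
    "Does the audio describe what is in the picture?",
    image_url="https://example.com/photo.png",  # placeholder URL
    audio_bytes=b"placeholder-audio-bytes",     # placeholder, not real audio
)
print([part["type"] for part in message["content"]])
```

The returned dictionary can be used as an element of the `messages` list in a chat completion call.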
Highlights
Multi-modal
Text+image+audio
Fused understanding
General purpose
Best For
Multi-modal analysis
Audio-visual understanding
Content moderation
Cross-modal retrieval
2025-10-01
Capabilities
Chat
Vision
Audio
Tools
Aliases
qwen-omni-turbo
Pricing (per 1M tokens)
| Token type | Price per 1M tokens |
|---|---|
| Input | $2.10 |
| Output | $6.30 |
Final prices shown
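The listed rates translate into a per-request cost by simple arithmetic. The sketch below assumes billing is strictly linear in the input and output token counts; how image and audio inputs are converted to tokens is not stated on this page. `estimate_cost_usd` is a hypothetical helper.

```python
from decimal import Decimal

# Rates from the pricing table, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("2.10")
OUTPUT_PRICE_PER_M = Decimal("6.30")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request, assuming linear per-token billing."""
    total = INPUT_PRICE_PER_M * input_tokens + OUTPUT_PRICE_PER_M * output_tokens
    return total / TOKENS_PER_M


# 20,000 input tokens and 1,000 output tokens.
print(estimate_cost_usd(20_000, 1_000))  # 0.0483
```

`Decimal` is used instead of floats so that small amounts are not distorted by binary rounding.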
Quick Start
main.py
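from openai import OpenAI

# Point the OpenAI SDK at the Chuizi.AI endpoint and authenticate with a ck- key.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

# Send a single-turn chat request to Qwen Omni Turbo.
response = client.chat.completions.create(
    model="qwen/qwen-omni-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)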