Billing Model
Chuizi.AI charges upstream provider cost x 1.05 with no hidden fees. All billing calculations use 8-decimal precision to ensure financial accuracy.
Three Billing Types
| Billing Type | Use Case | Unit |
|---|---|---|
| Per-token | Chat, Embedding, and text models | Per 1M tokens |
| Per-request | Image generation, TTS | Per request |
| Per-second | Speech-to-text (Whisper) | Per second of audio |
Per-Token Billing
Most models (Chat, Reasoning, Embedding) use per-token billing. Input tokens and output tokens are priced separately.
Formula:
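cost = (input_tokens x input_price + output_tokens x output_price) / 1,000,000 x 1.05
Prices are quoted per 1M tokens, so the token-weighted sum is divided by 1,000,000 before the multiplier is applied.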
Example — Claude Sonnet 4-6:
| Item | Count | Price (per 1M tokens) | Subtotal |
|---|---|---|---|
| Input tokens | 2,000 | $3.00 | $0.006000 |
| Output tokens | 500 | $15.00 | $0.007500 |
| Subtotal | — | — | $0.013500 |
| x 1.05 multiplier | — | — | $0.014175 |
If the request uses prompt caching, the cost also includes charges for cache_read_tokens and cache_write_tokens. See Cache Discount Pricing for details.
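The per-token formula can be sketched in Python as follows. This is a minimal illustration, not the gateway's actual implementation: the function name `per_token_cost` is hypothetical, cache tokens are left out, and rounding half-up to 8 decimal places is an assumption based on the stated 8-decimal precision.

```python
from decimal import Decimal, ROUND_HALF_UP

MULTIPLIER = Decimal("1.05")           # flat multiplier from this page
TOKENS_PER_UNIT = Decimal("1000000")   # prices are quoted per 1M tokens
EIGHT_PLACES = Decimal("0.00000001")   # 8-decimal precision (rounding mode is assumed)


def per_token_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the billed USD cost of one request, excluding cache tokens."""
    upstream = (
        Decimal(input_tokens) * Decimal(input_price)
        + Decimal(output_tokens) * Decimal(output_price)
    ) / TOKENS_PER_UNIT
    return (upstream * MULTIPLIER).quantize(EIGHT_PLACES, rounding=ROUND_HALF_UP)


# Reproduces the Claude Sonnet 4-6 example: 2,000 input and 500 output tokens.
cost = per_token_cost(2000, 500, "3.00", "15.00")
print(cost)  # 0.01417500
```

Passing prices as strings keeps the arithmetic in `Decimal` and avoids binary floating-point error.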
Token Subtypes
Some models report additional usage fields:
| Field | Description | Common Models |
|---|---|---|
| input_tokens | Input token count | All models |
| output_tokens | Output token count | All models |
| cache_read_tokens | Cache-hit token count | Anthropic, OpenAI, DeepSeek |
| cache_write_tokens | Cache-write token count | Anthropic |
| reasoning_tokens | Reasoning chain token count | o3, o4-mini, DeepSeek R1 |
Reasoning tokens are billed at the output price and included in the output_tokens total. For details on how billing is processed after each request, see Billing Flow.
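Because reasoning tokens are already counted inside output_tokens, adding them again would double-bill. The sketch below shows one way to split a usage record for display. The helper name `split_output_tokens` and the sample counts are hypothetical; only the field names come from the table above.

```python
def split_output_tokens(usage):
    """Split output_tokens into reasoning and visible parts without double counting."""
    output = usage["output_tokens"]
    reasoning = usage.get("reasoning_tokens", 0)  # absent on non-reasoning models
    if reasoning > output:
        raise ValueError("reasoning_tokens cannot exceed output_tokens")
    return {
        "billable_output": output,      # priced at the output rate, reasoning included
        "reasoning": reasoning,
        "visible": output - reasoning,  # tokens in the returned answer
    }


# Hypothetical usage record from a reasoning model.
usage = {"input_tokens": 1200, "output_tokens": 900, "reasoning_tokens": 600}
split = split_output_tokens(usage)
print(split["billable_output"], split["visible"])  # 900 300
```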
Per-Request Billing
Image generation and speech synthesis models charge a fixed price per request, regardless of token count.
Common model prices:
| Model | Billing Type | Upstream Price | Actual (x1.05) |
|---|---|---|---|
| Imagen 4.0 | Per request | $0.040 | $0.042 |
| DALL-E 3 (1024x1024) | Per request | $0.040 | $0.042 |
| Nova Canvas | Per request | $0.040 | $0.042 |
| GPT-4o-mini-tts | Per request | Per character | Per character x 1.05 |
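For the fixed-price image models, the billed amount is the upstream price times the number of requests times 1.05. A minimal sketch, assuming a hypothetical helper `per_request_cost` (it does not cover the per-character TTS row):

```python
from decimal import Decimal

MULTIPLIER = Decimal("1.05")


def per_request_cost(upstream_price, requests=1):
    """Return the billed USD cost for a number of fixed-price requests."""
    return Decimal(upstream_price) * requests * MULTIPLIER


single = per_request_cost("0.040")     # one image at the $0.040 upstream price
batch = per_request_cost("0.040", 3)   # three separate requests
```

With the table's $0.040 price, `single` equals $0.042 and `batch` equals $0.126.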
Per-Second Billing
Speech-to-text models are billed by audio duration.
| Model | Upstream Price | Actual (x1.05) |
|---|---|---|
| Whisper (gpt-4o-transcribe) | $0.00006/sec | $0.000063/sec |
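As a worked example of per-second billing, the sketch below prices a 90-second clip at the upstream rate from the table. The helper name `per_second_cost` is hypothetical, and how fractional seconds are rounded is not specified on this page, so the function simply multiplies the duration it is given.

```python
from decimal import Decimal

MULTIPLIER = Decimal("1.05")


def per_second_cost(duration_seconds, upstream_price_per_second):
    """Return the billed USD cost for an audio clip of the given duration."""
    return Decimal(duration_seconds) * Decimal(upstream_price_per_second) * MULTIPLIER


# 90 seconds x $0.00006/sec x 1.05
clip_cost = per_second_cost("90", "0.00006")
```

Here `clip_cost` equals $0.00567, which matches 90 seconds at the billed rate of $0.000063/sec.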
Query Model Prices
Call GET /v1/models to retrieve real-time pricing for all models:
curl https://api.chuizi.ai/v1/models -H "Authorization: Bearer ck-your-key"
Each model in the response includes a pricing field:
{ "id": "anthropic/claude-sonnet-4-6", "pricing": { "input": "3.00", "output": "15.00", "cache_read": "0.30", "cache_write": "3.75", "unit": "per_1m_tokens" } }
All prices shown are upstream costs. Your actual bill = displayed price x 1.05.
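Since the pricing field carries upstream prices as strings, a client can derive the billed rates itself. The sketch below parses the sample model object shown above and applies the multiplier. The function name `billed_prices` is hypothetical, and the overall shape of the full /v1/models response is not assumed here; only a single model object is handled.

```python
import json
from decimal import Decimal

MULTIPLIER = Decimal("1.05")


def billed_prices(model):
    """Map each upstream price in a model's pricing field to the billed price."""
    return {
        name: Decimal(value) * MULTIPLIER
        for name, value in model["pricing"].items()
        if name != "unit"  # "unit" is a label, not a price
    }


sample = json.loads("""
{
  "id": "anthropic/claude-sonnet-4-6",
  "pricing": {
    "input": "3.00",
    "output": "15.00",
    "cache_read": "0.30",
    "cache_write": "3.75",
    "unit": "per_1m_tokens"
  }
}
""")

prices = billed_prices(sample)
print(prices["input"], prices["output"])  # 3.1500 15.7500
```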
The 1.05x Multiplier
Chuizi.AI applies a flat 5% service fee covering gateway operations, multi-protocol support, smart routing, and automatic failover. For teams that would otherwise maintain separate accounts and API keys with each provider, the 5% fee is often lower than that operational overhead.
See Model Pricing for the full price list.
Next Steps
- Cache Discount Pricing — Save up to 90% on input costs with prompt caching
- Billing Flow — How balance reservations and reconciliation work
- Model Pricing — Complete price table for all supported models