Vision

Supported Models

Chuizi.AI supports 59+ models with vision capabilities, including:

Provider | Models
-------- | ------
OpenAI / Azure | GPT-4.1, GPT-4.1-mini, GPT-4o, GPT-5, o3, o4-mini
Anthropic | Claude Opus 4, Claude Sonnet 4, Claude Sonnet 3.5, Claude Haiku 3.5
Google | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash
DeepSeek | DeepSeek V3 (via compatible endpoints)
Qwen | Qwen-VL-Max, Qwen-VL-Plus
Other | Llama 4 Scout, Llama 4 Maverick, Nova Pro/Lite

Check GET /v1/models for the full list -- models with "vision": true in their capabilities support image input.
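As a sketch, the capability flag can be filtered for client-side. The exact response shape of GET /v1/models is an assumption here (an OpenAI-style "data" list where each entry carries a "capabilities" object with the "vision": true flag mentioned above):

```python
def vision_models(models_payload: dict) -> list[str]:
    """Return ids of models whose capabilities advertise vision support."""
    return [
        m["id"]
        for m in models_payload.get("data", [])
        if m.get("capabilities", {}).get("vision") is True
    ]

# Hypothetical response fragment for illustration:
sample = {
    "data": [
        {"id": "openai/gpt-4.1", "capabilities": {"vision": True}},
        {"id": "some/text-only-model", "capabilities": {"vision": False}},
    ]
}
print(vision_models(sample))  # ['openai/gpt-4.1']
```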

Request Format

Images are sent as part of the content array in a message, alongside text:

request.json
json
{
  "model": "openai/gpt-4.1",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/photo.jpg"
          }
        }
      ]
    }
  ],
  "max_tokens": 1024
}
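Because the content array is plain JSON, the message above can be assembled programmatically. A minimal sketch (the helper name `image_message` is ours, not part of any SDK):

```python
def image_message(text: str, image_url: str) -> dict:
    """Build a user message pairing a text prompt with one image_url part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

payload = {
    "model": "openai/gpt-4.1",
    "messages": [
        image_message("What is in this image?", "https://example.com/photo.jpg")
    ],
    "max_tokens": 1024,
}
```

The same helper works for any vision-capable model, since the request format is shared across providers.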

Image Input Methods

Public URL

Pass a publicly accessible URL. The model fetches the image directly:

request.json
json
{
  "type": "image_url",
  "image_url": {
    "url": "https://example.com/photo.jpg"
  }
}

Supported formats: JPEG, PNG, GIF, WebP.

Base64 Data URL

Encode the image as base64 and embed it in the request using a data URL:

request.json
json
{
  "type": "image_url",
  "image_url": {
    "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQ..."
  }
}

Use base64 when:

  • The image is not publicly accessible.
  • You want to avoid an extra HTTP round-trip.
  • The image is generated dynamically.
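Building the data URL by hand is error-prone if the MIME prefix does not match the file. A small sketch that derives the prefix from the file extension with the standard-library mimetypes module (the helper name `to_data_url` is ours):

```python
import base64
import mimetypes

def to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL with the right MIME type."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        # Fall back when the extension is unrecognized.
        mime = "application/octet-stream"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"
```

The resulting string drops straight into the "url" field shown above.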

Code Examples

example.py
python
from openai import OpenAI
import base64

client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

# Using a public URL
response = client.chat.completions.create(
    model="openai/gpt-4.1",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)

# Using base64
with open("screenshot.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract all text from this screenshot."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{image_data}",
                        "detail": "high",
                    },
                },
            ],
        }
    ],
    max_tokens=2048,
)
print(response.choices[0].message.content)

Next Steps