GPT 4o Transcribe

OpenAI
openai/gpt-4o-transcribe

GPT-4o transcription, more accurate than Whisper

Context Window

Not listed

Max Output

Not listed

About this model

GPT-4o Transcribe leverages GPT-4o's multimodal capabilities to significantly outperform Whisper-1 in transcription accuracy. Supports structured output (JSON Schema) for formatted data extraction.

Supports mp3, mp4, m4a, wav, webm audio formats up to 25MB.
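A client-side check can catch unsuitable files before they are uploaded. The sketch below is only an illustration: `audio_problems` is a hypothetical helper, not part of any SDK, and it assumes that "25MB" means 25 * 1024 * 1024 bytes and that the format can be judged from the file extension alone.

```python
import os

# Formats taken from the list above.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".webm"}
# Assumption: "25MB" is read as 25 MiB; the service may count it differently.
MAX_BYTES = 25 * 1024 * 1024


def audio_problems(filename: str, size_bytes: int) -> list[str]:
    """Return human-readable problems; an empty list means the file looks fine."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

For a file on disk, call it as `audio_problems(path, os.path.getsize(path))` and upload only when the result is empty.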

Highlights

Outperforms Whisper
Structured output
Multilingual
Long audio support

Best For

Meeting minutes
Interview transcription
Content extraction
Multilingual recognition
2025-03-20

Capabilities

STT

Aliases

gpt-4o-transcribe

Pricing (per 1M tokens)

Input / 1M: $6.30
Output / 1M: $10.50

Final prices shown
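The per-token rates listed above translate into a simple cost estimate. This is a rough sketch, assuming billing is strictly linear in token count with no minimums or rounding; `estimate_cost_usd` is a hypothetical helper and the token counts in the usage note are made up for illustration.

```python
# Rates from the pricing section, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 6.30
OUTPUT_USD_PER_MILLION = 10.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the charge for one request, assuming purely linear billing."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000
```

For example, 200,000 input tokens and 5,000 output tokens would come to 0.2 × 6.30 + 0.005 × 10.50 = 1.26 + 0.0525, or about $1.31.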

Quick Start

main.py
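from openai import OpenAI

# Point the OpenAI SDK at the gateway endpoint; replace the key with your own.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

# Open the audio in binary mode and send it for transcription.
with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="openai/gpt-4o-transcribe",
        file=f,
    )

# The transcribed text is available on the response object.
print(transcript.text)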

