GPT-4o Transcribe

OpenAI
openai/gpt-4o-transcribe

GPT-4o transcription — more accurate than Whisper

About this model

GPT-4o Transcribe builds on GPT-4o's multimodal capabilities to transcribe speech more accurately than Whisper-1. It supports structured output (JSON Schema) for extracting formatted data.

Supports mp3, mp4, m4a, wav, and webm audio files up to 25 MB.
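
The format and size limits above can be checked locally before uploading, which avoids a wasted request. The following is a minimal sketch, not part of any official SDK: the helper names `validate_audio` and `validate_audio_path` are hypothetical, and reading "25MB" as 25 * 1024 * 1024 bytes is an assumption (the service may count it differently).

```python
from pathlib import Path

# Limits copied from the model description above.
# Assumption: "25MB" means 25 * 1024 * 1024 bytes.
ALLOWED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".webm"}
MAX_BYTES = 25 * 1024 * 1024


def validate_audio(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(filename).suffix.lower()  # case-insensitive extension check
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems


def validate_audio_path(path: str) -> list[str]:
    """Convenience wrapper (hypothetical) that reads the size from disk."""
    p = Path(path)
    return validate_audio(p.name, p.stat().st_size)
```

Calling `validate_audio_path("audio.mp3")` before the transcription request returns an empty list when the file passes both checks. The check only looks at the file extension, not the actual container format.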

Highlights

Outperforms Whisper
Structured output
Multilingual
Long audio support

Best For

Meeting minutes
Interview transcription
Content extraction
Multilingual recognition

2025-03-20

Capabilities

Speech-to-text (STT)

Pricing (per 1M tokens)

Input: $6.30
Output: $10.50

Final prices shown
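
As a rough illustration of how the per-million-token rates translate into a bill, here is a small sketch. It assumes cost scales linearly with token counts at the listed rates ($6.30 input, $10.50 output per 1M tokens); the function name `estimate_cost_usd` is hypothetical, and how audio is converted into input tokens is not stated on this page.

```python
# Rates from the pricing section above, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 6.30
OUTPUT_USD_PER_MILLION = 10.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, assuming simple linear per-token billing."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Example: 10,000 input tokens and 2,000 output tokens.
print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # prints $0.0840
```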

Quick Start

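from openai import OpenAI

# Point the official OpenAI SDK at the gateway endpoint.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",  # replace with your own key
)

# Open the audio file in binary mode and send it for transcription.
with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="openai/gpt-4o-transcribe",
        file=f,
    )

# The transcribed text is available on the response object.
print(transcript.text)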
