GPT-4o Transcribe
OpenAI
openai/gpt-4o-transcribe
GPT-4o transcription, more accurate than Whisper
Context Window
N/A
Max Output
N/A
About this model
GPT-4o Transcribe leverages GPT-4o's multimodal capabilities to significantly outperform Whisper-1 in transcription accuracy. Supports structured output (JSON Schema) for formatted data extraction.
Supports mp3, mp4, m4a, wav, and webm audio files up to 25 MB.
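The format and size limits above can be checked locally before uploading. The following is a minimal sketch, not part of any official SDK: the helper name `check_audio_file` is hypothetical, and it assumes "25 MB" means 25 × 1024 × 1024 bytes, which this page does not specify.

```python
import os

# Formats and size limit as listed on this page.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".webm"}
# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
MAX_BYTES = 25 * 1024 * 1024


def check_audio_file(path: str, size_bytes: int) -> list[str]:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    # Compare the extension case-insensitively against the supported set.
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

For a file on disk, the size can be obtained with `os.path.getsize(path)` and passed as `size_bytes`. Note that this only inspects the file name and size; it does not verify the actual audio encoding.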
Highlights
Outperforms Whisper
Structured output
Multilingual
Long audio support
Best For
Meeting minutes
Interview transcription
Content extraction
Multilingual recognition
2025-03-20
Capabilities
STT
Aliases
gpt-4o-transcribe
Pricing (per 1M tokens)
| Token type | Price (USD per 1M tokens) |
|---|---|
| Input | $6.30 |
| Output | $10.50 |
Final prices shown
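As a rough illustration of how the per-1M-token prices translate into a bill, the sketch below multiplies token counts by the listed rates. The function name `estimate_cost` and the token counts in the usage line are hypothetical; this page does not say how audio duration maps to input tokens, so real counts should be taken from the API's usage reporting where available.

```python
from decimal import Decimal

# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("6.30")
OUTPUT_PRICE_PER_M = Decimal("10.50")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated cost in USD for the given token counts."""
    million = Decimal(1_000_000)
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / million


# Illustrative token counts only, not measured values.
cost = estimate_cost(200_000, 5_000)
```

`Decimal` is used instead of `float` so that currency amounts are not affected by binary rounding.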
Quick Start
main.py
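from openai import OpenAI

# Create a client that talks to the endpoint given in base_url;
# replace the placeholder API key with your own.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",
)

# Open the audio file in binary mode and request a transcription.
with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="openai/gpt-4o-transcribe",
        file=f,
    )

# The transcribed text is available on the response's `text` attribute.
print(transcript.text)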