OpenAI Whisper
OpenAI
OpenAI offers two speech-to-text (STT) paths: a managed API and self-hosted Whisper. On the API side, gpt-4o-transcribe posts an 8.9% word error rate (WER) on Artificial Analysis benchmarks, the highest accuracy of any managed API. gpt-4o-mini-transcribe, at $0.003/min, is the best price-performance option for teams already in the OpenAI ecosystem, and gpt-realtime-whisper handles live transcription at $0.017/min. On the self-hosted side, Whisper Large-v3 (MIT license, open weights) carries no license fee and runs on your own GPU. It is the baseline open-source model, available via Hugging Face or hosted through Groq, Fireworks, and Replicate at $0.0003–$0.003/min. Note: Whisper Large-v3 is no longer the accuracy leader on any major benchmark, though it remains the dominant self-hosted choice by usage volume. Best for: batch transcription where accuracy matters most, feeding transcripts into GPT workflows, and self-hosted deployments requiring full control.
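For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of how word error rate is commonly computed: word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The function name `word_error_rate` and the sample sentences are illustrative assumptions, and this sketch splits on whitespace only. Published benchmarks typically normalize case and punctuation first, so their numbers will not match this simplified version exactly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length (hypothetical helper)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences: one substitution and one deletion over six reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.1%}")
```

Read this way, an 8.9% WER means roughly 9 word-level errors per 100 reference words.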
API (managed): gpt-4o-mini-transcribe $0.003/min ($0.18/hr). gpt-4o-transcribe $0.006/min ($0.36/hr). gpt-realtime-whisper $0.017/min ($1.02/hr). Self-hosted Whisper: no license fee (MIT); you supply and pay for your own GPU. Via Groq: from $0.0003/min.
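To compare the managed options for a given workload, the per-minute rates can be turned into a quick estimate. This is a rough sketch, assuming billing is strictly linear in audio minutes with no minimums, rounding rules, or volume discounts. The names `RATES_PER_MIN` and `estimate_cost` are illustrative and not part of any OpenAI SDK; the rates are copied from the pricing line above.

```python
# Per-minute rates in USD, copied from the pricing line above.
RATES_PER_MIN = {
    "gpt-4o-mini-transcribe": 0.003,
    "gpt-4o-transcribe": 0.006,
    "gpt-realtime-whisper": 0.017,
}


def estimate_cost(model: str, audio_minutes: float) -> float:
    """Estimated USD cost, assuming linear per-minute billing (an assumption)."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    # Round to 4 decimals to hide floating-point noise in the product.
    return round(RATES_PER_MIN[model] * audio_minutes, 4)


if __name__ == "__main__":
    # Example workload: 500 hours of audio per month (a made-up figure).
    minutes = 500 * 60
    for model in RATES_PER_MIN:
        print(f"{model}: ${estimate_cost(model, minutes):,.2f}")
```

Self-hosted Whisper is left out because its cost depends on GPU pricing and utilization, which the listing does not specify.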
Related platforms
Amazon Transcribe
Amazon Web Services
AWS's managed STT service, with the deepest AWS ecosystem integration, HIPAA eligibility, call analytics, and a medical model.
AssemblyAI
AssemblyAI
Speech AI platform for transcription and audio intelligence.
Deepgram
Deepgram
Enterprise speech-to-text and voice AI platform.
Gladia
Gladia
#1 async STT accuracy in 2026 — Solaria-1 with 29% lower WER, 100+ languages, EU data residency.