Gladia
Gladia
Gladia ranked first in the 2026 async STT benchmark (8 providers, 7 datasets, 74 hours of audio). Its Solaria-1 model delivered an average 29% lower Word Error Rate on conversational speech and up to 3× lower Diarization Error Rate than competing APIs. Supports 100+ languages, including 42 unavailable on any other mainstream API (such as Bengali, Punjabi, Tagalog, Persian, Kazakh, and Haitian Creole), with native code-switching across the full language set. Speaker diarization, powered by pyannoteAI's Precision-2 model, is bundled into the base rate rather than sold as an add-on. EU data residency is available, a key differentiator for European teams. Pricing is bundled: the Scaling plan, at roughly $0.50/hr, includes most features without stacked add-on costs. Best for: async transcription of messy real-world audio with multilingual speakers, accents, and background noise.
Free: usage credits at signup. Starter: $0.0125/min. Growth: as low as $0.00417/min at volume. Scaling: ~$0.50/hr bundled (most features included). Enterprise: custom pricing, with EU data residency.
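The plans above mix per-minute and per-hour rates, so a rough conversion helps when comparing them. The sketch below is illustrative only. The plan names and rates are taken from the listing, but the helper names and the 1,000-hour monthly volume are hypothetical, "~$0.50/hr" is treated as exactly $0.50, and the Growth figure is the "as low as" floor rather than a guaranteed rate.

```python
# Rates as quoted in the listing; Growth is the "as low as" volume floor.
RATES_PER_MIN = {"Starter": 0.0125, "Growth": 0.00417}
# Assumption: "~$0.50/hr" is treated as exactly $0.50 for this estimate.
SCALING_PER_HOUR = 0.50


def per_hour(rate_per_min: float) -> float:
    """Convert a per-minute rate to a per-hour rate."""
    return round(rate_per_min * 60, 4)


def hourly_rates() -> dict[str, float]:
    """Return every listed plan expressed in dollars per hour of audio."""
    rates = {plan: per_hour(rate) for plan, rate in RATES_PER_MIN.items()}
    rates["Scaling"] = SCALING_PER_HOUR
    return rates


def monthly_cost(hours: float, rate_per_hour: float) -> float:
    """Estimate the cost of transcribing `hours` of audio at a given hourly rate."""
    return round(hours * rate_per_hour, 2)


if __name__ == "__main__":
    hours = 1000  # hypothetical monthly volume
    for plan, rate in hourly_rates().items():
        print(f"{plan}: ${rate}/hr, ${monthly_cost(hours, rate)} for {hours} hours")
```

Under these assumptions, Starter works out to $0.75/hr and the Growth floor to about $0.25/hr. Plans can differ in which features are included, so the hourly figure alone does not settle the comparison.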
Related platforms
Amazon Transcribe
Amazon Web Services
AWS's managed STT service, with the deepest AWS ecosystem integration, HIPAA eligibility, call analytics, and a medical model.
AssemblyAI
AssemblyAI
Speech AI platform for transcription and audio intelligence.
Deepgram
Deepgram
Enterprise speech-to-text and voice AI platform.
Google Cloud STT
Google's enterprise STT API — Chirp 3 HD with 125+ languages, streaming and batch, GCP ecosystem.