Bark
Suno AI
Bark is an open-source text-to-audio model from Suno AI (MIT license) that stands out for generating not just speech but a wide range of human vocalization, including laughter, sighs, crying, and gasps, as well as background music, all from a single text prompt that uses audio tags such as [laughs], [sighs], and [music]. Because it is a generative model rather than a traditional TTS system, its output is creative and variable rather than deterministic. It is slower than dedicated TTS models but hard to match for expressive, character-rich audio content. Best for game prototypes, creative audio projects, and applications where emotional expression and non-speech sounds are as important as intelligibility.
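As a rough illustration of the tag-based prompting described above, the sketch below assembles a prompt and passes it to Bark. The build_prompt and synthesize helpers, the KNOWN_TAGS set, and the output filename are hypothetical names chosen for this example. The Bark calls (preload_models, generate_audio, SAMPLE_RATE) follow the usage shown in the project's README, but treat the details as assumptions to check against the version you install.

```python
from typing import Iterable

# Tags named in the description above; Bark may recognise others.
KNOWN_TAGS = {"laughs", "sighs", "music"}


def build_prompt(parts: Iterable[str]) -> str:
    """Join text fragments and tag names into one Bark-style prompt.

    Hypothetical helper: a fragment that matches KNOWN_TAGS is wrapped in
    square brackets; everything else is passed through unchanged.
    """
    pieces = []
    for part in parts:
        part = part.strip()
        if not part:
            continue  # skip empty fragments
        pieces.append(f"[{part}]" if part in KNOWN_TAGS else part)
    return " ".join(pieces)


def synthesize(prompt: str, out_path: str = "bark_out.wav") -> str:
    """Generate audio for the prompt with Bark and write it to a WAV file."""
    # Imported lazily so build_prompt works without Bark or SciPy installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()                # loads model weights (downloads on first use)
    audio = generate_audio(prompt)  # NumPy array of audio samples
    write_wav(out_path, SAMPLE_RATE, audio)
    return out_path


prompt = build_prompt(["Well,", "sighs", "that was not the plan.", "laughs"])
print(prompt)  # Well, [sighs] that was not the plan. [laughs]
```

Calling synthesize(prompt) would then produce the audio file. Because generation is not deterministic, two runs with the same prompt can sound different, so it is common to generate several takes and keep the best one.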
Completely free (MIT license). Self-hosted. GPU recommended (VRAM-intensive). Available on Hugging Face for cloud inference.
Related platforms
Resemble AI
Resemble AI
Enterprise voice cloning with deepfake detection, watermarking, and Hollywood-grade synthesis.
Chatterbox
Resemble AI
MIT-licensed open-source TTS that beat ElevenLabs in blind tests — free self-hosted voice cloning.
Coqui XTTS
Coqui AI (Community)
Open-source multilingual voice cloning model — 17 languages, self-hosted, 6-second voice cloning.
Kokoro
Hexgrad
Open-source Apache 2.0 TTS model — 82M params, 210x real-time speed, runs free on any GPU.