Chatterbox
{{Infobox TTS model
| name = Chatterbox
| developer = [[Resemble AI]]
| release_date = May 2025
| latest_version = Multilingual 2.0
| architecture = [[CosyVoice 2.0]]-based
| parameters = 500 million
| training_data = 500,000 hours cleaned data
| languages = 23 languages (multilingual version)
| voices = Zero-shot voice cloning
| voice_cloning = Yes (5-second reference)
| emotion_control = Yes (exaggeration parameter)
| streaming = Yes
| latency = Sub-200ms
| license = [[MIT License]]
| open_source = Yes
| code_repository = [https://github.com/resemble-ai/chatterbox GitHub]
| model_weights = [https://huggingface.co/ResembleAI/chatterbox Hugging Face]
| demo = [https://huggingface.co/spaces/ResembleAI/Chatterbox HF Spaces]
| website = [https://www.resemble.ai/chatterbox/ resemble.ai/chatterbox]
}}

'''Chatterbox''' is an open-source [[text-to-speech]] (TTS) model developed by [[Resemble AI]] and released in May 2025. Built on a [[CosyVoice|CosyVoice 2.0]]-style modified [[Llama]] architecture with 500 million parameters, it is marketed as the first open-source TTS model to include controllable emotion exaggeration. It has drawn attention for its developers' claim that it outperforms established commercial systems in user preference evaluations.

== Development and Release ==

Chatterbox was developed by a three-person team at Resemble AI, a voice technology company founded by Zohaib Ahmed and Saqib Muhammad.<ref>https://www.digitalocean.com/community/tutorials/resemble-chatterbox-tts-text-to-speech</ref> The initial English-only version was released in May 2025 under the [[MIT License]], followed by a multilingual version supporting 23 languages in September 2025.<ref name="multilingual">https://www.resemble.ai/introducing-chatterbox-multilingual-open-source-tts-for-23-languages/</ref>

The project quickly gained popularity in the open-source community, accumulating over 1 million downloads on [[Hugging Face]] and more than 11,000 stars on [[GitHub]] within weeks of release.<ref name="multilingual" />

== Technical Architecture ==

Chatterbox uses a 500-million-parameter model based on a CosyVoice-style modified Llama architecture, which makes it significantly smaller than many contemporary TTS systems. The model was trained on approximately 500,000 hours of cleaned audio data and employs what the developers term "alignment-informed inference" to improve stability during generation.

Key technical features include:

* '''Zero-shot voice cloning''': Ability to clone a voice from as little as 5 seconds of reference audio
* '''Emotion exaggeration control''': A parameter that lets users adjust emotional intensity from monotone to dramatically expressive
* '''Fast inference''': Sub-200ms latency for real-time applications
* '''Multilingual support''': The updated version supports 23 languages, including Arabic, Chinese, Hindi, and major European languages

== Performance Claims and Evaluation ==

Resemble AI conducted a comparative evaluation through [[Podonos]], a third-party evaluation service, testing Chatterbox against [[ElevenLabs]], a leading commercial TTS system.
In blind A/B testing, 63.75% of evaluators reportedly preferred Chatterbox's output over ElevenLabs.<ref>https://www.podonos.com/blog/chatterbox</ref><ref>https://www.resemble.ai/chatterbox/</ref>

However, these results should be interpreted with caution, as the evaluation was limited in scope and conducted by a single third-party service. The testing methodology, sample size, and demographic composition of evaluators have not been independently verified. Additionally, the comparison was limited to a single competitor rather than a comprehensive benchmark against multiple state-of-the-art systems.

== Commercial and Research Impact ==

The release of Chatterbox has been significant for the open-source TTS community, representing one of the first production-grade systems to be freely available under a permissive license. This has enabled developers to integrate high-quality TTS capabilities into applications without licensing costs or vendor dependencies.

The system has found applications in various domains, including:

* Audiobook generation and voice narration
* Game development for non-player character dialogue
* Educational content creation
* Accessibility tools for visually impaired users
* Research and development in speech synthesis

Resemble AI also offers a commercial "Pro" version with enhanced features, service-level agreements, and custom fine-tuning capabilities for enterprise customers requiring guaranteed performance and support. This version is available through their inference partners, such as FAL.

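As an illustration of the kind of integration described in this section, the following is a minimal Python sketch of calling the open-source model with voice cloning and emotion exaggeration. The names <code>chatterbox.tts</code>, <code>ChatterboxTTS.from_pretrained</code>, <code>generate</code>, <code>audio_prompt_path</code>, and <code>exaggeration</code> are assumptions based on the project's published usage example, not on anything stated in this article, and may differ between versions. The helper functions, the default value of 0.5, and the non-negativity check are hypothetical and exist only for this example.

```python
from typing import Any, Dict, Optional


def build_generate_kwargs(
    reference_wav: Optional[str] = None,
    exaggeration: float = 0.5,
) -> Dict[str, Any]:
    """Collect keyword arguments for a Chatterbox-style generate() call.

    Hypothetical helper: the default of 0.5 and the non-negativity check
    are assumptions made for this sketch, not documented limits.
    """
    if exaggeration < 0:
        raise ValueError("exaggeration must be non-negative")
    kwargs: Dict[str, Any] = {"exaggeration": exaggeration}
    if reference_wav is not None:
        # Assumed parameter name for the short reference clip used for cloning.
        kwargs["audio_prompt_path"] = reference_wav
    return kwargs


def synthesize(model: Any, text: str, **options: Any) -> Any:
    """Call model.generate(text, ...) with the options assembled above."""
    return model.generate(text, **build_generate_kwargs(**options))


def load_model(device: str = "cpu") -> Any:
    """Load the pretrained model (assumed import path and constructor)."""
    # Imported lazily so the helpers above work without the package installed.
    from chatterbox.tts import ChatterboxTTS

    return ChatterboxTTS.from_pretrained(device=device)


# Example use (requires the chatterbox package and downloaded weights):
#   model = load_model("cuda")
#   wav = synthesize(model, "Hello there.", reference_wav="speaker.wav",
#                    exaggeration=0.7)
```

Keeping the argument assembly separate from model loading means the option handling can be checked without downloading any weights.
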
== External Links ==

* [https://github.com/resemble-ai/chatterbox Official Chatterbox repository]
* [https://huggingface.co/ResembleAI/chatterbox Model on Hugging Face]
* [https://huggingface.co/spaces/ResembleAI/Chatterbox Interactive demo]
* [https://resemble-ai.github.io/chatterbox_demopage/ Demo page with audio samples]

[[Category:Speech synthesis]]
[[Category:Open-source software]]
[[Category:Artificial intelligence]]
[[Category:Voice technology]]
[[Category:MIT License software]]