ChatTTS: Multi-Language Text-to-Speech System for Natural Dialogue Synthesis in English and Chinese

ChatTTS is a text-to-speech (TTS) system designed for conversational scenarios, supporting both English and Chinese. It is especially suited to dialogue tasks in large language model (LLM) assistants and to applications such as conversational audio and video introductions. Trained on approximately 100,000 hours of data, ChatTTS delivers high-quality, natural-sounding speech synthesis.

Key Features:

Multi-Language Support: ChatTTS supports both English and Chinese, serving users in either language.
Extensive Training Data: With around 100,000 hours of training data, ChatTTS provides high-quality voice synthesis that sounds natural.
Dialogue Task Optimization: It is optimized for dialogue tasks, generating conversational speech that improves user interaction when integrated into applications.
Open Source Availability: The project team plans to release a trained base model as open source so that researchers and developers can study and build on the technology.
Control and Security Measures: There are ongoing efforts to improve model controllability, add watermarks, and integrate with LLMs to ensure safety and reliability.
User-Friendly Experience: ChatTTS needs only text input to generate the corresponding voice file, so it is easy to use for anyone who needs speech synthesis.
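
The text-in, voice-file-out workflow described above can be sketched roughly as follows. This is a minimal illustration, not ChatTTS's own API: `save_wav`, `text_to_wav`, and `placeholder_tone` are hypothetical names, the 24 kHz sample rate is an assumption, and the placeholder produces a plain tone rather than speech. In real use, the `synthesize` callable would wrap the model's inference call as documented by the project.

```python
import math
import struct
import wave
from typing import Callable, Sequence


def save_wav(samples: Sequence[float], path: str, sample_rate: int = 24000) -> None:
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    The 24 kHz default is an assumption; use the rate your model outputs.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overflow
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


def text_to_wav(
    text: str,
    path: str,
    synthesize: Callable[[str], Sequence[float]],
    sample_rate: int = 24000,
) -> None:
    """Turn text into a voice file using any synthesize(text) -> samples callable."""
    save_wav(synthesize(text), path, sample_rate)


def placeholder_tone(text: str, sample_rate: int = 24000) -> list:
    """Hypothetical stand-in for a TTS model: a 440 Hz tone, 0.05 s per character."""
    n = (sample_rate // 20) * max(1, len(text))
    return [0.3 * math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    # Swap placeholder_tone for a function that calls the real model.
    text_to_wav("Hello from a TTS sketch.", "output.wav", placeholder_tone)
```

Passing the synthesizer in as a callable keeps the file-writing step independent of any particular model, so the same wrapper works whichever TTS backend supplies the samples.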
