Find answers to common questions about Dotclone’s voice AI platform, TTS, STT, and voice cloning.
Dotclone is a voice AI platform offering Text-to-Speech (TTS), Speech-to-Text (STT), and Voice Cloning services. Our neural models deliver studio-quality audio for developers, creators, and enterprises. Learn more about how our TTS, STT, and voice cloning work on our “How It Works” page.
To upload a voice sample, navigate to the Voice Cloning section of our dashboard. Click “Upload Sample” and select an audio file (MP3, WAV, AAC, or FLAC). Ensure the sample is at least 10 seconds long, and ideally 20 seconds or more, for optimal cloning accuracy.
For TTS, you can provide UTF-8 encoded text as plain text, Markdown, or HTML. For STT, we accept MP3, WAV, AAC, FLAC, and OGG audio. For voice samples, upload an MP3 or WAV file recorded at a sample rate of at least 16 kHz for best results.
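If you want to check a voice sample before uploading it, the guidance above can be turned into a small local pre-flight script. The following is a minimal sketch, not part of Dotclone's tooling: the function name check_voice_sample and the constants are hypothetical, and it assumes the MP3/WAV recommendation for voice samples, the 16 kHz minimum sample rate, and a 10-second minimum length. It uses only Python's standard library, which can read WAV headers but cannot decode MP3, so MP3 files are checked by extension only.

```python
import wave
from pathlib import Path

# Thresholds taken from the guidance above; all names here are illustrative.
SAMPLE_EXTENSIONS = {".mp3", ".wav"}
MIN_SAMPLE_RATE_HZ = 16_000
MIN_DURATION_S = 10.0


def check_voice_sample(path):
    """Return a list of problems found; an empty list means the file passed."""
    p = Path(path)
    ext = p.suffix.lower()
    if ext not in SAMPLE_EXTENSIONS:
        return [f"unsupported extension: {ext or '(none)'}"]
    if ext != ".wav":
        # The standard library cannot decode MP3, so only the extension is checked.
        return []

    # Read the sample rate and frame count from the WAV header.
    with wave.open(str(p), "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate

    problems = []
    if rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if duration < MIN_DURATION_S:
        problems.append(f"duration {duration:.1f} s is below {MIN_DURATION_S:.0f} s")
    return problems
```

Calling check_voice_sample("my_voice.wav") returns an empty list when the file meets these thresholds, or a list of human-readable problems otherwise. A file that passes this check may still be rejected by the dashboard for reasons the sketch does not cover, such as file size or audio quality.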
Within the TTS or Voice Cloning workflow, after uploading your text or sample, click “Voice Settings.” There you’ll find sliders for Pitch, Speed, and Emotion. Adjust these until you achieve the desired tone. Our “Voice Settings” page provides detailed guidance on each parameter.
To find available voices, click the “Voices” button to see the full list, then click “Use” next to the voice you want.