Name: Dotclone
Author: Dotclone

Overview

Dotclone's Text-to-Speech (TTS) engine lets you convert text into human-like speech instantly. Whether you're building voice agents, creating audio content, or adding speech to your app, TTS provides:

Multiple Providers

Choose from ElevenLabs, MiniMax, OpenAI, and more

Voice Cloning

Clone any voice with 10-30 seconds of audio

Emotion & Enhancement

Add emotions, whispers, laughs, and more

50+ Languages

Generate speech in multiple languages and accents

The Text-to-Speech interface in Dotclone

How to Generate Speech

Follow these steps to generate your first speech audio:

1

Enter Your Text

Type or paste the text you want to convert.

Navigate to Text-to-Speech in the sidebar. Enter your text in the large input area. You can enter anything from a single sentence to multiple paragraphs.

Enter your text in the input area

2

Select a Provider

Choose your preferred TTS provider.

Click on the provider dropdown and select from the available options. Each provider has unique voice qualities and features.

Choose your TTS provider

3

Choose a Voice

Select a voice that fits your needs.

Browse the available voices for your selected provider. You can preview voices before selecting. Each provider has its own set of built-in voices.

Browse and select a voice

Preview a voice before selecting

4

Generate & Download

Create your audio and save it.

Click the Generate button. Once complete, you can play the audio, download it, or share it.

Click Generate to create your audio

Download or share your generated audio

Writing Better Text for TTS

The quality of your output depends heavily on how you write your input text. Here are tips for better results:

Use Punctuation

Proper punctuation creates natural pauses. Use commas, periods, and ellipses for pacing.

✓ "Hello, welcome to Dotclone. How can I help you today?"

Break Long Sentences

Short sentences sound more natural. Break complex sentences into smaller parts.

✗ "Our company which was founded in 2020 provides AI voice solutions for businesses of all sizes." ✓ "Our company was founded in 2020. We provide AI voice solutions for businesses of all sizes."

Spell Out Abbreviations

Write out abbreviations and acronyms for proper pronunciation.

✗ "Call our HQ ASAP" ✓ "Call our headquarters as soon as possible"

Use Phonetic Spelling

For unusual names or words, spell them phonetically.

✓ "Welcome to Dotclone (dot-clone)"

Example of well-formatted text for TTS

Providers

Dotclone integrates with multiple TTS providers. Each provider has its own voices, models, and settings. Choose based on your quality, speed, and cost requirements.

Provider	Models	Best For	Features
ElevenLabs Recommended	`eleven_v3` `eleven_multilingual_v2` `eleven_turbo_v2_5` `eleven_flash_v2_5`	Highest quality, expressive voices	Emotion enhancement, voice cloning
MiniMax	`speech-2.8-turbo` `speech-2.8-hd` `speech-2.6-turbo` `speech-2.6-hd`	Emotion brackets, sound effects	Background noise, emotion tags
OpenAI	`tts-1` `tts-1-hd` `gpt-4o-mini-tts`	Fast, reliable, good quality	Speed control, multiple voices

Select a provider from the dropdown

Each provider has multiple models to choose from

Provider-Specific Voices

Each provider has its own built-in voices. When you switch providers, the voice list will update to show only voices available for that provider.

Voices

Each provider offers a unique set of built-in voices. Browse, preview, and select the voice that best fits your use case.

Built-in Voices

Built-in voices are pre-trained voices provided by each TTS provider. They cover various genders, accents, ages, and speaking styles.

Browse available voices for your selected provider

View voice details including language and style

Selecting a Voice

Click on any voice to select it. You can preview the voice by clicking the play button next to each voice name before making your selection.

Click to select a voice

Voice Cloning

Clone any voice with just 10-30 seconds of audio. Cloned voices can be used with any TTS provider in Dotclone.

Cross-Provider Compatibility

Once you clone a voice, it becomes available across all providers. Select any provider, and your cloned voice will appear in the voice list alongside built-in voices.

How to Clone a Voice

1

Go to Voice Cloning

Navigate to the Voice Cloning section.

Click Voice Cloning in the sidebar

2

Upload Audio Sample

Upload 10-30 seconds of clear audio.

For best results, use audio that is:

Clear speech without background noise
Single speaker only
10-30 seconds in length
Natural speaking pace (not reading)

Upload your audio sample

3

Name Your Voice

Give your cloned voice a descriptive name.

Enter a name for your cloned voice

4

Clone & Use

Process the voice and start using it.

Click Clone Voice to process. Once complete, your cloned voice will appear in the voice dropdown when using TTS.

Your cloned voice is ready to use

Select your cloned voice in the TTS interface

Deleting a Cloned Voice

To delete a cloned voice, go to Voice Cloning, find the voice you want to remove, and click the Delete button.

Click the delete button to remove a cloned voice

Deletion is Permanent

Once deleted, a cloned voice cannot be recovered. You'll need to upload a new audio sample to create it again.

Enhance & Emotion

Add emotions, expressions, and style to your generated speech. Make your audio more engaging and natural-sounding.

Supported Models

Emotion enhancement is only supported by these models:

ElevenLabs: eleven_v3
MiniMax: speech-2.8-turbo, speech-2.8-hd

Using Emotion Brackets

Add emotion tags in square brackets directly in your text. The TTS engine will interpret these and apply the corresponding emotion to the speech.

Syntax

[emotion] Your text here

Examples

[excited] Wow, this is amazing! [laughs] I can't believe it worked! [whispers] Don't tell anyone...

[sad] I'm sorry to hear that happened.

[angry] This is completely unacceptable!

[cheerful] Good morning! How can I help you today?

Available Emotion Tags

[excited] [sad] [angry] [cheerful] [whispers] [laughs] [sighs] [surprised] [nervous] [calm] [serious] [friendly]

Add emotion brackets directly in your text

Use the Enhance button for automatic emotion suggestions

Provider Settings

Each provider and model has specific settings you can adjust to fine-tune your audio output.

ElevenLabs Settings

`eleven_v3`

Setting	Range	Description
Stability	0% - 100%	Higher = more consistent, Lower = more expressive

`eleven_multilingual_v2`, `eleven_turbo_v2_5`, `eleven_flash_v2_5`

Setting	Range	Description
Speed	0.5x - 2.0x	Playback speed of the generated audio
Stability	0% - 100%	Voice consistency vs expressiveness
Clarity	0% - 100%	Clarity and enhancement of the voice

MiniMax & OpenAI Settings

`speech-2.8-turbo`, `speech-2.8-hd`, `speech-2.6-turbo`, `speech-2.6-hd`, `tts-1`, `tts-1-hd`, `gpt-4o-mini-tts`

Setting	Range	Description
Speed	0.5x - 2.0x	Playback speed
Pitch	-12 to +12	Voice pitch (higher or lower)
Intensity	0% - 100%	Emotional intensity
Timber	0% - 100%	Voice warmth and tone color

Adjust provider-specific settings in the settings panel

Use sliders to fine-tune your audio output

Sound Effects & Background Noise

Add ambient sounds and background noise to make your audio more immersive. This feature is available for MiniMax provider only.

MiniMax Only

Background noise and sound effects are only available when using MiniMax as your TTS provider.

Available Background Sounds

Choose from a variety of ambient sounds to mix with your generated speech:

🏢 Office Ambience

☕ Coffee Shop

🚗 Traffic / Street

🌧️ Rain

🌊 Ocean Waves

🌲 Forest / Nature

✈️ Airplane Cabin

🏠 Home Interior

Select a background sound from the dropdown

Adjust the mix level between speech and background

Supported Languages

Dotclone TTS supports over 50 languages and accents across all providers. Language availability may vary by provider and voice.

Popular Languages

English (US, UK, AU)
Spanish (ES, MX)
French (FR, CA)
German
Italian
Portuguese (BR, PT)
Japanese
Korean
Chinese (Mandarin)
Arabic

European Languages

Dutch
Polish
Russian
Swedish
Norwegian
Danish
Finnish
Greek
Czech
Romanian

Asian Languages

Hindi
Thai
Vietnamese
Indonesian
Malay
Filipino
Tamil
Bengali
Urdu
Turkish

Select language when choosing a voice

Output Formats

Choose your preferred audio format when generating or downloading speech.

Format	Extension	Best For	Quality
MP3	.mp3	General use, web, sharing	Good (compressed)
WAV	.wav	Professional audio, editing	Highest (uncompressed)
OGG	.ogg	Web applications, games	Good (compressed)
FLAC	.flac	Archival, lossless storage	Highest (lossless)

Select your preferred output format before downloading

TTS History

Every speech you generate is automatically saved to your history. Access, replay, or download any previously generated audio.

Accessing Your History

Click the History tab in the TTS interface to view all your previously generated audio files.

Click the History tab to view past generations

Browse your generated audio history

History Features

Replay: Listen to any previous generation
Download: Download the audio file again
Copy Text: Copy the original text used
Regenerate: Generate again with same or different settings
Delete: Remove from history

Actions available for each history item

Storage Limits

History storage depends on your plan. Free accounts store the last 50 generations. Paid plans have higher or unlimited storage.

API Usage

Generate speech programmatically using the Dotclone API or SDK.

Basic TTS Generation

from dotclone import Dotclone

client = Dotclone()

# Generate speech
audio = client.tts.generate(
    text="Hello! Welcome to Dotclone.",
    provider="elevenlabs",
    model="eleven_v3",
    voice="emma",
    settings={
        "stability": 75
    }
)

# Save the audio
audio.save("welcome.mp3")

# Or get the URL
print(audio.url)

With Emotion Enhancement

# Using emotion brackets
audio = client.tts.generate(
    text="[excited] Wow, this is amazing! [laughs] I love it!",
    provider="elevenlabs",
    model="eleven_v3",
    voice="emma"
)

audio.save("excited_speech.mp3")

Using a Cloned Voice

# Use your cloned voice
audio = client.tts.generate(
    text="This is my cloned voice speaking.",
    provider="elevenlabs",
    voice="my-cloned-voice-id"  # Your cloned voice ID
)

audio.save("cloned_voice.mp3")

Basic TTS Generation

import Dotclone from 'dotclone';

const client = new Dotclone();

// Generate speech
const audio = await client.tts.generate({
  text: "Hello! Welcome to Dotclone.",
  provider: "elevenlabs",
  model: "eleven_v3",
  voice: "emma",
  settings: {
    stability: 75
  }
});

// Get the audio URL
console.log(audio.url);

// Or download
await audio.download("welcome.mp3");

With Emotion Enhancement

// Using emotion brackets
const audio = await client.tts.generate({
  text: "[excited] Wow, this is amazing! [laughs] I love it!",
  provider: "elevenlabs",
  model: "eleven_v3",
  voice: "emma"
});

console.log(audio.url);

Basic TTS Generation

curl -X POST "https://api.dotclone.com/v1/tts/generate" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello! Welcome to Dotclone.",
    "provider": "elevenlabs",
    "model": "eleven_v3",
    "voice": "emma",
    "settings": {
      "stability": 75
    }
  }'

Response

{
  "id": "tts_abc123",
  "url": "https://cdn.dotclone.com/audio/tts_abc123.mp3",
  "duration": 2.5,
  "format": "mp3",
  "created_at": "2025-01-15T10:30:00Z"
}

For complete API documentation, see the TTS API Reference.

Related Features

Explore more Dotclone features:

Voice Cloning

Clone any voice in minutes

AI Agents

Use TTS voices in your agents

Background Noise

Add ambient sounds to audio

Text-to-Speech

Overview

Multiple Providers

Voice Cloning

Emotion & Enhancement

50+ Languages

How to Generate Speech

Enter Your Text

Select a Provider

Choose a Voice

Generate & Download

Writing Better Text for TTS

Use Punctuation

Break Long Sentences

Spell Out Abbreviations

Use Phonetic Spelling

Providers

Voices

Built-in Voices

Selecting a Voice

Voice Cloning

How to Clone a Voice

Go to Voice Cloning

Upload Audio Sample

Name Your Voice

Clone & Use

Deleting a Cloned Voice

Enhance & Emotion

Using Emotion Brackets

Syntax

Examples

Available Emotion Tags

Provider Settings

ElevenLabs Settings

eleven_v3

eleven_multilingual_v2, eleven_turbo_v2_5, eleven_flash_v2_5

MiniMax & OpenAI Settings

speech-2.8-turbo, speech-2.8-hd, speech-2.6-turbo, speech-2.6-hd, tts-1, tts-1-hd, gpt-4o-mini-tts

Sound Effects & Background Noise

Available Background Sounds

Supported Languages

Popular Languages

European Languages

Asian Languages

Output Formats

TTS History

Accessing Your History

History Features

API Usage

Basic TTS Generation

With Emotion Enhancement

Using a Cloned Voice

Basic TTS Generation

With Emotion Enhancement

Basic TTS Generation

Response

Related Features

Voice Cloning

AI Agents

Background Noise

`eleven_v3`

`eleven_multilingual_v2`, `eleven_turbo_v2_5`, `eleven_flash_v2_5`

`speech-2.8-turbo`, `speech-2.8-hd`, `speech-2.6-turbo`, `speech-2.6-hd`, `tts-1`, `tts-1-hd`, `gpt-4o-mini-tts`