Voice cloning + multilingual

Your voice. Their language.

Record a short voice sample and your customers will hear you on every call and in every chat, in 23 languages. Powered by the ReasonVoice models we built in-house.

How it works

Record a sample. Open the agent builder, hit Clone my voice, and read the script we provide. About 60 seconds is enough; longer samples produce a closer match.
We process it. The sample becomes a reference voice attached to your account. No extra training step on your side — the same audio drives both phone calls and the chat widget.
Pick it on any agent. Each agent has a voice setting; choose your cloned voice and a language. You can run different languages on different agents (e.g. one agent for English-speaking patients, another for Spanish), or one agent that switches based on the caller.
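
As a rough illustration of the per-agent voice setting, the sketch below models two agents that share one cloned voice but speak different languages, plus a single agent that follows the caller. The class name, field names, and the "my-cloned-voice" identifier are hypothetical and are not taken from the ReasonVoice API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentVoiceConfig:
    """Hypothetical per-agent voice settings; all field names are illustrative."""
    agent_name: str
    voice_id: str               # the cloned reference voice on the account
    language: str               # ISO 639-1 code, e.g. "en" or "es"
    auto_detect: bool = False   # follow the caller's language when True


# One cloned voice shared by two agents, each fixed to a single language.
english_agent = AgentVoiceConfig("front-desk-en", "my-cloned-voice", "en")
spanish_agent = AgentVoiceConfig("front-desk-es", "my-cloned-voice", "es")

# Alternatively, one agent that switches language based on the caller.
bilingual_agent = AgentVoiceConfig(
    "front-desk", "my-cloned-voice", "en", auto_detect=True
)
```

The point of the sketch is that the voice and the language are separate settings, so the same voice can be reused across as many agents as you like.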

The audio file you upload is stored as a reference, not used for model training. We don't fine-tune on customer voices.

Recording tips

Read in a normal speaking voice — the phone tone you'd use with a customer, not announcer-mode.
Quiet room, no echo. A USB headset mic is more than enough; you don't need a studio.
One person, one sample. The clone reproduces a single speaker's timbre.
Sample length: 30 seconds works, 60 seconds is the sweet spot, and anything beyond two minutes adds little.

Switching languages mid-conversation

The voice itself stays the same; the agent picks the language based on either:

The agent's configured language (set in the agent builder), or
What the caller is actually speaking — if you enable language auto-detect, the agent matches the caller turn-by-turn.

Auto-detect is useful for clinics and salons whose customers code-switch (English ↔ Spanish, English ↔ Hindi). Stick with a fixed language if you want predictable behaviour for a single audience.
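
The choice between a fixed language and auto-detect can be pictured as a small per-turn decision. The sketch below is illustrative only: the function name and parameters are hypothetical, and falling back to the configured language when detection yields nothing usable is an assumption, not documented behaviour.

```python
from typing import Optional, Set


def pick_reply_language(configured: str, auto_detect: bool,
                        detected: Optional[str], supported: Set[str]) -> str:
    """Return the language code for the agent's next turn."""
    # Assumption: use the configured language whenever auto-detect is off,
    # detection produced nothing, or the detected language is unsupported.
    if auto_detect and detected is not None and detected in supported:
        return detected
    return configured


supported = {"en", "es", "hi"}
# Detected language per caller turn; None means detection was inconclusive.
caller_turns = ["en", "es", None, "es"]

replies = [pick_reply_language("en", True, turn, supported) for turn in caller_turns]
print(replies)  # ['en', 'es', 'en', 'es']
```

With auto_detect set to False the same function always returns the configured language, which is the predictable behaviour a single-audience agent wants.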

Supported languages (23)

Every language below works on the chat widget and on phone calls — same model, same voice. ISO 639-1 codes shown for the few cases where they matter (API params, agent config).

Arabic (ar)
Chinese (zh)
Danish (da)
Dutch (nl)
English (en)
Finnish (fi)
French (fr)
German (de)
Greek (el)
Hebrew (he)
Hindi (hi)
Italian (it)
Japanese (ja)
Korean (ko)
Malay (ms)
Norwegian (no)
Polish (pl)
Portuguese (pt)
Russian (ru)
Spanish (es)
Swahili (sw)
Swedish (sv)
Turkish (tr)
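
If you pass language codes through an API parameter or agent config, it can help to validate them on your side first. The lookup below simply mirrors the list above; the constant and function names are hypothetical, and the product's actual validation may differ.

```python
# ISO 639-1 codes copied from the supported-languages list above.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "zh": "Chinese", "da": "Danish", "nl": "Dutch",
    "en": "English", "fi": "Finnish", "fr": "French", "de": "German",
    "el": "Greek", "he": "Hebrew", "hi": "Hindi", "it": "Italian",
    "ja": "Japanese", "ko": "Korean", "ms": "Malay", "no": "Norwegian",
    "pl": "Polish", "pt": "Portuguese", "ru": "Russian", "es": "Spanish",
    "sw": "Swahili", "sv": "Swedish", "tr": "Turkish",
}


def normalise_language_code(code: str) -> str:
    """Trim and lowercase a code, rejecting anything not in the list."""
    cleaned = code.strip().lower()
    if cleaned not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return cleaned


print(normalise_language_code(" ES "))  # es
```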

Numbers, dates, currency

The ReasonVoice TTS layer normalises numbers, times, dates, and currency before speaking, so “$1,250.00” comes out as “one thousand two hundred fifty dollars” and “3:30 PM” comes out as “three thirty PM”, with the equivalent spoken form in every other supported language. You don't need to spell anything out in your prompts.
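
To make the idea of text normalisation concrete, here is a minimal English-only sketch that reproduces the two examples above. It is not the ReasonVoice implementation: the function names are hypothetical, it handles only US dollar amounts under one million and 12-hour clock times, and a production normaliser covers far more cases and languages.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out 0 <= n < 1,000,000 in US-style English (no 'and')."""
    if not 0 <= n < 1_000_000:
        raise ValueError("this sketch only covers 0 to 999,999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " " + number_to_words(rest) if rest else ""
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = " " + number_to_words(rest) if rest else ""
    return number_to_words(thousands) + " thousand" + tail


def speak_dollars(text: str) -> str:
    """Turn a string like '$1,250.00' into spoken English."""
    match = re.fullmatch(r"\$(\d{1,3}(?:,\d{3})*|\d+)(?:\.(\d{2}))?", text)
    if match is None:
        raise ValueError(f"not a dollar amount: {text!r}")
    dollars = int(match.group(1).replace(",", ""))
    cents = int(match.group(2) or 0)
    spoken = f"{number_to_words(dollars)} dollar{'' if dollars == 1 else 's'}"
    if cents:
        spoken += f" and {number_to_words(cents)} cent{'' if cents == 1 else 's'}"
    return spoken


def speak_time(text: str) -> str:
    """Turn a 12-hour time like '3:30 PM' into spoken English."""
    match = re.fullmatch(r"(\d{1,2}):(\d{2})\s*([AaPp][Mm])", text)
    if match is None:
        raise ValueError(f"not a 12-hour time: {text!r}")
    hour, minute = int(match.group(1)), int(match.group(2))
    if not (1 <= hour <= 12 and minute < 60):
        raise ValueError(f"time out of range: {text!r}")
    suffix = match.group(3).upper()
    if minute == 0:
        return f"{number_to_words(hour)} {suffix}"
    if minute < 10:
        return f"{number_to_words(hour)} oh {number_to_words(minute)} {suffix}"
    return f"{number_to_words(hour)} {number_to_words(minute)} {suffix}"


print(speak_dollars("$1,250.00"))  # one thousand two hundred fifty dollars
print(speak_time("3:30 PM"))       # three thirty PM
```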

Limits

One cloned voice per account on Starter, up to 5 on Pro.
Voices are tied to your account, not shared across customers. Your competitor can't pick your voice from a public list.
Sample uploads accept WAV, MP3, FLAC, OGG. Max 25 MB per file.
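
If you upload samples programmatically, a client-side pre-check against the limits above can catch problems before the upload starts. This is a sketch under assumptions: the names are hypothetical, 1 MB is taken to mean 1,048,576 bytes, and the check looks only at the file extension, not the actual audio encoding.

```python
from pathlib import Path
from typing import List

ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}
MAX_UPLOAD_BYTES = 25 * 1024 * 1024       # assumption: 1 MB = 1,048,576 bytes
VOICE_LIMITS = {"starter": 1, "pro": 5}   # cloned voices per account


def check_sample_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format (use WAV, MP3, FLAC, or OGG)")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 25 MB")
    return problems


def can_add_voice(plan: str, existing_voices: int) -> bool:
    """True if the account is still below its plan's cloned-voice limit."""
    return existing_voices < VOICE_LIMITS[plan.lower()]


print(check_sample_upload("sample.wav", 4_000_000))  # []
print(can_add_voice("Starter", 1))                   # False
```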

Try it

The voice on the demo line at 1-877-9-REASON is a clone of our founder's voice, recorded the same way you would.