Voice cloning + multilingual

Your voice. Their language.

Record a short voice sample and your customers will hear you on every call and in every chat, in 23 languages. Powered by the ReasonVoice models we built in-house.

How it works

  1. Record a sample. Open the agent builder, hit Clone my voice, and read the script we provide. About 60 seconds is enough; longer samples produce a closer match.
  2. We process it. The sample becomes a reference voice attached to your account. No extra training step on your side — the same audio drives both phone calls and the chat widget.
  3. Pick it on any agent. Each agent has a voice setting; choose your cloned voice and a language. You can run different languages on different agents (e.g. one agent for English-speaking patients, another for Spanish), or one agent that switches based on the caller.

The audio file you upload is stored as a reference, not used for model training. We don't fine-tune on customer voices.

Recording tips

  • Read in a normal speaking voice — the phone tone you'd use with a customer, not announcer-mode.
  • Quiet room, no echo. A USB headset mic is more than enough; you don't need a studio.
  • One person, one sample. The clone reproduces a single speaker's timbre.
  • Sample length: 30 seconds works, 60 seconds is the sweet spot, and anything beyond two minutes adds little.
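
If you want to check a recording before uploading it, the length guidance above can be sketched as a small pre-flight script. This is only an illustration: the function names and the labels are made up for this example, the WAV-only duration helper relies on Python's standard `wave` module, and treating anything under 30 seconds as too short is an assumption (the tips only say 30 seconds works).

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (WAV only, via the stdlib)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def rate_sample_length(seconds: float) -> str:
    """Map a duration onto the length guidance from the recording tips.

    The labels are hypothetical; only the 30 s / 60 s / 2 min thresholds
    come from the documentation.
    """
    if seconds < 30:
        return "too short"           # assumption: the docs don't cover this case
    if seconds < 60:
        return "usable"              # "30 seconds works"
    if seconds <= 120:
        return "recommended"         # "60 seconds is the sweet spot"
    return "longer than needed"      # "beyond two minutes adds little"
```

For example, `rate_sample_length(wav_duration_seconds("my_sample.wav"))` would tell you whether a local recording falls in the recommended range.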

Switching languages mid-conversation

The voice itself stays the same; the agent picks the language based on either:

  • The agent's configured language (set in the agent builder), or
  • What the caller is actually speaking — if you enable language auto-detect, the agent matches the caller turn-by-turn.

Auto-detect is useful for clinics and salons whose customers code-switch (English ↔ Spanish, English ↔ Hindi). Stick with a fixed language if you want predictable behaviour for a single audience.
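
The choice between a fixed language and auto-detect can be pictured as a per-turn decision. The sketch below is a mental model, not the product's code: `AgentVoiceConfig`, `auto_detect`, and `reply_language` are hypothetical names, and falling back to the configured language when nothing is detected is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentVoiceConfig:
    configured_language: str   # ISO 639-1 code set in the agent builder
    auto_detect: bool = False  # hypothetical name for the auto-detect switch


def reply_language(config: AgentVoiceConfig, detected: Optional[str]) -> str:
    """Pick the language for the agent's next turn.

    `detected` is the language heard in the caller's latest turn,
    or None if detection produced nothing usable.
    """
    if config.auto_detect and detected is not None:
        return detected                 # match the caller turn-by-turn
    return config.configured_language   # fixed, predictable behaviour


# A caller who switches from English to Spanish mid-call:
agent = AgentVoiceConfig(configured_language="en", auto_detect=True)
turns = [reply_language(agent, lang) for lang in ["en", "es", None]]
```

With auto-detect off, every turn would use `configured_language` regardless of what the caller speaks.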

Supported languages (23)

Every language below works on the chat widget and on phone calls, with the same model and the same voice. ISO 639-1 codes are shown for the few cases where they matter (API params, agent config).

Language      Code
Arabic        ar
Chinese       zh
Danish        da
Dutch         nl
English       en
Finnish       fi
French        fr
German        de
Greek         el
Hebrew        he
Hindi         hi
Italian       it
Japanese      ja
Korean        ko
Malay         ms
Norwegian     no
Polish        pl
Portuguese    pt
Russian       ru
Spanish       es
Swahili       sw
Swedish       sv
Turkish       tr
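
If you pass language codes around in your own tooling, it can help to validate them against the list above before they reach an agent config. The following is a sketch under assumptions: `SUPPORTED_LANGUAGES` and `normalize_language_code` are names invented for this example, and reducing a region-tagged value such as `es-MX` to its two-letter code is a convenience of the sketch, not documented behaviour.

```python
# The 23 supported languages, keyed by ISO 639-1 code.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "zh": "Chinese", "da": "Danish", "nl": "Dutch",
    "en": "English", "fi": "Finnish", "fr": "French", "de": "German",
    "el": "Greek", "he": "Hebrew", "hi": "Hindi", "it": "Italian",
    "ja": "Japanese", "ko": "Korean", "ms": "Malay", "no": "Norwegian",
    "pl": "Polish", "pt": "Portuguese", "ru": "Russian", "es": "Spanish",
    "sw": "Swahili", "sv": "Swedish", "tr": "Turkish",
}


def normalize_language_code(code: str) -> str:
    """Return the lowercase two-letter code, or raise ValueError if unsupported.

    Region suffixes ("es-MX", "pt_BR") are dropped; that is an assumption
    made for this sketch.
    """
    primary = code.strip().lower().replace("_", "-").split("-")[0]
    if primary not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return primary
```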

Numbers, dates, currency

The ReasonVoice TTS layer normalises numbers, times, dates, and currency before speaking, so “$1,250.00” comes out as “one thousand two hundred fifty dollars” and “3:30 PM” comes out as “three thirty PM”. The same normalisation applies in every supported language. You don't need to spell anything out in your prompts.
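
To make the idea concrete, here is a rough, English-only sketch of what this kind of normalisation involves. It is not the ReasonVoice implementation: every name below is invented for illustration, it only handles US dollar amounts under one million and 12-hour times, and it exists to show the work you are spared, not something you need to run.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out 0 <= n < 1,000,000 in US-style English (no "and")."""
    if not 0 <= n < 1_000_000:
        raise ValueError("out of range for this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


# "$1,250.00", "$40", "$3.05": digits with optional thousands separators and cents.
CURRENCY_RE = re.compile(r"\$(\d+(?:,\d{3})*)(?:\.(\d{2}))?")
# "3:30 PM", "11:05am": 12-hour clock with an AM/PM marker.
TIME_RE = re.compile(r"\b(\d{1,2}):(\d{2})\s*([AaPp][Mm])\b")


def speak_currency(match: re.Match) -> str:
    dollars = int(match.group(1).replace(",", ""))
    cents = int(match.group(2) or 0)
    words = number_to_words(dollars) + (" dollar" if dollars == 1 else " dollars")
    if cents:
        words += " and " + number_to_words(cents) + (" cent" if cents == 1 else " cents")
    return words


def speak_time(match: re.Match) -> str:
    hour, minute = int(match.group(1)), int(match.group(2))
    words = number_to_words(hour)
    if 0 < minute < 10:
        words += " oh " + number_to_words(minute)   # 11:05 -> "eleven oh five"
    elif minute >= 10:
        words += " " + number_to_words(minute)      # 3:30 -> "three thirty"
    return words + " " + match.group(3).upper()


def normalize(text: str) -> str:
    """Rewrite currency amounts and clock times into speakable words."""
    text = CURRENCY_RE.sub(speak_currency, text)
    return TIME_RE.sub(speak_time, text)
```

Calling `normalize("Your balance is $1,250.00, due at 3:30 PM.")` reproduces the two examples above inside a full sentence.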

Limits

  • One cloned voice per account on Starter, up to 5 on Pro.
  • Voices are tied to your account, not shared across customers. Your competitor can't pick your voice from a public list.
  • Sample uploads accept WAV, MP3, FLAC, OGG. Max 25 MB per file.
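
These limits are easy to check on your side before an upload. The sketch below is illustrative only: the function and constant names are hypothetical, the plan keys are lowercase versions of the plan names above, and reading "25 MB" as 25 × 1,048,576 bytes is an assumption, since the exact byte count isn't specified.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}
MAX_BYTES = 25 * 1024 * 1024              # assumption: 1 MB = 1,048,576 bytes
VOICE_LIMITS = {"starter": 1, "pro": 5}   # cloned voices per account


def check_sample(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems with a sample file; empty means it looks fine."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format {suffix or '(none)'}; use WAV, MP3, FLAC, or OGG")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 25 MB")
    return problems


def can_add_voice(plan: str, existing_voices: int) -> bool:
    """True if the account still has room for another cloned voice."""
    return existing_voices < VOICE_LIMITS[plan.lower()]
```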

Try it

The voice on the demo line at 1-877-9-REASON is a clone of our founder's voice, recorded the same way you would.