myshell-tts.vercel.app
76.76.21.98
Public Scan
URL:
https://myshell-tts.vercel.app/
Submission: On April 07 via api from US — Scanned from DE
Form analysis
0 forms found in the DOM
Text Content
MyShell TTS

OPENVOICE: VERSATILE INSTANT VOICE CLONING

We introduce OpenVoice, a versatile instant voice cloning approach that requires only a short audio clip from the reference speaker to replicate their voice and generate speech in multiple languages. In addition to replicating the tone color of the reference speaker, OpenVoice enables granular control over voice styles, including emotion, accent, rhythm, pauses, and intonation. OpenVoice also achieves zero-shot cross-lingual voice cloning for languages not included in the massive-speaker training set. It is computationally efficient as well, costing tens of times less than commercially available APIs that deliver inferior performance. The technical report is available at https://arxiv.org/pdf/2312.01479.pdf and the source code at https://github.com/myshell-ai/OpenVoice

ACCURATE TONE COLOR CLONING

OpenVoice can accurately clone the reference tone color and generate speech in multiple languages and accents.

[Audio samples: two sets, each with one Reference clip and two Generated clips]

FLEXIBLE VOICE STYLE CONTROL

OpenVoice enables granular control over voice styles, such as emotion and accent, as well as other style parameters including rhythm, pauses, and intonation. Here we demonstrate control over the emotion and accent of the generated voice.

[Audio samples: Reference; Generated - Sad; Generated - Happy; Generated - Indian Accent; Generated - British Accent; Generated - Australian Accent]

ZERO-SHOT CROSS-LINGUAL VOICE CLONING

The reference voice and the generated voice can be in any language outside the massive-speaker multilingual dataset. We use "U" to denote the unseen languages in the following examples.

[Audio samples: Reference - English; Generated - Mixed Lingual (U); Generated - Japanese; Generated - Spanish (U); Generated - German (U); Generated - Russian (U)]

COMPARISON WITH STATE-OF-THE-ART

[Audio samples: two sets, each with Reference; Generated - XTTS-v2; Generated - Valle-X; Generated - OpenVoice]