dmckinno
a year ago
After spending a bit more time with these models, I wrote up my findings in more detail if anyone is interested in learning more.
https://www.ddmckinnon.com/2024/10/03/dans-weekly-ai-speech-...
a year ago
This is a bit different. These audio clips use the default voice of each of these systems. I was asking about zero-shot voice cloning, i.e. transferring a recorded voice and synthesizing speech in that voice.
I tried zero-shot voice cloning in all of the top OSS models in the Arena and performance was bad.
a year ago
Most of those models DO do zero-shot cloning. The best is VoiceCraft. It's nearly 11Labs quality. Check it out.
a year ago
Thanks for the flag. VoiceCraft is indeed the best ZS OSS voice cloning tool, despite appearing at the bottom of the TTS Arena. They have a really easy-to-use Gradio demo on their repo if anyone else wants to give it a try.
There is still a big gap between VoiceCraft and 11Labs or Character.ai, and the VoiceCraft voices would not be confused for the real speaker, but this is much closer.