The Future of Speech Synthesis: Harnessing the Power of Generative AI

5 October 2023

One remarkable innovation is the use of generative AI tools to convert text into speech. An example of a company that uses this technique is Eleven Labs. It uses an algorithm that analyses the context of a text, looking for emotional cues so the system can pick up on the writer's sentiment, which leads to a more humanlike sound. It has a voice library with unique voice profiles, and new profiles can also be created with its Voice Design Tool. Voice profiles can also be generated from a short voice sample, which means you can generate your own voice too.

I tried this for myself with a short sound sample. After that, I could type in any text and have my own voice say it. To me it sounded a bit strange, and I felt I could hear a slightly robotic quality. Then again, it is always weird for me to hear my own voice in a recording or video. I tried sending a voice message to my mother; she did not mention anything and just responded as she normally would.

The fact that this technology is already at such high quality, and is only going to improve, creates a lot of opportunities. Practical applications of Eleven Labs include audiobook production, voice assistants, and podcast production.

Challenges and Ethical Considerations

While generative AI for text-to-speech offers multiple benefits, it also raises important ethical concerns. Issues related to privacy, misuse of the technology, and the potential for deepfake audio need to be discussed carefully. Immediately after the Dutch language was introduced on Eleven Labs, the voices of people like the king and the prime minister were cloned. Used in the wrong way, Eleven Labs could be harmful, especially when not everyone is up to speed on these possibilities and some people will simply believe everything these generated voices say.
