AI generated voices: creepy and dangerous or impressive and practical? (using PlayHT)

30 September 2023

When I ask people around me what first comes to mind when they think of generative AI, most answer chatbots (e.g. ChatGPT) and occasionally image generation (e.g. DALL-E). These types of generative AI have very practical implications for both individuals and businesses. For instance, platforms like ChatGPT and Stable Diffusion could enhance human creativity (Eapen et al., 2023). However, generative AI goes much further than those examples, and one such application is AI generated voice. I used the PlayHT platform, which is free, to test my own AI generated voice.

I came across PlayHT through a YouTube video about AI generated voices, and the results were actually quite good, so I had to test it with my own voice. I created a free account on the PlayHT platform, after which I could upload a recording of at least 30 seconds of my own voice. I simply read a random Wikipedia page aloud and recorded it on my phone. After uploading, I only had to wait 30 seconds before my voice was ‘cloned’ (the term PlayHT uses for creating my own AI generated voice). I then entered a couple of sentences, and the results were indeed quite good! Although you can definitely hear that it’s not my real voice, there are some similarities. That a voice recording of roughly 30 seconds produced those results was very impressive, but also a little bit creepy. For you as a reader, I copy-pasted the whole text into PlayHT to have it read in my own cloned voice:

This made me think: does AI generated voice have real practical applications, or can it be dangerous, and do we have to be careful with this type of technology? One very useful application is voice-overs. Narration plays a significant role in the media and entertainment sector, where the voice is one of the most important elements, for example in advertisements. Generative AI voice could replace voice actors and could also increase the number of voice-overs produced.

On the other hand, AI generated voices could have negative effects when used maliciously. A while ago, deepfakes were all over the news. Sometimes we can’t tell the difference between an original video and a deepfake. Deepfakes sometimes rely on real voice actors, but as generative AI improves and spreads, deepfakes could be used even more maliciously. Fortunately, there is quite extensive research on how to detect deepfakes: according to Rana et al. (2022), deep learning techniques are effective at detecting them. It is questionable, however, whether these kinds of detection systems can keep up with current generative AI developments.

I think that AI generated voice is still fairly unknown, so there are probably many more practical applications that are not being used today.

What’s your view on AI generated voices? Let me know!

Eapen, T. T., Finkenstadt, D. J., Folk, J., & Venkataswamy, L. (2023). How generative AI can augment human creativity. *Harvard Business Review*, *101*(4), 56–64.

Rana, M. S., Nobi, M. N., Murali, B., & Sung, A. H. (2022). Deepfake detection: A systematic literature review. *IEEE Access*, *10*. https://doi.org/10.1109/ACCESS.2022.3154404

3 thoughts on “AI generated voices: creepy and dangerous or impressive and practical? (using PlayHT)”

  1. This is very interesting! Doubling down on your idea about narrations, I wonder if in the near future you could simply use AI to create entire YouTube videos narrated using your voice without anyone being able to tell that it’s not actually you. Furthermore, in terms of voice actors, what are your thoughts in terms of the interpretation itself? If for example we want to have Leonardo DiCaprio as the voice of the Joker in a new animated Batman movie, while technology can certainly replicate his voice, that still leaves the aspect of the creative contribution that the actor would bring and how he would interpret the Joker’s lines. I can imagine that any approximation of this by AI can only be based on the work of previous actors and hence be more derivative than original.

    1. Thanks for your comment! As for the voice actor example: the creative contribution you describe would make it very challenging for AI generated voice to compete with such skills. For this reason, I think generative AI won’t replace voice actors in movies any time soon. On the other hand, I do think that generative AI could be beneficial for YouTube videos, where the level of expertise and creativity is mostly lower than in the billion-dollar movie industry.

  2. This subject is very interesting to read about! I haven’t had any experience with AI generated voices. However, while the possibilities are endless for the entertainment industry, I think it might also be a little creepy (as you stated). I instantly think about how the AI tool could imitate voices and create “disturbing messages” from influential people, for example a politician or a singer. As you can imagine, this could cause very dangerous situations.
    However, as I stated before, the entertainment industry can benefit a lot from it, as the comment above explains. So I think whether the tool is creepy or impressive depends on who is using it. In the wrong hands, it can be detrimental.
