How does AI use its voice?

10 October 2025

During my time on exchange last year, I explored the vocal abilities of ChatGPT in many ways, from making songs to language learning.

First, the songs: I was surprised by how well the AI produced and sang a whole song, and not in English, but in perfect Slovak. We made two songs using lyrics we wrote, one rap song and one pop song. Both turned out very well, with perfect pronunciation. The AI created a proper beat and used different types of vocals during the chorus, along with several instruments. When we showed the songs to our friends, they couldn’t tell them apart from normal songs.

Surprised by the success, I tried to use it for learning. During my exchange in Shanghai, I decided to take some beginner Chinese classes, but the course turned out to be very chaotic and unstructured. You would expect to start with basics like numbers or how to order food, but the syllabus was barely useful. Therefore, I turned to AI for help. I asked it to teach me basic Chinese by having conversations with me and generating lists of basic words that I could practice. I would tell the AI to say a sentence in English, which I then had to repeat in Chinese, or I would load a list of words and have it practice them with me.

Nevertheless, it wasn’t easy. Sometimes it would repeat the same five words from a list of thirty, even though I asked it to vary them more. I even asked it to tell me how many times it had tested me on each word, and it reported the counts correctly. But when I asked it to focus on the words it hadn’t used yet, it would always make mistakes and ignore the instructions. Hopefully, by now, its capabilities have improved.
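The behaviour I was hoping for is easy to describe in a few lines of Python: keep a count of how often each word has been tested, and always pick the next word from the least-tested ones. This is only a minimal sketch of that idea, not how ChatGPT works internally. The function names (`next_word`, `run_drill`) and the short English word list are placeholders I made up for illustration.

```python
import random
from collections import Counter


def next_word(words, times_tested, rng=random):
    """Pick a word at random from those tested the fewest times so far."""
    fewest = min(times_tested[w] for w in words)
    candidates = [w for w in words if times_tested[w] == fewest]
    return rng.choice(candidates)


def run_drill(words, rounds, rng=random):
    """Run a drill and return the order of words plus the per-word counts."""
    times_tested = Counter()  # missing words count as 0
    order = []
    for _ in range(rounds):
        word = next_word(words, times_tested, rng)
        times_tested[word] += 1
        order.append(word)
    return order, times_tested


# Hypothetical word list; in practice this would be the list of thirty words.
words = ["one", "two", "three", "rice", "tea", "water"]
order, counts = run_drill(words, rounds=12, rng=random.Random(0))
```

Because the next word always comes from the least-tested group, no word can be repeated until every other word has had its turn, which is exactly what I could not get the AI to do reliably.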

Maybe I should get back to it, so I can finally learn some Dutch.

2 thoughts on “How does AI use its voice?”

  1. I think your blog is interesting because it talks about the good capabilities of generative AI and its limitations. I think this is an important aspect to consider (I did my blog on a similar topic). I believe that generative AI is still in its early stages. While it has improved, it still doesn’t understand certain nuances and instructions. There are many times that I send ChatGPT or Dall-E simple prompts, and I don’t get the answers I was looking for. I believe that since it’s a learning-based model, it will improve over time and iterations. But that might still be a bit far off. Nevertheless, it’s still amazing to see how it can create music in different languages.

  2. I love how it could generate songs in Slovakian with proper pronunciation. This also shows the potential of AI for learning new languages.
