Can AI clone the voice?


Nowadays, many startups and individual entrepreneurs are trying to improve society by bringing AI systems into different spheres of our lives. We already have AI-based cars, loudspeakers, mixers, etc. But can you imagine coming home one day and hearing «Alexa» speak with your mother’s voice, pitch and manner of speaking? It sounds quite creepy and unusual. Some will even say that this invention is completely useless and has only disadvantages. I wouldn’t argue, because everything in our world has its good and bad sides. In the next few paragraphs I will try to present both.

First, let’s start with the owners of this unusual project and how it works. Not long ago, «Baidu», a company often described as the Chinese Google, published full documentation of its new AI-based tool called Deep Voice, which can simulate a person’s voice after listening to a short fragment of their speech. According to Baidu’s official documents, the tool is built on a neural network that analyzes a spectrogram of a human voice and then generates similar audio. With the help of this network, the AI also learned by itself how to pronounce some difficult words and names.
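To give a feel for what a spectrogram actually is, here is a minimal sketch in Python. It is not Baidu’s code and says nothing about how Deep Voice works inside. It only shows the general idea of cutting audio into short frames and measuring how strong each frequency is in every frame. The function names, the frame size and the test tone are my own assumptions, chosen just for illustration.

```python
import cmath
import math


def hann_window(size):
    """Return a Hann window, which tapers the edges of each frame."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1)) for n in range(size)]


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Split `samples` into overlapping frames and take the DFT magnitude of each.

    Returns a list of frames; each frame is a list of frame_size // 2 + 1
    magnitudes, one per frequency bin from 0 Hz up to half the sample rate.
    """
    window = hann_window(frame_size)
    n_bins = frame_size // 2 + 1
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(n_bins):
            # Naive DFT: compare the frame with a complex sinusoid at bin k.
            total = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


# Toy stand-in for speech (an assumption): a 1 kHz tone sampled at 8 kHz for 0.1 s.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(800)]
spec = magnitude_spectrogram(tone, frame_size=FRAME_SIZE, hop=32)

# Find the loudest frequency bin in the first frame and convert it to hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames, loudest frequency:", peak_hz, "Hz")
```

Real speech would give a much richer picture than a single tone, and a real system would likely use a fast Fourier transform from a library instead of this slow hand-written loop. Still, the result has the same shape: time along one axis, frequency along the other.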


When I was looking for sources on this topic, I thought that this invention was strange and useless and wouldn’t find its place among other huge technology projects. Then I saw some examples of Deep Voice in use and was really impressed, because it can be applied in many spheres of our lives. The first and main mission of this tool is to substitute for a human voice. For example, a person who can’t speak for some reason can buy this kind of technology and «regain speech» using someone’s natural voice instead of the «Google Man». It can also serve as the voice of AI-based gadgets. Smart homes, smart cars and other AI-based inventions will be able to speak with a natural voice, not the robotic sounds that often make us feel uncomfortable. You can even have some fun with Deep Voice, because it lets you listen to your own speech in different accents and manners of speaking.


As I mentioned in the first paragraph, everything has its disadvantages, and Deep Voice is no exception. Hacking is becoming more frequent, and a tool that can copy anyone’s voice without difficulty can be used by cybercriminals. If they learn how to use this invention properly, they will be able to carry out various attacks easily. For example, hackers could break into databases or systems protected by voice authentication, or fool people by using the voices of their relatives and friends. Other individuals may record offensive things in the voices of famous actors or singers and upload them to the internet just for a laugh, without thinking that this can seriously damage a celebrity’s reputation or even ruin it. So, in my opinion, this program must have restrictions or other safeguards that won’t allow criminals to misuse it.

In the end, I would like to say that an AI voice-copying tool is a really interesting and potentially useful invention, but it shouldn’t be launched yet because it needs serious improvements in security.

Examples of using Deep Voice:




2 thoughts on “Can AI clone the voice?”

  1. Tan Peng Peng says:

    This seems to be interesting indeed! One step closer towards becoming a perfect reconstruction of a human!

    Jokes aside, it seems regulation regarding AI will have to catch up quicker, since AI is advancing by leaps and bounds!

  2. Chepinska Sofiia says:

    That’s an interesting topic! But sure, the possibility of someone using people’s voices against them sounds creepy. Is there any tool that can detect whether a voice was modified by AI?
