From text to video: how does AI create a video based on text prompts?

While listening to the “All-In” podcast[1], in which four billionaires discuss the latest world economic and tech news, I heard about Meta’s newest invention. It’s called Make-A-Video. The program uses artificial intelligence to create short videos (up to 15 seconds) based on text prompts. We have already seen programs such as DALL-E, which generates an image from text, but nothing this advanced. Make-A-Video can also take an image or an existing video as input and generate a similar video. It learns what the world looks like from previously made images paired with descriptions; moreover, according to the Meta AI blog[2], “It uses unlabeled videos to learn how the world moves”. Here are some examples of Make-A-Video’s works of art.

Although Make-A-Video is a revolutionary invention, it also raises some ethical questions. Namely, an Ars Technica report[3], drawing on Simon Willison’s research[4], raises concerns that commercial data taken without permission was used for a commercial AI product. It may also raise questions about the future of human artists such as painters or filmmakers because, as we can see, “art” might now be created just by typing a few words into a prompt.

At first, these programs (DALL-E included) may look like fun and useful tools for creating content and helping creatives. However, they can also pose serious threats. Images or films generated by these programs can contribute to harm, for instance by reinforcing negative racial or gender stereotypes and behaviors. Another threat is the creation of disinformation or the use of this content for harassment. I would like to quote the words of Wael Abd-Almageed, a professor at the University of Southern California: “Historically, people trust what they see. Once the line between truth and fake is eroded, everything will become fake. We will not be able to believe anything”. As we can see, such fake content, also known as deepfakes (“a broad term that covers any AI-synthesized media”[5]), could make people stop believing in anything, which leads to confusion, misunderstanding and sometimes even fear, since we cannot find any reliable source of information. Here is an example: according to a Pew Research study[6], about two-thirds of Americans surveyed said altered videos and images had become a major problem for understanding the basic facts of current events. Moreover, more than a third said “made-up news” had led them to reduce the amount of news they got overall.

Of course, the scenario above is only one possibility. Nevertheless, this technology is developing quickly, and tech companies cannot create norms around the use of these programs quickly enough to prevent negative outcomes.

Let me know what you think about Make-A-Video and whether you see any other threats this technology could create.

Sources:

CDO Trends https://www.cdotrends.com/story/17228/meta-unveils-text-video-ai-generator?refresh=auto

Euronews.next https://www.euronews.com/next/2022/10/04/meta-unveils-ai-tool-that-creates-gif-like-videos-from-text-prompts

The Washington Post https://www.washingtonpost.com/technology/interactive/2022/artificial-intelligence-images-dall-e/


[1] https://open.spotify.com/show/2IqXAVFR4e0Bmyjsdc8QzF

[2] https://ai.facebook.com/blog/generative-ai-text-to-video/

[3] https://arstechnica.com/information-technology/2022/09/write-text-get-video-meta-announces-ai-video-generator/

[4] https://twitter.com/simonw/status/1575555436085846016

[5] https://www.washingtonpost.com/technology/interactive/2022/artificial-intelligence-images-dall-e/

[6] https://www.pewresearch.org/journalism/2019/06/05/many-americans-say-made-up-news-is-a-critical-problem-that-needs-to-be-fixed/
