Microsoft unveils Vasa-1, the artificial intelligence that creates videos starting from an image (or a painting): the disturbing rap of the Mona Lisa


Microsoft Research laboratories have revealed an Artificial Intelligence that creates videos with realistic audio. But it is not accessible for safety. Deep Fake risk increases

When OpenAI revealed Sora’s abilities to the world last February, it once again amazed (but also worried) observers with the impressive capabilities its models can achieve. If it was the film industry that worried about Sora’s potential then, it is Vasa-1 that generates amazement and fear now. It was presented by the Asia division of Microsoft Research. There are several differences from the service of the company headed by Sam Altman: while Sora can generate a video from scratch based on a textual instruction, i.e. a prompt, Vasa-1 produces a video from an image, such as a photograph. Or a painting.

It raises a smile, but there is something disturbing about an example video that quickly went around the web and that «animates» Leonardo da Vinci’s Mona Lisa. We see her come to life as she performs actress Anne Hathaway’s viral 2011 rap, «Paparazzi», with extreme and, yes, decidedly realistic expressions.

The model can also “clone” a person’s voice, using as little as a few seconds of a recording of the original voice as a source. On the official website, Microsoft explains in detail what the Vasa framework can do. The examples shared were created by first generating the faces from scratch with AI tools such as DALL-E 3, so none of the faces shown belong to a real person. Voice models were then used to create the voices. The generated videos have a resolution of 512 x 512 pixels at 45 fps (frames per second); in online streaming mode, however, playback runs at 40 fps.

The tool is capable of producing videos with convincing lip-audio sync. But it can “also capture a broad spectrum of facial nuances and natural head movements that contribute to the perception of authenticity and liveliness”. While the Redmond company has cited the possibility of using Vasa-1 technology for virtual avatars driven by Artificial Intelligence, on the other hand there is a real danger of an increase in the already numerous cases of deep fakes. The US giant sought to reassure the public that the technology will not be made available, at least until responsible use can be guaranteed. Until then, Microsoft does not intend to share details on how to access the capabilities of the new model. «Videos generated with this method still contain identifiable artifacts, and numerical analysis shows that there is still a gap to reach the authenticity of real videos», the researchers explain on the official Vasa-1 website.

Deep fakes in Bollywood

But if it is true that the company had the scruple to withhold such a powerful technology, the same cannot be said of other platforms. Some Bollywood actors can confirm this, having seen their faces become victims of deepfakes for electoral purposes. The images of two of the most famous Indian actors were in fact exploited to create deceptive videos, which quickly went viral. The deep fakes criticized Prime Minister Narendra Modi, and the criminals also used the actors’ influence to urge the population to vote for the opposition. Two videos lasting just a few seconds were enough to go viral and create havoc on social media platforms. According to Reuters, they have been viewed more than half a million times since last week.

Elections in India are quite complicated: nearly 1 billion people have the right to vote, and voting began last Friday and will continue until June. With a large, impressionable audience and plenty of time to circulate, the videos were shared, viewed and reshared on Facebook and X, including by some members of Congress, leading to widespread misinformation. Despite the actors’ denial of the videos’ authenticity and the efforts of social platforms to remove the false content, some people not only shared and kept the videos but went even further. In southern India, Congress leader Vijay Vasanth asked his team to create a (fake) video of his father rallying for his son. The catch is that the father is long deceased, but his popularity remains higher than that of his heir.

In Oscar winner Christopher Nolan’s film Inception, the protagonist played by Leonardo DiCaprio, Cobb, used a metal spinning top to distinguish reality from the fiction of the dream world. The totem, as it is called in the film, let DiCaprio understand, if it spun in perpetual motion, that he was not in the real world. What will be the population’s totem if deep fakes become increasingly credible and candidates, for a handful more votes, are willing to share fake videos of celebrities and of the deceased?

April 23, 2024 (updated April 23, 2024 | 4:30 pm)

© ALL RIGHTS RESERVED
