OpenAI introduced Sora, its premier text-to-video generator, on Thursday with beautiful, shockingly realistic videos showcasing the AI model’s capabilities. Sora is now available to a small number of researchers and creators who will test the model before a wider release, which could spell disaster for the film industry and worsen our collective deepfake problem.
“Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” OpenAI said in a blog post. “The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”
OpenAI hasn’t said when Sora will be made publicly available.
Sora is OpenAI’s first venture into AI video generation, complementing the company’s AI-powered text and image generators, ChatGPT and DALL-E. It is unique in that it is less of a creative tool and more of a “data-driven physics engine,” as noted by Nvidia senior researcher Dr. Jim Fan. Sora does not just generate an image; it determines the physics of an object in its environment and renders the video based on these calculations.
To generate videos with Sora, users can simply type in a few sentences as a prompt, much like AI image generators. You can choose a photorealistic or animated style, achieving shocking results in just a few minutes.
Sora is a diffusion model, which means it generates video by starting with a blurry video filled with static noise and gradually smoothing it out into the polished version you see below. The Midjourney and Stable Diffusion image and video generators are also diffusion models.
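For readers curious what “starting from static and smoothing it out” looks like in practice, here is a minimal toy sketch in Python. It is not Sora’s method and not a real diffusion model: the function name `toy_denoise` and the five-value “frame” are hypothetical, and the loop cheats by knowing the clean answer, whereas a real model uses a trained neural network to estimate what noise to remove at each step. It only illustrates the general shape of the process, which is many small steps from noise toward a clean result.

```python
import random

def toy_denoise(clean, steps=50, seed=0):
    """Toy noise-to-image loop: start from pure noise, remove a little each step.

    ASSUMPTION: `clean` stands in for what a trained network would predict.
    A real diffusion model never has access to the finished answer.
    """
    rng = random.Random(seed)
    # Start from pure Gaussian "static", one value per pixel.
    frame = [rng.gauss(0.0, 1.0) for _ in clean]
    errors = []
    for step in range(steps):
        # Share of the remaining gap to close now; equals 1.0 on the last step.
        blend = 1.0 / (steps - step)
        frame = [x + blend * (c - x) for x, c in zip(frame, clean)]
        # Track the worst per-pixel distance from the clean frame.
        errors.append(max(abs(x - c) for x, c in zip(frame, clean)))
    return frame, errors

if __name__ == "__main__":
    clean_frame = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical 5-pixel "image"
    result, errors = toy_denoise(clean_frame)
    print("error after first step:", errors[0])
    print("error after last step:", errors[-1])
```

Running it shows the error shrinking from the first step to the last, which is the “blurry to polished” progression described above in miniature.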
However, I must say that OpenAI’s Sora looks much better. The videos Sora creates are longer, more dynamic, and flow together more smoothly than competitors’ videos. Sora gives the impression that it is creating real videos, while competing models produce what looks like a sequence of still AI images. OpenAI has once again shaken up a field of artificial intelligence with a video generator that puts the competition to shame.
The videos produced by Sora are undeniably amazing. These videos would take a real film crew or team of animators hours of work to create. Sora will likely disrupt the film industry in the same way that ChatGPT and AI image generators shook the publishing and design worlds. It’s a technology that is both amazing and terrifying when it comes to the job security of video creators.
OpenAI says several kinks still need to be worked out, including a misunderstanding of cause and effect. Sora may generate a video of a person biting a cookie, but the cookie may later have no bite mark. OpenAI also says that the model lacks spatial awareness. It may confuse left and right and fail to understand how a person or object interacts with the scene.
Safety is also particularly important considering how artificial intelligence technology has been abused to create deepfakes in recent months. OpenAI says it will develop tools to help detect misleading content, as well as employ existing technologies that reject harmful text prompts. However, given the ways in which people have circumvented the safety features of current AI models, it is questionable how effective these efforts will be.
Sora is both impressive and terrifying, and it’s clear how this powerful AI video generator could disrupt the film industry and create harmful content. Imagine if the Taylor Swift deepfakes had been videos. Or what if the fake Joe Biden robocall to New Hampshire voters had been a photorealistic message from the Oval Office? Sora is not yet publicly available, but the impact of such powerful technology precedes its launch.