Foreign media recently tested the video generation capabilities of Sora, OpenAI's text-to-video model, with some interesting results. A demonstration video generated by Sora raised questions about the model's ability to understand the physical world.
In the 10-second clip, a parrot flies through a jungle alongside monkeys. Closer inspection reveals discrepancies: twisted wings, multiple parrots where the prompt specified one, and a monkey that appears to have a parrot's tail. OpenAI acknowledged these imperfections while highlighting the significant leap in video generation capability that Sora represents.
However, experts such as Meta's Yann LeCun and AI scientist Gary Marcus argue that Sora still falls short of understanding the physical world: generating realistic-looking video is not the same as grasping cause and effect. Despite the advances, Sora remains far from mature.
Sora also requires more time and computation per video than OpenAI's earlier text-to-image models, and the company has not disclosed details about a public release. Access is currently limited to select individuals, including red teamers and creative professionals, who provide feedback for further improvement.
Meanwhile, OpenAI spokesperson Natalie Summers attributed the delay in Sora's public release to concerns about potential misuse of hyper-realistic deepfakes in election-related contexts. With elections approaching in many regions in 2024, the company aims to mitigate the security risks posed by such a powerful video generation tool.
As Sora continues to undergo testing and refinement, it remains to be seen when it will be available to the public, offering a glimpse into the future of AI-driven content creation.