OpenAI Sora: the new AI model creates credible videos from a text prompt

OpenAI has unveiled Sora, its new text-to-video AI model. The technology turns a text prompt written in natural language into a video up to one minute long, with a high degree of realism in both visuals and content. Text-to-video is not entirely new, but the examples published by OpenAI are markedly more realistic than those produced by the other text-to-video tools available to date.

Prompt: A cat waking up its sleeping owner demanding breakfast. The owner tries to ignore the cat, but the cat tries new tactics and finally the owner pulls out a secret stash of treats from under the pillow to hold the cat off a little longer.

According to OpenAI's official site, Sora can generate “complex scenes with multiple characters, specific types of movement, and fine subject and background details”. This means that users can enter detailed text prompts and the system will convert them into video clips that faithfully reflect what is described.

Sora promises realistic videos starting from text prompts and still images

Prompt: Historical footage of California during the gold rush

For example, if you ask Sora to generate a video set in gold rush-era California, it produces a realistic aerial scene of a landscape typical of that period, complete with coherent characters, actions and details. Similarly, if you ask for imaginative situations or very specific cinematic styles, Sora generates a credible video based on the request.

Prompt: The camera directly faces colorful buildings in burano italy. An adorable dalmation looks through a window on a building on the ground floor. Many people are walking and cycling along the canal streets in front of the buildings.

Prompt: An extreme close-up of an gray-haired man with a beard in his 60s, he is deep in thought pondering the history of the universe as he sits at a cafe in Paris, his eyes focus on people offscreen as they walk as he sits mostly motionless, he is dressed in a wool coat suit coat with a button-down shirt , he wears a brown beret and glasses and has a very professorial appearance, and the end he offers a subtle closed-mouth smile as if he found the answer to the mystery of life, the lighting is very cinematic with the golden light and the Parisian streets and city in the background, depth of field, cinematic 35mm film.

Prompt: A white and orange tabby cat is seen happily darting through a dense garden, as if chasing something. Its eyes are wide and happy as it jogs forward, scanning the branches, flowers, and leaves as it walks. The path is narrow as it makes its way between all the plants. the scene is captured from a ground-level angle, following the cat closely, giving a low and intimate perspective. The image is cinematic with warm tones and a grainy texture. The scattered daylight between the leaves and plants above creates a warm contrast, accentuating the cats orange fur. The shot is clear and sharp, with a shallow depth of field.

The system is based on sophisticated neural networks that allow it to understand the laws of physics and how objects exist and interact in the real world (with some inaccuracies, as the company itself admits). It is therefore able to position subjects in a scene and make them move in a natural and convincing way. Sora can also render objects and characters precisely, including their emotions and physical features. Beyond text prompts, Sora can generate video clips from still images or complete existing clips, for example by filling in missing frames or extending the length of the footage.

Prompt: Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.

Prompt: A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.

Sora is currently available only to a small group of “red teamers”, that is, researchers who assess risks and potential negative implications, and to a number of artists who are providing feedback. It is therefore not accessible to the general public, although OpenAI has not ruled out a wider release in the future. The launch of Sora follows that of DALL-E 3, OpenAI's proprietary technology for generating images from text. Both tools promise to change the way digital content is created and consumed, but they also raise strong concerns about harmful uses and the ethical challenges posed by increasingly credible and sophisticated synthetic material. This is why OpenAI is proceeding with caution and limiting access to a few selected experts.
