To demonstrate the capabilities of Gemini, its most “powerful” AI, Google has produced a series of videos with practical demonstrations.
In one of these demos, the new artificial intelligence interacts with a human being in an amazing way. Gemini “observes” the movements of a man’s hands filmed by a video camera. And it comments brilliantly – with a virtual voice and with text – on everything that happens: it figures out the trick behind a “magic trick”, corrects the order of the planets starting from the Sun, predicts which toy car will go faster downhill and recognizes which film scene a person is miming by rotating their arms while leaning backwards (for the record: The Matrix).
How Gemini works, Google’s new AI that interacts with humans
In the video Gemini does all this in real time, with a speed and accuracy that genuinely leaves you speechless.
It was equally shocking to discover – only a few hours after the official launch of the new AI – that the video, which has since gone viral, is fake. Or rather: Gemini is probably able to interpret human actions and give answers identical to those in the six-minute demo created by Google.
But the problem is that it can’t (yet) do it exactly that way.
This was already clear from the description accompanying the demo on YouTube, which states that “latency [that is, the time between the man’s actions and Gemini’s responses, Ed.] has been reduced and Gemini’s responses have been shortened for brevity.”
In short, Gemini is not capable of responding in real time and with the speed that makes the video created by Google so special.
A Google spokesperson told Bloomberg that the Gemini demo was built from conversations between humans and the AI based on “still frames extracted from the video and text prompts”.
It seems, in short, that Google engineers first shot the six-minute video and then extracted a series of frames from it. The selected images, corresponding to the different actions carried out by the man in the clip, were submitted to Gemini together with a prompt, i.e. a text command that asks the AI to generate certain content: it can be text, but also audio or an image.
Finally, Google “gave voice” to Gemini’s answers, inserted them into the demo and sped everything up. In doing so, it created the impression that the interaction between AI and human being happens in an extremely natural way and, above all, in real time.
Google’s demo, in fact, is a promise. An all too spectacular way of announcing the future that awaits us. A future in which probably no one will be alone anymore. And in which people may well end up forming a deep bond with an artificial intelligence. As happens to the protagonist of Her, Spike Jonze’s 2013 film.
Many would like to believe in those six “movie-like” minutes. All those, for example, who enthusiastically welcomed Humane’s AI Pin, the small device that attaches to clothes – like a pin – and relies on a generative AI that “observes” the world through an integrated video camera. An AI that can comment on what it “sees” with a virtual voice.
Now try to imagine a future of this type, in which similar devices accompany people everywhere and an AI like the one shown in Google’s demo offers information – but also witty jokes – in real time.
Oriol Vinyals, head of research and deep learning at Google DeepMind, the team that develops Google’s most advanced AI, wrote on social media that “all the user prompts and outputs in the video are real, shortened for brevity” and that “the video illustrates what multimodal experiences built with Gemini could look like.”
“We made it to inspire developers,” Vinyals added. And indeed, on its developer blog, Google makes no secret of the (unedited) prompts and images used to obtain the Gemini answers included in the demo.
And then the question arises: Google, why did you do this?
Even if the six-minute clip was made in good faith and with the best of intentions – “to inspire developers”, no less – Big G’s strategy risks damaging its own technology.
If there’s one thing we’ve learned using generative AI, it’s that you should never blindly trust what it writes. Because this type of AI, which can express itself like a human being, sometimes suffers from “hallucinations”: some of its answers, apparently plausible and coherent, may actually contain incorrect or invented information.
Google has already paid, in the past, for an error made by one of its AIs. Last February, when CEO Sundar Pichai unveiled Bard, a single error in a generative AI demo cost the American company 100 billion dollars in market value.
But in Mountain View they don’t seem to have learned their lesson.
The Gemini demo is now circulating as a “fake” on social networks, undermining the credibility of the new AI. And the risk is that Google, in addition to losing money, will this time also lose ground in the AI race.