ChatGPT and similar tools answer all sorts of everyday questions and can even help out with quick-witted replies. So far, though, users have had to glance at a smartphone or computer from time to time, and that gives them away. Until now.
At the end of April, students from Stanford University demonstrated RizzGPT, a system that follows conversations via microphone, converts them into text and displays ChatGPT’s answers on glasses in the user’s field of vision, as a kind of real-time teleprompter for every situation. However, Bryan Hau-Ping Chiang and his teammates did not release the app at the time.
Brilliant Labs, the maker of the Monocle AR hardware that Chiang and his team used for their clever hack, is now following suit: its iOS app arGPT is available for the device. Brilliant Labs shows the application in action in a short YouTube clip.
Components of the AR lens
Monocle is a clip-on lens for eyeglasses that houses a microphone, a camera, FPGA accelerator chips and a micro-display that overlays information onto the user’s field of vision. The fully open-source developer kit has been available since February 2023 for $349. The startup’s investors include Brendan Iribe, co-founder of Oculus; Adam Cheyer, co-founder of Siri; and Eric Migicovsky, founder of Pebble.
The AR hardware called “Monocle” is a clip-on lens that attaches to glasses. (Image: Brilliant Labs)
In an interview with MIT Technology Review, Bobak Tavangar, CEO of Brilliant Labs, confirms that the arGPT app works in essentially the same way as RizzGPT: Monocle forwards the processed audio signal from its microphone to the smartphone, which passes it on to Whisper, OpenAI’s speech recognition model. Whisper’s transcription in turn serves as input for ChatGPT, whose output is then shown in the user’s field of vision on the AR display, which can simply be clipped onto glasses. According to Tavangar, this works with “a delay of one to two seconds”. The main difficulty was getting iOS to “process the audio signal reliably, even when the smartphone is in your pocket. That was not trivial.”
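The chain Tavangar describes (audio in, transcription, chatbot reply, display) can be pictured as a three-stage pipeline. The following Python sketch is purely illustrative and is not Brilliant Labs’ code: the class `PromptPipeline`, its method `handle_audio` and the stand-in functions are hypothetical names, and the stages are stubs where a real app would call a speech-to-text service, a chatbot and the Monocle display.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stage signatures; none of these names come from arGPT or RizzGPT.
Transcriber = Callable[[bytes], str]  # audio -> text (the role Whisper plays)
Responder = Callable[[str], str]      # text -> reply (the role ChatGPT plays)
Display = Callable[[str], None]       # reply -> AR display


@dataclass
class PromptPipeline:
    transcribe: Transcriber
    respond: Responder
    show: Display

    def handle_audio(self, audio_chunk: bytes) -> str:
        """Run one audio chunk through all three stages and return the reply."""
        transcript = self.transcribe(audio_chunk)
        if not transcript.strip():
            return ""  # nothing was said, so nothing is displayed
        reply = self.respond(transcript)
        self.show(reply)
        return reply


# Stand-in stages so the sketch runs without network access or hardware.
def fake_transcribe(audio_chunk: bytes) -> str:
    return audio_chunk.decode("utf-8")


def fake_respond(transcript: str) -> str:
    return f"Suggested reply to: {transcript}"


shown: List[str] = []  # collects everything "displayed" on the fake lens

pipeline = PromptPipeline(fake_transcribe, fake_respond, shown.append)
pipeline.handle_audio(b"What do you do for a living?")
```

Because each stage is a plain function passed in from outside, the stubs could be swapped for real services one at a time. The one-to-two-second delay Tavangar mentions would be the sum of the time spent in all three stages plus the radio hops between lens and phone.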
The camera built into Monocle is not used. Not yet, as Tavangar points out, because the company’s goal is “to let generative AI see and hear what we hear and see.” This could be used for “simple face recognition” or gesture control. OpenAI’s GPT-4 can also process images, but this function is not yet available via the API; that could change in early 2024. “We’re in the starting blocks and will get started as soon as we can,” he says.
An altered image of reality
In the meantime, Brilliant Labs wants to build a kind of electronic hallucination function into its app: the camera image would be sent to an image-generating AI such as Dall-E, which alters it according to the user’s instructions, and the altered image would then be overlaid on the user’s field of view.
Brilliant Labs is working on a technology dream that is more than 30 years old: As early as the mid-1990s, a group of young researchers at the MIT Media Lab, who self-deprecatingly called themselves cyborgs, experimented with wearable computers that gave them “context-dependent” access to information, such as the names of the people they were speaking to. (One of the pioneers of the movement, Thad Starner, says he has a very bad memory for faces and names.)
Wearable cameras, which may also be coupled with facial recognition, have provoked strong defensive reactions from the public in the past, however. The failure of Google Glass is a clear sign of this. Tavangar emphasizes that his company’s goal is to tie the AI models used to the user’s location. If the glasses are used in Europe, for example, they should only allow connections to models that “correspond to the locally applicable regulations”. If the European AI Act is adopted in the form approved by Parliament, real-time facial recognition would be banned in the EU.