
What awaits us in artificial intelligence in 2024

by admin

This time last year we did something daring. In an industry where nothing stands still, we tried to predict the future of artificial intelligence (AI). The burning question first: how did we do?


Our four big bets for 2023 were:

The next big thing in chatbots will be multimodality
Check: That’s exactly what happened. The most powerful large language models available, OpenAI’s GPT-4 and Google DeepMind’s Gemini, work with text, images and audio.

Now we’re taking the same step again. In doing so, we decided to ignore the obvious. We know that the large language models will continue to dominate. Regulators will become bolder. The problems of AI – from bias to copyright to ignorance – will dominate the agenda of researchers, regulators and the public, not just in 2024 but for years to come.

Instead, this time we’ve picked out a few more specific trends to watch out for in 2024. (Next year we’ll reveal again how we did).

In 2024, tech companies that have invested heavily in generative AI will be under pressure to prove they can make money from their products. To this end, AI giants Google and OpenAI are betting big on “going small”: both are developing user-friendly platforms that allow people to customize powerful language models and create their own mini chatbots tailored to their specific needs – without any programming knowledge required. Both have launched web-based tools that allow anyone to become a developer of generative AI applications.

In 2024, generative AI could also become useful for ordinary, non-technical people, and more and more of them will be tinkering with a million small AI models. The multimodality of modern AI models like GPT-4 and Gemini – they can process not only text but also images and even video – could unlock a whole range of new applications. For example, a real estate agent could upload text from previous listings, tune a powerful model to generate similar text with a click of the mouse, upload videos and photos of new listings, and simply ask the customized AI to generate a description of the property.
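In practice, the simplest way to get this kind of customization today is not full retraining but few-shot prompting: stuffing a handful of previous examples into the prompt so the model imitates their style. The sketch below is a minimal illustration of that idea; the listing texts and the `build_prompt` helper are invented for this example, and a real application would send the resulting prompt to a language model.

```python
# Minimal sketch of few-shot prompt assembly for a "custom" listing writer.
# The listing texts and helper name are hypothetical examples.

def build_prompt(previous_listings, new_property_notes):
    """Combine earlier listings and notes on a new property into one prompt."""
    parts = ["Write a property description in the same style as these examples:\n"]
    for i, listing in enumerate(previous_listings, start=1):
        parts.append(f"Example {i}:\n{listing}\n")
    parts.append(f"New property details:\n{new_property_notes}\n")
    parts.append("Description:")
    return "\n".join(parts)

previous = [
    "Sunny 2-bedroom flat with balcony, close to the park.",
    "Charming townhouse with a renovated kitchen and small garden.",
]
prompt = build_prompt(previous, "3-bedroom house, garage, quiet street")
```

The tools Google and OpenAI have launched essentially wrap this kind of example-driven customization behind a no-code interface.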

But of course the success of this plan depends on whether these models work reliably. Language models often make things up, and generative models are riddled with bias. They are also easy to hack, especially if they are allowed to browse the internet. Tech companies have solved none of these problems. Once the novelty wears off, they will have to offer their customers ways of dealing with them.


It’s amazing how quickly the fantastic becomes familiar. The first generative models that produce photorealistic images entered the mainstream in 2022 – and quickly became commonplace. Tools like OpenAI’s DALL-E, Stability AI’s Stable Diffusion, and Adobe’s Firefly flooded the internet with stunning images, from the Pope in Balenciaga to award-winning art. But it’s not all fun and games: for every pug waving pom-poms, there’s another piece of knock-off fantasy art or sexist stereotyping.

The new goal is text-to-video. Expect everything that was good, bad or ugly about text-to-image to return larger than life. A year ago, we got our first glimpse of what generative models can do when trained to stitch together multiple still images into clips lasting just a few seconds. The results were distorted and jerky. But the technology has improved quickly.

Runway, a startup that builds generative video models and co-developed Stable Diffusion, releases new versions of its tools every few months. Its latest model, Gen-2, still produces videos only a few seconds long, but the quality is now striking. The best clips aren’t far off from what Pixar might put out.

Runway has launched an annual AI film festival that showcases experimental films made using a range of AI tools. This year’s festival has a prize money of $60,000 and the ten best films will be shown in New York and Los Angeles.

It’s no surprise that major film studios have taken notice. Film giants like Paramount and Disney are now exploring generative AI across their production pipelines. The technology is being used to lip-sync actors’ performances to multiple foreign-language overdubs – sound recordings mixed in later – and it is reinventing what is possible in special effects. Last year, Indiana Jones and the Dial of Destiny featured a digitally de-aged deepfake of Harrison Ford. And that is just the beginning.

Deepfake technology is also on the rise off the big screen for marketing and training purposes. The British company Synthesia, for example, makes software that can turn an actor’s one-off performance into an endless stream of deepfake avatars that recite predetermined scripts at the push of a button. According to the company, its technology is now used by 44 percent of Fortune 100 companies.

The ability to achieve so much with so little effort raises serious questions for actors. Concerns about studios’ use and misuse of AI were at the heart of last year’s SAG-AFTRA strikes. SAG-AFTRA (Screen Actors Guild – American Federation of Television and Radio Artists) is the US union for actors, voice actors, dancers and other film and television professionals. The technology’s true impact on the film industry is only now becoming clear. “The craft of filmmaking is fundamentally changing,” says Souki Mehdaoui, an independent filmmaker and co-founder of Bell & Whistle, a consultancy specializing in creative technology.


Based on the experience of recent elections, AI-generated election disinformation and deepfakes are likely to be a major problem as record numbers of people go to the polls in 2024. We are already seeing politicians wielding these tools as weapons. In Argentina, two presidential candidates created AI-generated images and videos of their opponents to attack them. In Slovakia, deepfakes of a liberal, pro-European party leader threatening to raise beer prices and joking about child pornography spread like wildfire during the election. And in the US, Donald Trump cheered on a group that used AI to create memes with racist and sexist tropes.

While it is difficult to say to what extent these examples influenced the election outcome, their prevalence is a worrying trend. It’s becoming harder than ever to tell what’s real online. In an already heated and polarized political climate, this could have serious consequences.

Just a few years ago, creating a deepfake required advanced technical skills; generative AI has made it remarkably easy and accessible, and the results look ever more realistic. Even reputable sources can be fooled by AI-generated content. For example, user-submitted AI-generated images purporting to depict the Israel-Gaza crisis have flooded stock photo agencies such as Adobe’s.

The coming year will be crucial for those fighting the spread of such content. Techniques for detecting and mitigating it are still in the early stages of development. Watermarks, like Google DeepMind’s SynthID, remain largely voluntary and are not foolproof. And social media platforms are notoriously slow at taking down misinformation. So get ready for a major real-time experiment in combating AI-generated fake news.

Inspired by some of the core techniques that sparked the generative AI boom, roboticists are working on general-purpose robots that can handle a broader range of tasks.


In recent years, AI has evolved from smaller models trained for individual tasks – such as image recognition, drawing, and captioning – to monolithic models trained for all of these tasks and more. By showing OpenAI’s GPT-3 a few additional examples (a process known as fine-tuning), researchers can train it to solve programming tasks, write movie scripts, and pass high school biology exams. Multimodal models such as GPT-4 and Google DeepMind’s Gemini can solve both visual and linguistic tasks.
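Concretely, fine-tuning means supplying a small set of input/output pairs as extra training data. As a hedged sketch: the JSONL chat format below matches what OpenAI's fine-tuning service documents for chat models, but the example content and file handling here are invented for illustration.

```python
import json

# Sketch: serialize a handful of task-specific examples into the JSONL
# chat format used for fine-tuning. Example content is invented.
examples = [
    {"question": "What does DNA stand for?",
     "answer": "Deoxyribonucleic acid."},
    {"question": "Name the powerhouse of the cell.",
     "answer": "The mitochondrion."},
]

lines = []
for ex in examples:
    record = {"messages": [
        {"role": "system", "content": "You are a helpful biology tutor."},
        {"role": "user", "content": ex["question"]},
        {"role": "assistant", "content": ex["answer"]},
    ]}
    lines.append(json.dumps(record))

training_file = "\n".join(lines)  # would be written out, e.g. to train.jsonl
```

A file like this, with even a few dozen such records, is what "a few additional examples" amounts to in practice; the resulting model then imitates the demonstrated behavior.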

The same approach can also be used for robots, so there would be no need to train one robot to flip pancakes and another to open doors: a one-size-fits-all model could give robots the ability to multitask. Several examples of work in this area were presented last year.

In June, DeepMind released RoboCat (an update to its 2022 model Gato), which generates its own training data through trial and error and learns to control many different robot arms rather than a single specific arm. In October, the company, in collaboration with 33 university laboratories, released another general-purpose robot model, RT-X, along with a large new general-purpose training data set. Other top research teams, such as RAIL (Robotic Artificial Intelligence and Learning) at the University of California, Berkeley, are working on similar methods.

The problem, however, is a lack of data. Generative AI relies on an internet-sized data set of texts and images. In comparison, robots have few good sources of data from which to learn how to perform many of the industrial or domestic tasks we desire.

Lerrel Pinto’s team at New York University is tackling this problem, developing techniques that let robots learn through trial and error while generating their own training data. In an even more low-tech project, Pinto recruited volunteers to collect video data from their surroundings using iPhones strapped to trash pickers. In recent years, large companies have also begun releasing large data sets for training robots, such as Meta’s Ego4D.

This approach has already shown promise in driverless cars. Startups like Wayve, Waabi and Ghost are pioneering a new wave of self-driving AI that uses a single large model to control the vehicle instead of multiple smaller models for specific driving tasks.

This lets small companies compete with giants like Cruise and Waymo. Wayve is already testing its driverless cars on the narrow, busy streets of London. A similar boost awaits robotics everywhere.

(jl)

