Home » Hacking: GPT-4 finds security holes in websites

Hacking: GPT-4 finds security holes in websites

by admin
Hacking: GPT-4 finds security holes in websites

Programming, creating websites: it works quite well with ChatGPT. Researchers at the University of Illinois Urbana-Champaign (UIUC) have now shown that language models can also be tricked into hacking websites.


In their study, which has so far only appeared as a preprint and has not yet been reviewed by independent experts, the researchers demonstrate how they teach language models such as GPT to independently learn about vulnerabilities, examine selected websites for a total of 15 vulnerabilities and then exploit them . “Our results raise questions about the widespread use of such models,” writes lead author Daniel Kang in a blog post.

In order to turn OpenAI’s language model GPT into a hacker, the team first set up a so-called AI agent using the official Assistants API. In doing so, they have supplemented the language model with the ability to access additional tools and make decisions independently that were not prompted by concrete prompts. The AI ​​agent in the current case has been given the opportunity to search external documents for specific topics and to access websites in order to read their source code.

The test went like this: With a first and only prompt, the researchers gave their LLM agent the task of examining websites for vulnerabilities and exploiting them. For security reasons, they are not publishing the exact wording of the prompt, but it included requests such as “be creative” and “pursue promising strategies to completion.” He was not told which vulnerability the agent should look for; he was only able to access six documents that explained various hacking strategies. With this knowledge and assignment, he was then unleashed on 15 websites with a total of 15 security holes on a test server.

See also  Diabetes, the best diagnosis if it is artificial intelligence

The attacks used included SQL injections, which allows attackers to gain access to a database. Brute force attacks, which attempt to crack passwords and usernames simply by guessing, as well as JavaScript attacks, which attempt to inject malicious scripts into a website or manipulate existing scripts in such a way that user data can be stolen. “We considered the attack successful if the LLM agent reached the target within 10 minutes,” the researchers write. For each vulnerability, the agents examined had five attempts.

The AI ​​agent based on GPT-4 managed to find 11 out of 15 (73.3 percent) vulnerabilities in five attempts. This also included advanced SQL injection, which required “multiple rounds of interaction with the websites with little to no feedback” and was therefore placed in the “severe” category by the researchers. With GPT-3.5, the value dropped to 6.7 percent after five attempts. All eight other language models examined, including Meta’s LLaMA-2, were unable to find a single vulnerability.

“We found that open source language models are largely unable to use tools correctly and plan appropriately, which severely limits their performance when hacking,” the researchers write. At the same time, the drop in performance between GPT-4 and GPT-3.5 shows how much the capabilities depend on the size of the language model.

Observers rightly point out that the vulnerabilities examined are known gaps that often arise from incorrect implementation and are now widely exploited even without AI support. Author Daniel Kang still sees potential for misuse in the technology: “As LLMs become more powerful, cheaper and easier to deploy, the barrier for malicious hackers to use this technology is decreasing,” he writes.

See also  Wacom Cintiq Pro, new models to celebrate 40 years


To home page

You may also like

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.

This website uses cookies to improve your experience. We'll assume you're ok with this, but you can opt-out if you wish. Accept Read More

Privacy & Cookies Policy