ChatGPT: Why AI detection tools are so easy to cheat
Shortly after ChatGPT was launched, fears arose that schoolchildren and students could use the chatbot to generate passable essays in a matter of seconds and then submit them as term papers. The fear is not unfounded, as the OpenAI tool, like some of its competitors, produces surprisingly good texts. No wonder, then, that several start-ups are striving to develop software whose purpose is to recognize AI-generated text.
The problem, however, is that it’s relatively easy to trick these tools and bypass detection. This is the finding of a new study, which has not yet been peer-reviewed. Debora Weber-Wulff, Professor of Media and Computer Science at the Berlin University of Applied Sciences (HTW), worked with a group of researchers from different universities to assess the ability of 14 tools, including Turnitin, GPT Zero and Compilatio, to recognize texts written by OpenAI’s ChatGPT.
Most of these programs look for characteristics of AI-generated text, such as certain forms of repetition, and then calculate the probability that the text was generated by an AI. However, the research team found that every tool tested had difficulty recognizing ChatGPT-generated text that had been slightly rearranged by humans and/or obfuscated by a paraphrasing tool. “These tools don’t work,” is Weber-Wulff’s verdict. “They just don’t do what they say they do. These aren’t detectors for AI.” The results of the investigation indicate that schoolchildren and students only have to slightly adapt work generated by an AI in order to get past such detectors.
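To make the idea of a repetition signal concrete, here is a deliberately simple sketch. It is not the method of any tool in the study; the function name `repeated_ngram_fraction` and the choice of three-word sequences are assumptions made purely for illustration. It measures what share of a text's three-word sequences occur more than once, and it shows why light editing can be enough to lower such a score.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Return the share of word n-grams that occur more than once.

    Hypothetical toy signal for illustration only; real detectors
    are far more elaborate and their internals are not public here.
    """
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Count every occurrence of an n-gram that appears at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)


# Made-up example sentences: the second is a lightly reworded
# version of the first, which breaks up the repeated word sequences.
original = "the cat sat on the mat the cat sat on the rug"
edited = "the cat sat on the mat then it moved to the rug"

print(repeated_ngram_fraction(original))
print(repeated_ngram_fraction(edited))
```

Swapping a few words removes the repeated sequences, so a signal of this kind drops sharply after minimal rewording.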
Various human and AI texts
How did the researcher and her colleagues proceed in the study? In order to have a selection of human-generated texts to study, they wrote short essays at the undergraduate level on a range of subjects including civil engineering, computer science, economics, history, linguistics and literature. They wrote the texts themselves to make sure they didn’t appear in the ChatGPT training data.
Then each researcher wrote an additional text in Bosnian, Czech, German, Latvian, Slovak, Spanish or Swedish. These texts were translated into English either by the AI translation tool DeepL or by the competitor Google Translate.
The team then used ChatGPT to generate two more texts each. They modified these slightly to obfuscate the AI origin. One set was manually edited by the researchers, who rearranged sentences and swapped words, while the other was rewritten using an AI paraphrasing tool called Quillbot. In the end, the group had 54 documents on which to test the recognition tools.
The scientists quickly found that while the tools were good at recognizing human-written text (with an average accuracy of 96 percent), they performed worse when it came to recognizing AI-generated text, especially if it had been edited. Although the tools identified ChatGPT text with an accuracy of 74 percent, that rate dropped to 42 percent when the text generated by ChatGPT had been modified even slightly.
Serious consequences for academic careers
The study also shows how outdated universities’ current methods of evaluating student work are, comments Vitomir Kovanović, a senior lecturer who develops machine learning and artificial intelligence models at the University of South Australia but was not involved in the research project. Daphne Ippolito, a senior scientist at Google specializing in natural language generation who was also not involved in the project, voices another concern.
“If automatic recognition systems are to be used in education, understanding their false positive rate is crucial, as falsely accusing a student can have serious consequences for their academic career,” she says. “The false negative rate is also important, because if too much AI-generated text passes as human-written, the recognition system isn’t useful.”
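The two rates Ippolito mentions can be computed from four simple counts. The sketch below assumes that "positive" means "flagged as AI-generated". The function name `error_rates` and the counts in the example are hypothetical and are not taken from the study.

```python
def error_rates(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (false positive rate, false negative rate).

    Assumed convention: "positive" means "flagged as AI-generated".
      tp: AI texts correctly flagged
      fp: human texts wrongly flagged (a false accusation)
      tn: human texts correctly passed
      fn: AI texts that slipped through as human-written
    """
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical counts for illustration, not results from the study:
# 50 human texts (5 wrongly flagged) and 50 AI texts (20 missed).
fpr, fnr = error_rates(tp=30, fp=5, tn=45, fn=20)
print(f"false positive rate: {fpr:.0%}")
print(f"false negative rate: {fnr:.0%}")
```

The two rates use different denominators: the false positive rate is measured only on human-written texts, the false negative rate only on AI-generated ones, so a tool can do well on one and poorly on the other.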
The company Compilatio, which develops one of the tools tested by the researchers, points out that its system only displays suspicious passages, which it classifies as potential plagiarism or content potentially generated by AI. “It is incumbent on the schools and teachers who mark the analyzed documents to validate the knowledge actually acquired by the author of the document. This can be done, for example, by using additional means of verification – oral examination, additional questions in a controlled classroom environment and more,” said a spokesman for Compilatio.
“In this way, the Compilatio tools are part of a true instructional approach that encourages learning of good research, writing and citation practices. The Compilatio software is a proofing tool, not a proofreader,” the company explained. Turnitin and GPT Zero did not immediately respond to a request for comment.
A question of pattern recognition
“Our recognition model is based on the striking differences between the idiosyncratic, unpredictable nature of human writing and the highly predictable statistical signatures of AI-generated text,” said Annie Chechitelli, chief product officer of Cologne-based developer Turnitin.
“However, our AI text recognition feature merely alerts the user to the existence of such areas and highlights those where further investigation may be needed. It does not determine whether the use of AI writing tools is appropriate or inappropriate, or whether their use within the limits of the examination regulations and the instructions given by the teacher constitutes fraud or misconduct.”
We’ve known for a while that tools designed to recognize text written by AI don’t always work as they’re supposed to. Earlier this year, OpenAI unveiled a tool designed to recognize text produced by ChatGPT, admitting that it only flagged 26 percent of AI-written text as “probably AI-written.”
OpenAI pointed out to MIT Technology Review a section on its website warning that AI-generated content detection tools are “far from foolproof”.
Continued boom in products
But such failures haven’t stopped companies from launching products that promise to do the job, says Tom Goldstein, an assistant professor at the University of Maryland who wasn’t involved in the research.
“A lot of them aren’t very accurate, but they’re not all a complete disaster either,” he adds, noting that Turnitin managed to achieve some detection accuracy with a fairly low false positive rate.
And while studies highlighting the shortcomings of so-called AI text recognition systems are very important, it would have been helpful to extend the study to AI tools beyond ChatGPT, says Sasha Luccioni, a researcher at AI start-up Hugging Face.
For Kovanović, the whole idea of trying to recognize text written by AI is wrong. “Don’t try to recognize AI – just make it so that the use of AI is not the problem,” he says.
(jl)