In February, a ridiculous, AI-generated rat penis managed to get published in a since-retracted Frontiers in Cell and Developmental Biology article. Now that unusual blunder appears to be just one example of a larger issue emerging in scientific literature. Journals are currently at a crossroads over how best to deal with researchers using popular but factually questionable generative AI tools to assist with writing papers or creating images. Detecting evidence of AI use isn’t always easy, but a new 404 Media report this week reveals what appear to be dozens of partially AI-generated published articles hiding in plain sight. The dead giveaway? Frequently used, computer-generated jargon.
404 Media searched for the AI-generated phrase “As of my last knowledge update” in Google Scholar’s public database and reportedly found 115 different articles that seemed to have relied on copied AI model outputs. That specific string is one of many phrases frequently produced by large language models like OpenAI’s ChatGPT. In this case, the “knowledge update” refers to the period when a model’s reference data was last refreshed. Other common generative-AI phrases include “As an AI language model” and “regenerate response.” Outside of academic literature, these AI artifacts have surfaced in Amazon product reviews and on social media platforms.
Several of the papers cited by 404 Media appear to have copied AI text directly into peer-reviewed articles attempting to explain complex research topics like quantum entanglement and the performance of lithium metal batteries. Other instances of journal articles seemingly containing the common generative AI phrase “I don’t have access to real-time data” were also shared on X, formerly Twitter, over the weekend. At least some of the examples reviewed by PopSci were related to research into AI models; in those cases, the AI statements were part of the subject matter.
It gets worse. Apparently if you search "as of my last knowledge update" or "i don't have access to real-time data" on Google Scholar, tons of AI generated papers pop up. This is truly the worst timeline. pic.twitter.com/YXZziarUSm
— Life After My Ph.D. (@LifeAfterMyPhD) March 18, 2024
Although several of these phrases appeared in reputable, well-known journals, 404 Media asserts that most of the examples it found originated from small, so-called “paper mills” that specialize in rapidly publishing papers, often for a fee and without scientific scrutiny or thorough peer review. Researchers have argued that the proliferation of these paper mills has led to an increase in spurious or plagiarized academic findings in recent years.
Unreliable AI-generated claims could result in more retractions
The recent instances of apparent AI-generated text appearing in published journal articles come amid an increase in retractions overall. A recent Nature analysis of research papers published last year counted more than 10,000 retractions, more than in any previous year. While most of those cases weren’t linked to AI-generated content, concerned researchers have long worried that increased use of these tools could lead to more false or misleading content making it through the peer review process. In the embarrassing rat penis case, the strange images and nonsensical AI-generated labels like “dissiliced” and “testtomcels” managed to slip by multiple reviewers either unnoticed or unreported.
There’s good reason to think that articles submitted with AI-generated text may become more common. In 2014, the publishers IEEE and Springer together removed more than 120 articles found to have included nonsensical AI-generated language. The prevalence of AI-generated text in journals has almost certainly increased in the decade since then, as more advanced, easier-to-use tools like OpenAI’s ChatGPT have gained wider adoption.
A 2023 survey of scientists conducted by Nature found that 1,600 respondents, or around 30% of those polled, admitted to using AI tools to help them write manuscripts. And while phrases like “As an AI algorithm” are dead giveaways exposing a sentence’s large language model (LLM) origin, many other, more subtle uses of the technology are harder to root out. Detection models used to identify AI-generated text have proven frustratingly inadequate.
Those who support allowing AI-generated text in some cases say it can help non-native speakers express themselves more clearly and potentially lower language barriers. Others argue the tools, if used responsibly, could speed up publication times and increase overall efficiency. But publishing inaccurate data or fabricated findings generated by these models risks damaging a journal’s reputation in the long term. A recent paper published in Current Osteoporosis Reports comparing review articles written by humans with those generated by ChatGPT found the AI-generated examples were often easier to read. At the same time, the AI-generated reports were also filled with inaccurate references.
“ChatGPT was pretty convincing with some of the phony statements it made, to be honest,” Indiana University School of Medicine professor and paper author Melissa Kacena said in a recent interview with Time. “It used the proper syntax and integrated them with proper statements in a paragraph, so sometimes there were no warning bells.”
Journals should agree on common standards around generative AI
Major publishers still aren't aligned on whether to allow AI-generated text in the first place. Since 2022, the Science family of journals has strictly prohibited AI-generated text or images that are not first approved by an editor. Nature, on the other hand, released a statement last year saying it wouldn’t allow AI-generated images or videos in its journals, but would permit AI-generated text in certain scenarios. JAMA currently allows AI-generated text but requires researchers to disclose when it appears and which specific models were used.
These policy divergences can create unnecessary confusion both for researchers submitting work and for reviewers tasked with vetting it. Researchers already have an incentive to use whatever tools are at their disposal to publish articles quickly and boost their overall number of published works. An agreed-upon standard for AI-generated content among major journals would set clear boundaries for researchers to follow. The larger, established journals could also further separate themselves from less scrupulous paper mills by drawing firm lines around certain uses of the technology, or by prohibiting it entirely in cases where it’s used to make factual claims.