
arXiv is tightening its policies around the use of large language models in scientific papers, introducing stricter penalties for researchers whose submissions contain clear evidence of unchecked AI-generated material.
The move comes as the widely used preprint repository faces increasing concerns about low-quality AI-generated research submissions across fields including computer science and mathematics.
Although papers posted to arXiv are not peer-reviewed before publication, the platform has become one of the primary distribution systems for early-stage scientific research and a major source of data for tracking academic trends.
According to Thomas Dietterich, chair of arXiv’s computer science section, papers containing “incontrovertible evidence” that authors failed to review AI-generated material will now trigger penalties.
“If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can’t trust anything in the paper,” Dietterich wrote Thursday.
ArXiv Targets Hallucinated References And AI Artifacts
Dietterich said evidence could include fabricated or hallucinated references, as well as visible comments exchanged between authors and AI systems that were mistakenly left inside submitted papers.
Under the updated policy, authors whose papers contain such evidence could face a one-year suspension from arXiv.
After the suspension period, those authors would be permitted to submit a new paper only after it has been accepted by a recognized peer-reviewed publication venue.
Dietterich told 404 Media that the rule will operate as a “one-strike” policy.
However, moderators must first identify the issue and section chairs must independently confirm the evidence before penalties are applied.
Authors will also retain the right to appeal decisions.
Policy Does Not Ban Use Of Large Language Models
The updated rules do not prohibit researchers from using AI tools during writing or research preparation.
Instead, arXiv said authors remain fully responsible for the contents of submitted papers regardless of whether AI systems contributed to generating text or references.
Dietterich said researchers cannot avoid responsibility for “inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content” produced by AI systems if that material is included in their submissions.
The policy reflects growing concern across academic publishing over the increasing use of AI-generated text in scientific research.
Academic Publishing Faces Rising Concerns About AI Fabrication
ArXiv has already introduced several measures intended to reduce low-quality AI-generated submissions.
Among them is a rule requiring first-time contributors to obtain endorsements from established authors before posting papers.
The organization is also transitioning into an independent nonprofit after more than 20 years under the management of Cornell University, a change expected to provide greater fundraising flexibility for moderation and operational improvements.
Recent peer-reviewed studies have found rising numbers of fabricated citations in biomedical research papers, with researchers linking some of the trend to misuse of large language models.
The issue extends beyond academia, as AI-generated fabricated citations have also appeared in legal filings, journalism, and other professional documents.
