Reading about the surge of AI-generated peer reviews at the upcoming ICLR 2026 conference is like witnessing a plot twist in the evolution of scientific publishing. On one hand, AI promises to streamline the laborious peer review process, which traditionally has been painfully slow and inconsistent. On the other, this new ‘AI reviewing AI’ conundrum opens a Pandora’s box of trust and quality concerns that we can’t simply brush off.
The finding that roughly 21% of peer reviews were fully AI-generated, and that over half showed signs of AI use, is startling but perhaps inevitable as LLMs become more accessible and capable. What's interesting here is the nature of the AI's "review": verbose, bullet-point-laden, even requesting non-standard statistical analyses that seasoned researchers might find baffling. It's as if the AI reviewer was taking a wild stab at being thorough but missed the subtleties that a human expert would catch.
This raises the question: can an AI truly grasp the nuance, intent, and innovation in a research paper, or are we just getting fancy parrots regurgitating learned patterns? The presence of hallucinated citations and vague feedback suggests the latter, highlighting that AI reviewers might be better seen as assistants than as arbiters.
What’s more, the Pangram Labs initiative to scan and flag AI-generated content in manuscripts and peer reviews is a pragmatic, necessary step toward safeguarding scientific integrity. Yet, deploying automated tools to police AI use might morph into an arms race where detection tools and AI writing tools leapfrog each other.
It's understandable that scientists are reacting with a mix of suspicion and frustration. When a review misses the point or provides incorrect numerical feedback, it not only risks the author's publication prospects but could also distort the entire discourse around a novel idea.
So where do we go from here? Rather than outright banning AI in peer review, a balanced approach could be to leverage AI to draft initial feedback or highlight critical points, but have human experts validate and refine these insights. This hybrid model acknowledges AI’s utility in managing volume while preserving human judgment where it counts.
At the end of the day, this episode is a critical nudge for the research community to rethink peer review norms in the AI era. It's time to get realistic: AI won't replace human insight anytime soon but can certainly shake up the ecosystem, provided we keep our eyes open and our evaluation criteria sharp. Let's embrace the tool, but not blindly, because science, after all, thrives on critical thinking, not just critical word count.

Source: Major AI conference flooded with peer reviews written fully by AI

