A reviewer, tasked with evaluating a technical submission for NeurIPS 2025—the most prestigious AI conference in the world—left a comment that will go down in infamy:

“Who is Adam?”

For the uninitiated, asking “Who is Adam?” in a review of a deep learning paper is akin to a mechanic asking “What is a wheel?” or a chef asking “What is salt?”

Adam (Adaptive Moment Estimation) is arguably the most popular optimization algorithm in modern deep learning. It has been the default optimizer for nearly a decade. It is not a person the authors forgot to cite. It is not an obscure character in the paper’s narrative. It is the math that makes the models learn.
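For readers who want the refresher the reviewer apparently needed: Adam (Kingma & Ba, 2015) keeps exponential moving averages of the gradient and of its elementwise square, corrects their initialization bias, and scales each parameter’s step by the ratio of the two. In standard notation, with gradient $g_t$, learning rate $\alpha$, and decay rates $\beta_1, \beta_2$:

$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2 \\
\hat{m}_t &= \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t} \\
\theta_t &= \theta_{t-1} - \alpha\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
\end{aligned}
$$

The defaults ($\alpha = 10^{-3}$, $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$) are so ubiquitous that most papers, quite reasonably, do not restate them.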

That a reviewer at NeurIPS—a venue that received over 21,000 submissions this year—could ask this question is funny. But the laughter stops quickly when you realize what it actually implies: The peer review system for high-stakes ML conferences is cracking under the weight of AI-generated slop and unqualified reviewers.

The “Adam” Archetype: Hallucination or Incompetence?

The “Who is Adam?” incident (and the resulting meme) highlights two distinct but equally damaging possibilities regarding the state of academic review in 2025:

The “Lazy LLM” Review

The leading theory is that the review wasn’t written by a human at all. The reviewer likely fed the paper into a Large Language Model (LLM) with a generic prompt like “Find weaknesses in this paper.”

LLMs, despite themselves being trained with Adam (or its variant, AdamW), can still hallucinate “Adam” as an entity, and specifically a person, when they process a paper’s surface structure rather than its meaning. If the text says “We optimize using Adam,” a model pattern-matching for “undefined terms” may flag “Adam” as an unintroduced proper noun.

We are seeing a flood of reviews that are “verbose but empty,” filled with bullet points that summarize the abstract rather than critique the method. “Who is Adam?” is just the most visible artifact of reviewers using AI to do their homework.

The Unqualified Reviewer

The second possibility is almost worse: The reviewer is a human, but one so junior or far removed from the field that they genuinely don’t know what the Adam optimizer is.

With submission numbers exploding (more than doubling since 2020), conferences are desperate for bodies. This leads to frantic recruitment of reviewers, including undergraduates and researchers from adjacent fields (like pure statistics or biology) who may lack the specific vocabulary of modern deep learning.

The Consequence: High-Stakes Noise

Beyond the Meme: The Human Toll

While we chuckle at the absurdity of asking who “Adam” is, the implications are anything but a joke. For a PhD student on the cusp of graduation, a rejection caused by such staggering incompetence can be catastrophic—delaying defenses, killing momentum, or forcing them to scrap years of valid work. The laughter on social media masks a grim reality: the gatekeepers of our most advanced science are failing, and the toll is being paid by young researchers whose futures depend on a fair hearing that they simply aren’t getting.

From Human Bias to Automated Indifference

Peer review at top computer science conferences was notoriously flawed long before AI entered the picture; the field has battled high variance, the “reviewer lottery,” and inconsistent standards for years. However, the situation seems to be deteriorating rapidly with LLMs acting as the judge. We have moved from a system flawed by human bias and fatigue to one threatened by automated indifference, where the distinct voice of a critical expert is replaced by the smoothed-out, confident hallucinations of a model that knows the syntax of a review but none of the substance.

The Universal Collapse of Scaled Judgment

Furthermore, this crisis of evaluation is not unique to academic peer review; it is a symptom of a broader collapse in how we scale judgment across complex systems. We see the exact same dysfunction in the promotion cycles of Big Tech companies, where engineers are often evaluated by committees far removed from their actual work, relying on gameable “impact metrics” rather than technical reality. It echoes in hiring pipelines where AI-generated resumes are filtered by AI-generated screeners, creating a feedback loop of noise. Whether it’s a “Promo Packet” or a “Conference Submission,” when the volume of content outpaces the availability of expert attention, the system defaults to heuristics, randomness, and increasingly, the hollow mimicry of intelligence provided by AI itself.

Can We Fix the Machine?

To truly fix judgment systems that are collapsing under scale, metric-gaming, and AI interference, we need to move beyond band-aids like “banning ChatGPT” and look at structural changes that address the root causes of the noise:

“Skin in the Game” Mechanisms (Proof of Work)

We must move away from “passive” reviewing to “active” verification. At the conference level, this could mean requiring reviewers to correctly answer three specific comprehension questions set by the authors (e.g., “What is the specific value of hyperparameter $\alpha$?”) before their score is registered. This directly counters “automated indifference” because a lazy reviewer or an LLM cannot easily perform specific, context-heavy verification tasks without hallucinating.
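As a rough illustration, here is a minimal sketch of what such a gate could look like in a review platform’s backend. The names (ComprehensionCheck, register_score) and the exact-match grading are hypothetical assumptions, not any real conference system’s API; a production version would need fuzzier answer matching.

```python
# Hypothetical "proof of work" gate: a review score is only recorded if the
# reviewer answers the authors' comprehension checks correctly.
from dataclasses import dataclass


@dataclass
class ComprehensionCheck:
    question: str         # set by the authors, e.g. "What value of hyperparameter alpha is used?"
    expected_answer: str  # ground truth held by the platform, never shown to the reviewer


def register_score(score: int, checks: list[ComprehensionCheck], answers: list[str]) -> bool:
    """Accept the review score only if every author-set check is answered correctly."""
    if len(answers) != len(checks):
        return False
    for check, answer in zip(checks, answers):
        if answer.strip().lower() != check.expected_answer.strip().lower():
            # A reviewer who skimmed the paper (or pasted it into an LLM) tends to fail here.
            return False
    # ... persist `score` to the review database ...
    return True


# The gate rejects a score when any verification answer is wrong.
checks = [ComprehensionCheck("What value of hyperparameter alpha is used?", "0.01")]
assert register_score(6, checks, ["0.01"]) is True
assert register_score(6, checks, ["unclear"]) is False
```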

De-Scaling: The Federation Model

Centralized systems handling 21,000 submissions inevitably default to noise. We should break the monolithic “NeurIPS” into smaller, high-trust, semi-autonomous tracks or sub-conferences that are capped in size. Acceptance would be determined by local peers who actually know the work, allowing the “distinct voices” of experts to be heard over the statistical noise of a massive general pool.

Cap-and-Trade for Submissions

We need to artificially constrain the supply of content to match the available attention of experts. By limiting senior researchers to a fixed number of submissions per year (e.g., 3 slots as senior author), we force a “quality over quantity” filter at the source. If a researcher has limited slots, they won’t waste one on a half-baked paper that invites an AI-generated review, disrupting the feedback loop of noise.

Until then, if you submitted to NeurIPS 2025 and got a rejection, take a close look at the reviews. If they ask you who Adam is, just remember: It’s not you, it’s them.