That this paper has achieved such prominence within a corpus of more than 28,000 studies is not merely surprising; it reveals much about the field’s early intellectual dynamics. Although informed by a review of 364 publications and a broad international authorship, it introduces no new empirical evidence, testable propositions, or methodological advances, offering instead a wide-ranging interpretive synthesis produced at a moment when ChatGPT’s capabilities, uses, and institutional consequences were still rapidly evolving.
Despite these limitations, the paper has been elevated to cross-disciplinary authority. Its citations span nearly all major fields, from social sciences and computer science to business, engineering, medicine, mathematics, and the humanities. A provisional narrative has thus circulated across communities with markedly different standards of evidence and, in many cases, has substituted for empirical grounding that was not yet available.
The paper enumerates an extensive set of risks and opportunities—spanning education, labour, cybersecurity, bias, and governance—without prioritising their importance, assessing relative likelihood or severity, or translating concerns into concrete policy frameworks. Readers are thus left with a catalogue of what might matter rather than guidance on what matters most. Moreover, many of the issues framed as disruptive simply repackage long-standing debates associated with earlier digital technologies—such as automation-driven job displacement, plagiarism, and misinformation—yet the paper fails to clearly separate what is genuinely new about large language models from familiar cycles of technological alarmism.
P.S. In late January 2026, OpenAI’s CEO publicly acknowledged that a recent iteration of ChatGPT had sacrificed writing quality in favour of technical performance. This admission highlights a broader tension: much of the literature that quickly became authoritative was produced while the technology itself remained unstable, with regressions recognised by its developers. The speed with which normative interpretations solidified stands in contrast to the absence of empirically grounded criteria for evaluating capabilities that were still in flux.