Below is the text of my letter criticizing the thesis proposed by J. Gorraiz, the author of the paper, who is affiliated with the University of Vienna.
J. Gorraiz’s recently published paper presents a conceptually stimulating and metaphor-laden examination of the ideological foundations of bibliometrics, tracing their origins to religious, moral, and philosophical traditions. While such a reflective approach is thought-provoking, the paper suffers from several substantive limitations—particularly when read in light of ongoing debates surrounding the practical utility, cost-efficiency, and predictive power of bibliometric tools in research evaluation.
The paper’s central thesis—that bibliometrics derive from religious and philosophical traditions—is built on an extended metaphorical scaffolding. Citations are likened to divine judgment, H-indexes to spiritual tallies, and “sleeping beauties” to secular miracles. While these metaphors may have rhetorical appeal, they ultimately distract from more pressing empirical and methodological issues. There is no engagement with recent literature on field-normalized citation metrics, responsible metric frameworks (such as the Leiden Manifesto or DORA), or citation dynamics in different disciplines. Nor does the paper propose concrete methodological or policy alternatives. The result is a text rich in allegory but impoverished in evidence, leaving readers without clear guidance for improving bibliometric practice.
Gorraiz asserts that a low citation count does not imply irrelevance; it may reflect novelty. While this claim holds some truth, the argument is selectively framed and omits crucial empirical counterevidence. Notably, he fails to mention the robust findings from Clarivate Analytics, whose “Citation Laureates” methodology—based on identifying papers with exceptionally high citation counts (over 1,000)—has successfully predicted more than 70 Nobel Prize winners. These results strongly suggest that novelty and high citation impact are not mutually exclusive and may, in fact, often coincide. By disregarding this evidence, the paper constructs a false dichotomy between citation count and originality, while ignoring one of the most compelling demonstrations of the predictive capacity of bibliometrics.
A more significant omission is the lack of engagement with the economic rationale for bibliometric tools. The author interprets citation indicators as symbolic or ritualistic, but largely ignores their function as scalable, cost-effective proxies in research evaluation systems strained by the limits of peer review. Peer review is resource-intensive: national assessments such as the UK's REF have cost upwards of £250 million per cycle. Hiring committees, tenure reviews, and grant panels demand vast investments of time and expert labor. In contrast, bibliometrics—despite their imperfections—offer reproducible, transparent screening mechanisms that can reduce the burden on evaluators. Any critique that sidesteps these economic realities and offers no viable alternative risks being philosophically interesting but operationally irrelevant.
Which approach is more detrimental to the progress of science: implementing a hybrid model of abbreviated peer review augmented by quantitative metrics—thereby conserving substantial financial resources—or relying exclusively on comprehensive, resource-intensive peer review protocols that divert those funds away from direct research support? Moreover, how might the latter paradigm exacerbate inequities in research assessment for low-income countries, which lack the financial capacity to underwrite such costly evaluation processes?
Finally, allow me to provide you with some insights from my homeland, Portugal, which has experimented with both approaches. In a previous Portuguese research assessment, conducted in 2013, the international experts serving on the evaluation panels enjoyed complete autonomy. They were free to evaluate research units through on-site visits and also had access to a comprehensive bibliometric analysis based on Scopus data, expertly conducted by Elsevier, which generated a range of valuable metrics (Publications per FTE, Citations per FTE, h-index, Field-Weighted Citation Impact, Top cited publications, National and International Collaborations).
However, in recent years we experienced a shift in perspective, with a Science Minister who sympathized with the critics of bibliometrics. During the most recent research assessment, in 2018, which involved the evaluation of 348 research units comprising nearly 20,000 researchers, the Evaluation Guide clearly dictated that absolutely no metric could be used by the panels (note that all panels were composed of international experts: 51 from the UK, 21 from the USA, 17 from Germany, 17 from France, 11 from the Netherlands, 8 from Finland, 8 from Ireland, 7 from Switzerland, 6 from Sweden, 5 from Norway, and others from additional countries).
Nonetheless, once the research assessment had concluded, I conducted an extensive search through all the reports across the various scientific areas. What I discovered was that the reviewers assigned significant importance to the quantity of publications and the perceived “quality” of journals, even though such considerations were expressly prohibited by the Evaluation Guide. I found that “publications”, “quartiles” and even “impact factors” were mentioned in the assessment reports more than 500 times. In other words, in the absence of any sanctioned metric, the international experts (somewhat ironically) decided to rely on the worst of them all.
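For transparency, a count of this kind is straightforward to reproduce. The short Python sketch below illustrates the idea, assuming the panel reports have been saved as plain-text files; the directory name and keyword list shown here are illustrative, not the exact ones used for the figures cited above.

    import pathlib
    import re

    # Illustrative only: the directory and keywords are assumptions for this sketch,
    # not the actual corpus behind the counts reported in the letter.
    REPORTS_DIR = pathlib.Path("evaluation_reports")  # plain-text panel reports
    KEYWORDS = ["publications", "quartile", "impact factor"]

    counts = {kw: 0 for kw in KEYWORDS}
    for report in REPORTS_DIR.glob("*.txt"):
        text = report.read_text(encoding="utf-8", errors="ignore").lower()
        for kw in KEYWORDS:
            # Count every occurrence of the keyword in this report
            counts[kw] += len(re.findall(re.escape(kw), text))

    for kw, n in counts.items():
        print(f"{kw}: {n} mentions")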