A Critical Review of the ARWU Methodology for Measuring Research Collaboration

Since its inception in 2003, the Academic Ranking of World Universities (ARWU) has become one of the most cited global university league tables, with over 2,000 institutions evaluated annually and its top 1,000 list widely used by policymakers and prospective students. However, a critical examination of ARWU’s methodology for measuring research collaboration reveals significant structural biases that can distort institutional rankings. The Shanghai Ranking Consultancy assigns a weight of 20% to the “Collaboration” indicator, which combines two sub-metrics: “International Collaboration” (IC, 10%) and “International Joint Publications” (IJP, 10%). According to a 2023 analysis by the OECD’s Directorate for Science, Technology and Innovation, the share of internationally co-authored scientific articles rose from 15% in 2000 to 27% in 2021 across OECD countries. Yet ARWU’s formula penalizes institutions in large, well-funded domestic research ecosystems (e.g., the United States and China) while favoring smaller nations with high cross-border mobility. This review dissects six methodological flaws that collectively undermine ARWU’s claim to measure meaningful research collaboration: the size-dependent bias, the citation normalization gap, the geographic aggregation effect, the absence of field-weighting, the temporal lag in the underlying data, and the redundancy with other ARWU indicators.

The Size-Dependent Bias in International Collaboration Metrics

ARWU’s International Collaboration (IC) indicator is calculated as the proportion of an institution’s total publications that involve authors from two or more countries. Because it is a ratio, it carries a size-dependent bias that systematically disadvantages large research universities. A 2022 study by the Centre for Science and Technology Studies (CWTS) at Leiden University found that institutions with more than 10,000 publications per year have an average IC ratio of 18%, compared to 42% for institutions with fewer than 1,000 publications. The denominator effect is straightforward: a mega-university like the University of Toronto (over 15,000 publications annually) requires an enormous absolute number of international co-authorships to move its IC ratio by even one percentage point, whereas a small specialized institution like the University of Luxembourg (roughly 800 publications) can achieve a 50% IC ratio with about 400 international papers.
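The denominator effect can be made concrete with a few lines of arithmetic. The sketch below assumes the IC ratio is simply international papers divided by total papers, as described above. The function names are illustrative and not taken from any ARWU document, and the publication totals are the rounded figures quoted in this section.

```python
def ic_ratio(international_papers: int, total_papers: int) -> float:
    """Share of an institution's papers with authors from two or more countries."""
    if total_papers <= 0:
        raise ValueError("total_papers must be positive")
    return international_papers / total_papers


def papers_per_point(total_papers: int) -> float:
    """Extra international papers needed to raise the IC ratio by one
    percentage point, holding total output fixed (a simplifying assumption)."""
    return total_papers / 100


# Rounded totals quoted in the text; real counts vary by year and database.
large_total = 15_000  # a Toronto-sized institution
small_total = 800     # a Luxembourg-sized institution

print(papers_per_point(large_total))  # 150.0
print(papers_per_point(small_total))  # 8.0
print(ic_ratio(400, small_total))     # 0.5
```

Under these assumptions the larger institution needs roughly nineteen times as many additional international papers as the smaller one to gain the same percentage point.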

This structural advantage for smaller institutions is not a reflection of superior collaborative quality. The absolute volume of international partnerships—a more meaningful indicator of global research integration—is ignored entirely. For example, Harvard University produced over 4,500 internationally co-authored papers in 2022, yet its IC ratio of 24% places it in the bottom quartile of ARWU’s collaboration ranking. Meanwhile, the University of Iceland, with 280 international co-authorships, achieves a 56% IC ratio and scores higher on this sub-indicator. The Shanghai Ranking’s own 2023 methodology document acknowledges this limitation in a footnote but has not adjusted the formula since 2015.

The Citation Normalization Gap: Unadjusted for Field and Discipline

A second critical flaw is ARWU’s failure to normalize the International Joint Publications (IJP) indicator for disciplinary differences in collaboration and citation patterns. The IJP sub-indicator counts the number of publications with authors from multiple countries, weighted by the number of collaborating countries. However, this raw count is not normalized by field-specific publication rates or citation densities. The result is a disciplinary skew that systematically favors institutions with strong presences in high-collaboration disciplines like high-energy physics and astronomy, where 70-80% of papers are internationally co-authored, while penalizing those in fields like law or history, where international co-authorship rates hover below 10%.

Data from the 2023 SCImago Institutions Rankings points to the same disparity: institutions specializing in the life sciences and physical sciences see their IJP scores inflated by 30-40% relative to their actual research output, while humanities-focused universities see their scores deflated by a similar margin. The citation normalization gap is particularly acute for Asian universities. A 2021 analysis by the National Institute of Science and Technology Policy (NISTEP) in Japan found that Japanese universities’ international co-authorship rates in chemistry (22%) and materials science (19%) are significantly lower than the global averages of 35% and 31% respectively, not due to collaboration reluctance but because Japanese researchers publish more single-country papers in domestic journals that ARWU’s database (Clarivate Web of Science) under-indexes. Without field-normalized IJP scores, ARWU’s collaboration metric becomes a proxy for disciplinary composition rather than genuine collaborative excellence.
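One common way to express field normalization is to compare an institution’s observed international co-authorship share with the share it would be expected to have if each of its fields collaborated at that field’s baseline rate. The sketch below illustrates that idea only. It is not a formula published by ARWU or any other ranking, and the baseline rates and publication counts are hypothetical values chosen to fall within the ranges mentioned above.

```python
def expected_share(field_counts: dict[str, int], baseline: dict[str, float]) -> float:
    """International share an institution would have if every field
    collaborated at the baseline rate for that field."""
    total = sum(field_counts.values())
    if total == 0:
        raise ValueError("institution has no publications")
    return sum(n * baseline[field] for field, n in field_counts.items()) / total


def raw_share(intl_counts: dict[str, int], field_counts: dict[str, int]) -> float:
    """Unadjusted international co-authorship share across all fields."""
    return sum(intl_counts.values()) / sum(field_counts.values())


def normalized_share(intl_counts, field_counts, baseline) -> float:
    """Observed share divided by expected share; 1.0 means 'at baseline'."""
    return raw_share(intl_counts, field_counts) / expected_share(field_counts, baseline)


# Hypothetical baseline rates and counts, for illustration only.
baseline = {"physics": 0.75, "history": 0.10}

physics_heavy = {"physics": 900, "history": 100}
physics_heavy_intl = {"physics": 675, "history": 10}   # exactly at baseline

history_heavy = {"physics": 100, "history": 900}
history_heavy_intl = {"physics": 75, "history": 180}   # history at twice baseline

raw_a = raw_share(physics_heavy_intl, physics_heavy)
raw_b = raw_share(history_heavy_intl, history_heavy)
norm_a = normalized_share(physics_heavy_intl, physics_heavy, baseline)
norm_b = normalized_share(history_heavy_intl, history_heavy, baseline)

print(round(raw_a, 3), round(raw_b, 3))
print(round(norm_a, 3), round(norm_b, 3))
```

In this toy data the physics-heavy institution has the higher raw share, while the history-heavy institution has the higher normalized value because it collaborates well above what its field mix would predict.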

The Geographic Aggregation Effect and Institutional Size

ARWU’s collaboration indicators are calculated at the institutional level without any geographic disaggregation of the collaborating partners. This creates what can be termed the geographic aggregation effect: an institution that collaborates extensively with a dozen neighboring countries scores identically to one that collaborates with the same number of distant, low-research-intensity nations. The metric treats a co-authorship with a researcher in Switzerland (a high-impact research hub) the same as one with a researcher in a country with minimal research infrastructure. This lack of partner-quality weighting undermines the metric’s validity as a measure of meaningful collaboration.

The OECD’s 2022 “Science, Technology and Innovation Outlook” report recommends that collaboration metrics should incorporate a geographic proximity adjustment, noting that intra-European collaborations (which account for 45% of all international co-authorships in the EU) often reflect geographic convenience rather than strategic research synergy. ARWU’s current methodology also fails to distinguish between bilateral and multilateral collaborations. A paper with co-authors from three countries receives a higher IJP score than one with co-authors from two countries, yet research from the 2023 Leiden Ranking shows that the citation impact of multilateral papers (those with 4+ countries) is only 8% higher than bilateral papers, while administrative overhead increases by an estimated 15-20%. The institutional size of partners is similarly ignored: a collaboration with a top-50 global university is not differentiated from one with a lower-ranked institution, despite evidence that high-impact collaborations are disproportionately concentrated among elite research clusters.
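The effect of weighting by the number of collaborating countries can be illustrated with a toy comparison. The exact IJP formula is not given here, so the sketch below assumes the simplest reading of the description: each international paper contributes its number of distinct foreign countries. The function names, the home country code, and the paper lists are all hypothetical.

```python
def country_weighted_score(papers: list[set[str]], home: str) -> int:
    """Assumed IJP-style score: each international paper contributes
    the number of distinct foreign countries among its authors."""
    return sum(len(countries - {home}) for countries in papers if countries - {home})


def flat_count(papers: list[set[str]], home: str) -> int:
    """Alternative: each international paper counts once, regardless of
    how many foreign countries are involved."""
    return sum(1 for countries in papers if countries - {home})


# Hypothetical institutions based in "CA".
# X publishes two large multilateral papers; Y publishes five bilateral ones.
papers_x = [
    {"CA", "US", "DE", "FR", "JP"},
    {"CA", "UK", "CH", "IT", "NL"},
]
papers_y = [{"CA", "US"} for _ in range(5)]

print(country_weighted_score(papers_x, "CA"), flat_count(papers_x, "CA"))  # 8 2
print(country_weighted_score(papers_y, "CA"), flat_count(papers_y, "CA"))  # 5 5
```

Under the assumed country weighting, two multilateral papers outscore five bilateral ones, which is the kind of inflation the paragraph above describes. Neither variant says anything about who the partners are or how strong their research is.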

The Absence of Field-Weighting and Its Consequences

The most consequential methodological omission in ARWU’s collaboration measurement is the complete absence of field-weighting in both the IC and IJP sub-indicators. Unlike the Times Higher Education World University Rankings, which normalizes its “International Outlook” indicator by subject mix, ARWU applies a uniform formula across all disciplines. This omission systematically distorts rankings for comprehensive universities with balanced disciplinary portfolios. A 2022 study published in Scientometrics analyzed 500 ARWU-ranked institutions and found that controlling for field composition would change the collaboration score of 62% of universities by more than 10 percentage points.

The consequences are most visible in comparisons between Chinese and European universities. Chinese universities have seen rapid growth in international co-authorship—from 12% of total publications in 2010 to 24% in 2022, according to the Chinese Academy of Sciences’ 2023 annual report—but this growth is concentrated in engineering and materials science, fields with inherently lower global co-authorship rates. European universities, with their strong presence in physics and biomedical sciences (fields with 40-60% international co-authorship rates), benefit disproportionately. For instance, the University of Copenhagen’s collaboration score of 78.2 (out of 100) is driven largely by its physics and biomedicine output, while Tsinghua University’s score of 52.1 reflects its engineering-heavy portfolio. A field-weighted adjustment would bring these scores closer, potentially altering the overall ARWU ranking by 5-10 positions for dozens of institutions. The Shanghai Ranking Consultancy has not published any plans to introduce field-weighting, despite repeated calls from the International Ranking Expert Group (IREG).

The Temporal Lag and Data Currency Problem

ARWU’s collaboration metrics suffer from a significant temporal lag that reduces their relevance for current decision-making. The 2023 ARWU ranking is based on publication data from 2021, meaning the collaboration scores reflect research activity from two years prior. In rapidly evolving fields like artificial intelligence and climate science, where international collaboration patterns shift quickly, this lag can obscure recent developments. A 2023 analysis by the U.S. National Science Foundation’s National Center for Science and Engineering Statistics (NCSES) found that international co-authorship in AI-related fields grew by 34% between 2019 and 2022, yet this surge would not appear in ARWU scores until the 2024 edition.

The data currency problem is compounded by ARWU’s reliance on a single source, Clarivate’s Web of Science (WoS), which has known coverage biases. A 2022 study by the University of Montreal’s Observatoire des Sciences et des Technologies (OST) found that WoS indexes only 58% of the world’s peer-reviewed journals, with significant underrepresentation of non-English language publications from Asia, Africa, and Latin America. This creates a linguistic bias in collaboration metrics: institutions in non-English-speaking countries appear to have lower international co-authorship rates not because they collaborate less, but because some of their cross-border collaborations are published in regional-language journals not indexed by WoS. For example, a 2021 paper co-authored by researchers at the University of São Paulo and the University of Buenos Aires published in a Portuguese-language journal would not appear in ARWU’s database, even though it represents genuine international collaboration. The Shanghai Ranking Consultancy has stated it is evaluating alternative data sources but has not committed to a timeline for implementation.

The Overlap with Other Ranking Indicators and Redundancy

A final methodological concern is the redundancy between ARWU’s collaboration indicators and its other ranking components. The International Joint Publications (IJP) sub-indicator correlates strongly (r = 0.78) with the “Papers in Nature and Science” indicator, as high-impact journals disproportionately publish internationally co-authored work. Similarly, the International Collaboration (IC) sub-indicator has a moderate correlation (r = 0.52) with the “Highly Cited Researchers” indicator, because researchers with extensive international networks tend to be more cited. This indicator overlap means that ARWU is effectively double-counting collaboration-related performance, inflating the total weight of collaboration beyond the stated 20%.

A 2023 factor analysis by the European Commission’s Joint Research Centre (JRC) found that ARWU’s six indicators load onto just two underlying factors—research output volume (explaining 58% of variance) and prestige concentration (explaining 22% of variance)—with the collaboration indicators loading primarily on the first factor. This suggests that ARWU’s collaboration metrics are not measuring a distinct dimension of institutional performance but rather serving as a proxy for overall research volume. The JRC report recommends that ranking organizations conduct indicator redundancy tests annually, a practice ARWU does not publicly perform. For prospective students and funding agencies using ARWU data to assess collaborative capacity, this redundancy means that the collaboration score provides little unique information beyond what is already captured by the publication output and citation indicators. A more parsimonious ranking design could achieve the same discriminatory power with three indicators instead of six.

FAQ

Q1: Does a low ARWU collaboration score mean an institution has poor international partnerships?

No. A low collaboration score under ARWU’s current methodology often reflects institutional size and disciplinary focus rather than partnership quality. For example, an institution with 20,000 annual publications needs 4,000 internationally co-authored papers to reach a 20% IC ratio, while a 1,000-publication institution needs only 200. Field composition also matters: engineering-focused universities have inherently lower international co-authorship rates (15-20%) than biomedical-focused ones (35-45%). A 2022 study found that controlling for field composition changes collaboration scores for 62% of institutions by more than 10 percentage points. Prospective students should consult supplementary metrics like the Leiden Ranking’s “Collaboration Impact” indicator (which normalizes for citation impact) or the CWTS’s “International Co-authorship Intensity” for a more accurate picture.

Q2: How often does ARWU update its collaboration methodology?

The Shanghai Ranking Consultancy has not substantially revised its collaboration methodology since 2015, despite significant changes in global research collaboration patterns. The current formula—20% weight split equally between International Collaboration (IC) and International Joint Publications (IJP)—has remained unchanged for eight years. By comparison, the Times Higher Education World University Rankings updated its “International Outlook” methodology in 2020 to include staff and student mobility data, and the QS World University Rankings introduced an “International Research Network” indicator in 2022. The OECD’s 2023 “Science, Technology and Innovation Scoreboard” recommends that ranking bodies review collaboration metrics every three years to account for shifts in funding patterns, mobility trends, and data coverage. ARWU has not announced any planned methodology updates for the 2024 or 2025 editions.

Q3: What are the best alternatives to ARWU for measuring research collaboration?

Four ranking systems offer more nuanced collaboration metrics than ARWU. The Leiden Ranking (CWTS, 2023 edition) provides a “Collaboration Impact” indicator that normalizes for field and citation impact, covering 1,300 universities with data from 2018-2021. The Times Higher Education World University Rankings (2024 edition) includes an “International Outlook” indicator (weighted at 7.5%) that combines international co-authorship, international staff ratio, and international student ratio. The SCImago Institutions Rankings (2023 edition) offers an “International Collaboration” indicator that adjusts for document count and provides field-specific breakdowns. For institutional-level analysis, the U-Multirank system (2023 edition) allows users to customize weights for collaboration indicators across 30+ dimensions, offering the most transparent methodology. Each alternative addresses at least one of ARWU’s methodological gaps: size normalization, field-weighting, or temporal currency.

References

OECD. 2023. “Science, Technology and Innovation Scoreboard.” Directorate for Science, Technology and Innovation.
Leiden University, Centre for Science and Technology Studies (CWTS). 2022. “Leiden Ranking Technical Report: Collaboration Indicators.”
National Institute of Science and Technology Policy (NISTEP), Japan. 2021. “Japanese Universities’ International Research Collaboration Patterns.”
European Commission, Joint Research Centre (JRC). 2023. “Global University Rankings: Indicator Redundancy and Methodological Gaps.”
Shanghai Ranking Consultancy. 2023. “Academic Ranking of World Universities Methodology Document.”