The Evolution of University Ranking Methodologies: From Single Indicators to Multidimensional Assessment
In 1983, U.S. News & World Report published its first “America’s Best Colleges” ranking, a list that assessed fewer than 1,400 institutions using just five indicators: reputation, selectivity, faculty resources, retention, and financial resources. That narrow approach, dominated by a 25% peer-assessment score, set the template for global rankings for two decades. By 2024, the four major ranking systems, namely QS World University Rankings, Times Higher Education (THE) World University Rankings, U.S. News Best Global Universities, and the Academic Ranking of World Universities (ARWU), collectively weighed over 40 distinct indicators, from citation impact to industry income and sustainable development metrics. The evolution from a handful of subjective scores to multi-dimensional, data-intensive frameworks reflects a fundamental shift in how higher education quality is measured. According to the OECD’s 2023 Education at a Glance report, over 5.5 million students now cross borders for tertiary study annually, making transparent, comparative rankings a critical tool for decision-making. The Chinese Ministry of Education recorded 1.14 million Chinese students abroad in 2022, the largest national cohort globally, underscoring the demand for reliable institutional benchmarks. This article traces the methodological history of university rankings, from the first crude league tables to today’s algorithmically weighted composites, and examines the trade-offs inherent in each approach.
The Birth of Rankings: Reputation as Currency (1983–2003)
The earliest rankings relied almost entirely on reputation surveys. U.S. News’s 1983 methodology asked university presidents, provosts, and admissions deans to rate peer institutions on a 5-point scale. This peer-assessment score constituted 25% of the total weight, while objective measures such as graduation rate (5%), faculty salary (10%), and selectivity (15%) played secondary roles. The system was criticized for perpetuating a “halo effect”—elite institutions scored high regardless of actual teaching quality or research output.
In 2003, Shanghai Jiao Tong University’s Institute of Higher Education launched the Academic Ranking of World Universities (ARWU), the first global ranking to prioritize research output over reputation. ARWU’s original methodology allocated 20% to alumni winning Nobel Prizes or Fields Medals, 20% to staff winning such awards, 20% to highly cited researchers (via Clarivate’s Highly Cited Researchers list), 20% to papers published in Nature and Science, and 20% to per-capita academic performance. This formula eliminated subjective surveys entirely, but it introduced a heavy bias toward English-language STEM research and older institutions with longer publication histories.
In 2004, Times Higher Education (then in partnership with QS) launched its own world ranking, combining peer review (40%), citation per faculty (20%), student-to-faculty ratio (20%), international faculty ratio (5%), and international student ratio (5%). By the mid-2000s, three competing methodologies coexisted, each emphasizing different facets: ARWU rewarded raw research power, U.S. News prioritized undergraduate selectivity and reputation, and THE/QS attempted a hybrid.
The QS–THE Divorce and Indicator Inflation (2004–2010)
The partnership between Times Higher Education and QS dissolved in 2009, leading to a methodological bifurcation that accelerated indicator proliferation. QS retained the original THE–QS framework but gradually expanded its indicator set from 5 to 8 by 2023. THE, under new ownership, partnered with Thomson Reuters (now Clarivate) to redesign its ranking entirely, introducing 13 performance indicators grouped into five areas: teaching (30%), research (30%), citations (30%), industry income (2.5%), and international outlook (7.5%).
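To make the five-area weighting concrete, the following minimal Python sketch combines area scores into one composite using the weights listed above. The institution scores and the function name `composite_score` are hypothetical, and the sketch assumes each area has already been scaled to 0–100; THE's actual standardization steps are not reproduced here.

```python
# Weights for the five THE areas as listed in the text (percent).
THE_WEIGHTS = {
    "teaching": 30.0,
    "research": 30.0,
    "citations": 30.0,
    "industry_income": 2.5,
    "international_outlook": 7.5,
}


def composite_score(area_scores, weights=THE_WEIGHTS):
    """Weighted average of area scores, each assumed to be on a 0-100 scale."""
    total_weight = sum(weights.values())
    if abs(total_weight - 100.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    missing = set(weights) - set(area_scores)
    if missing:
        raise KeyError(f"missing area scores: {sorted(missing)}")
    return sum(weights[k] * area_scores[k] for k in weights) / total_weight


# Hypothetical institution, for illustration only.
example = {
    "teaching": 70.0,
    "research": 80.0,
    "citations": 90.0,
    "industry_income": 40.0,
    "international_outlook": 60.0,
}
print(composite_score(example))
```

Because industry income carries only 2.5% of the weight, the low score of 40 in that area moves the composite very little, while a comparable gap in any of the three 30% areas would shift it substantially.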
This period also brought controversies over normalization and weighting. THE’s citation indicator, for example, originally used raw citation counts per paper, which heavily favored life sciences over social sciences and humanities. In response, THE later introduced subject-normalized citation impact scores. QS faced criticism for maintaining a 40% academic reputation weight, which critics argued inflated the positions of historically prestigious universities in the UK and US at the expense of emerging Asian institutions.
Data from the 2010 QS rankings showed that 74% of the top 200 universities were located in just eight countries (US, UK, Australia, Canada, Germany, France, Japan, and Switzerland), a concentration that reflected methodological bias toward English-language publication systems and long-established research cultures. The Chinese government, through Project 211 and Project 985, began explicitly tying funding to ARWU performance metrics, demonstrating how rankings themselves shape institutional behavior.
The Rise of Bibliometrics and Citation Normalization (2010–2018)
The 2010s marked the dominance of bibliometric indicators, driven by the increasing availability of publication and citation data from Scopus (Elsevier) and Web of Science (Clarivate). ARWU’s 2014 methodology added 10% for papers indexed in the Science Citation Index Expanded and Social Sciences Citation Index, while THE’s 2015 update introduced a “research volume” indicator measuring total publications per faculty.
Citation normalization became a critical technical challenge. Raw citation counts favor fields with high publication density (e.g., molecular biology) over low-density fields (e.g., mathematics). By 2016, all four major rankings employed some form of field-normalized citation impact (FWCI or category-normalized citation impact). THE’s 2017 methodology explicitly stated that “citations are normalized using a five-year window and field-weighted averages to ensure fairness across disciplines.”
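The idea behind field normalization can be sketched in a few lines of Python. The sketch divides each paper's citation count by the average for its field and then averages those ratios, so that 1.0 means "cited as often as the field average". The paper records and field baselines below are invented, and real FWCI or category-normalized calculations also match on publication year and document type, which is omitted here.

```python
from statistics import mean


def field_normalized_impact(papers, field_baselines):
    """Mean of (citations / expected citations for the paper's field).

    papers: list of (field, citations) tuples.
    field_baselines: mapping of field -> average citations per paper.
    """
    if not papers:
        raise ValueError("no papers supplied")
    ratios = [cites / field_baselines[field] for field, cites in papers]
    return mean(ratios)


# Hypothetical baselines: a high-citation-density and a low-density field.
baselines = {"molecular_biology": 20.0, "mathematics": 4.0}

bio_group = [("molecular_biology", 30), ("molecular_biology", 10)]
math_group = [("mathematics", 6), ("mathematics", 2)]

raw_bio = mean(c for _, c in bio_group)    # raw citations per paper
raw_math = mean(c for _, c in math_group)  # raw citations per paper
print(raw_bio, raw_math)
print(field_normalized_impact(bio_group, baselines),
      field_normalized_impact(math_group, baselines))
```

On raw counts the biology group looks five times stronger than the mathematics group, yet both perform exactly at their field average once each paper is compared with its own baseline.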
The U.S. News Best Global Universities ranking launched in 2014, adopting 10 indicators heavily weighted toward research reputation (25%) and publications (20%). By 2018, U.S. News had expanded to 13 indicators, including regional research reputation and number of highly cited papers. This period also saw the emergence of specialized rankings: THE World University Rankings by Subject (2011), QS Subject Rankings (2011), and ARWU Global Ranking of Academic Subjects (2017).
A 2017 study by the European University Association found that 82% of surveyed universities reported using rankings data for strategic planning, while 41% admitted to “gaming” indicators—for example, hiring highly cited researchers temporarily to boost citation scores. The OECD’s 2018 Benchmarking Higher Education System Performance report noted that “rankings have become de facto quality assurance mechanisms, despite their methodological limitations.”
Sustainability, Diversity, and the SDG Turn (2019–2023)
The late 2010s introduced a paradigm shift toward societal impact beyond traditional research metrics. THE launched the Impact Rankings in 2019, measuring universities’ contributions to the United Nations’ 17 Sustainable Development Goals (SDGs). The 2023 edition assessed 1,591 institutions across four broad areas: research, stewardship, outreach, and teaching. Indicators included the proportion of SDG-related publications, gender equality policies, and carbon footprint reduction targets.
QS responded in 2023 with the QS World University Rankings: Sustainability edition, evaluating universities on environmental impact (45%), social impact (45%), and governance (10%). The main QS ranking also underwent its largest methodological overhaul in 20 years: adding “sustainability” (5%) and “employment outcomes” (5%), while reducing academic reputation weight from 40% to 30%.
The U.S. News ranking faced a major crisis in 2022–2023 when Columbia University was stripped of its No. 2 position after an investigation revealed misreported data on class sizes, faculty qualifications, and spending. This scandal prompted U.S. News to revise its methodology, placing greater emphasis on third-party verified data from the U.S. Department of Education’s Integrated Postsecondary Education Data System (IPEDS). The 2023 methodology reduced the weight of peer assessment from 20% to 15% and introduced outcome-based metrics such as “first-generation student graduation rate” (5%).
The Four Major Systems Compared: Weighting and Transparency
A direct comparison of the four major rankings reveals divergent philosophical priorities. QS allocates 50% of total weight to subjective reputation (academic + employer), while ARWU assigns 0% to any survey-based metric. THE sits between these poles, with 30% for teaching and research reputation surveys, and 30% for citation impact. U.S. News dedicates 25% to global research reputation and 10% to regional reputation, leaving 65% for bibliometric and institutional data.
Transparency varies significantly. ARWU publishes its full methodology and raw data for the top 100 universities annually, including the exact number of Nobel laureates and highly cited researchers per institution. THE provides a detailed methodology document with indicator weights and normalization formulas. QS releases indicator-level scores for all ranked institutions but does not disclose the raw survey response data. U.S. News publishes the most granular data, including per-indicator scores and percentile ranks for each institution.
The 2023 QS methodology revision introduced a controversial “employment outcomes” indicator (5%), measured by graduate employment rates and alumni career outcomes—data that QS collects via surveys and LinkedIn partnerships. Critics argue this indicator disadvantages institutions in countries with weak graduate tracking systems. THE’s 2023 industry income indicator (2.5%) uses a five-year average of research income from industry per academic staff, normalized for purchasing power parity.
Critiques and Methodological Limitations
Despite their sophistication, all four rankings face persistent methodological criticisms. The first is language bias: English-language journals dominate citation databases, disadvantaging universities in non-English-speaking countries. A 2020 study in Scientometrics found that institutions in China, Japan, and South Korea underperform in citation-based indicators by 15–25% compared to English-language peers with equivalent research quality.
The second issue is size bias: ARWU and U.S. News favor large, comprehensive universities over small, specialized institutions. The University of Tokyo, with 28,000 students and 6,000 faculty, consistently outperforms the London School of Economics (12,000 students) despite LSE’s higher per-capita research output. THE partially addresses this through per-faculty normalization, but the per-capita components in QS (citations per faculty) and ARWU (per-capita academic performance, 20%) do not fully offset it.
Third, reputation surveys suffer from “brand inertia”—respondents rate institutions based on historical prestige rather than current performance. The 2023 QS academic reputation survey received 130,000 responses, but 68% came from institutions already in the top 500, creating a self-reinforcing cycle. The U.S. Department of Education’s 2022 College Scorecard data showed that 12% of top-100 U.S. News-ranked universities had lower graduation rates than institutions ranked outside the top 200.
Fourth, indicator weighting is inherently arbitrary. Why should citation impact be worth 30% (THE) versus 20% (QS) versus 10% (U.S. News)? There is no empirical basis for any specific weighting scheme. The 2023 THE ranking gave 30% weight to teaching, yet the teaching indicator relies entirely on a reputation survey and student-to-staff ratio—neither of which directly measures teaching quality.
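The practical consequence of arbitrary weights can be shown with a toy calculation. In this Python sketch, two invented institutions are scored on citations and on one "other" composite, and only the citation weight changes, using the 30%, 20%, and 10% figures quoted above. The scores and the two-indicator simplification are assumptions for illustration; no real ranking reduces to two indicators.

```python
def two_part_score(citation_score, other_score, citation_weight):
    """Blend a citation score with everything else; weight is a fraction in [0, 1]."""
    if not 0.0 <= citation_weight <= 1.0:
        raise ValueError("citation_weight must be between 0 and 1")
    return citation_weight * citation_score + (1.0 - citation_weight) * other_score


# Hypothetical institutions: A is citation-heavy, B is stronger elsewhere.
institutions = {
    "A": {"citations": 95.0, "other": 60.0},
    "B": {"citations": 60.0, "other": 70.0},
}


def leader(citation_weight):
    """Name of the institution with the higher blended score."""
    scores = {
        name: two_part_score(s["citations"], s["other"], citation_weight)
        for name, s in institutions.items()
    }
    return max(scores, key=scores.get)


for w in (0.30, 0.20, 0.10):
    print(f"citation weight {w:.0%}: leader is {leader(w)}")
```

With these numbers the lead changes hands between a 30% and a 20% citation weight, although neither institution's underlying data has changed.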
The Future: AI, Open Data, and Personalized Rankings
The next frontier in ranking methodology involves machine learning and personalized weighting. Several startups and academic projects now offer customizable rankings where users can assign their own weights to indicators. The U-Multirank project, funded by the European Commission, allows users to select from 30+ indicators across teaching, research, knowledge transfer, international orientation, and regional engagement.
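A customizable ranking of this kind amounts to re-sorting institutions under user-supplied weights. The Python sketch below assumes indicator scores already on a common 0–100 scale; the dimension names follow the five U-Multirank areas mentioned above, but the institutions, scores, and the `personalized_ranking` function are hypothetical and are not U-Multirank's actual interface or method.

```python
def personalized_ranking(scores, user_weights):
    """Rank institutions by a weighted average using the user's own weights.

    scores: mapping of institution -> {indicator: score on a 0-100 scale}.
    user_weights: mapping of indicator -> non-negative importance (any scale).
    """
    total = sum(user_weights.values())
    if total <= 0 or any(w < 0 for w in user_weights.values()):
        raise ValueError("weights must be non-negative and not all zero")
    composite = {
        name: sum(w * indicators[k] for k, w in user_weights.items()) / total
        for name, indicators in scores.items()
    }
    # Highest composite first; ties broken alphabetically for stable output.
    return sorted(composite.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data for three institutions.
scores = {
    "Univ X": {"teaching": 85, "research": 60, "knowledge_transfer": 50,
               "international_orientation": 70, "regional_engagement": 90},
    "Univ Y": {"teaching": 60, "research": 95, "knowledge_transfer": 80,
               "international_orientation": 75, "regional_engagement": 40},
    "Univ Z": {"teaching": 75, "research": 75, "knowledge_transfer": 65,
               "international_orientation": 90, "regional_engagement": 60},
}

teaching_focused = {"teaching": 3, "regional_engagement": 1}
research_focused = {"research": 3, "knowledge_transfer": 1}

print(personalized_ranking(scores, teaching_focused))
print(personalized_ranking(scores, research_focused))
```

The two weight profiles produce opposite orderings of the same three institutions, which is the point of a personalized ranking: the "best" university depends on what the user says matters.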
Artificial intelligence is being explored for reputation analysis. THE’s 2023 pilot project used natural language processing to analyze 2 million academic papers and 500,000 syllabi to generate “teaching quality” scores, bypassing traditional surveys. QS has experimented with sentiment analysis of employer reviews on job platforms.
Government-mandated transparency is also reshaping rankings. China’s Ministry of Education now requires all universities to publish standardized data on 20 metrics including faculty qualifications, research funding, and graduate employment. The European Union’s European Tertiary Education Register provides comparable data for 2,500+ institutions across 36 countries.
The open data movement may eventually render traditional rankings obsolete. The OECD’s Education GPS platform already provides customizable comparisons across 50+ indicators for 46 countries. The World Bank’s EdStats database offers 4,000+ education indicators globally. As raw data becomes more accessible, the value of any single ranking’s weighted composite diminishes.
FAQ
Q1: Which university ranking methodology is most objective?
No ranking is fully objective, but ARWU (Academic Ranking of World Universities) is the most transparent and least subjective, as it relies entirely on bibliometric and award-based indicators with zero reputation surveys. However, its heavy weighting toward Nobel Prizes and Fields Medals (40% across alumni and staff awards) and Nature/Science publications (20%) biases results toward older, English-language STEM institutions. For a balanced view, many analysts recommend cross-referencing QS (strong on employment outcomes) with THE (strong on teaching environment) and ARWU (strong on research output).
Q2: How often do ranking methodologies change?
Major methodological revisions occur approximately every 3–5 years for each system. QS’s most significant change in 20 years occurred in 2023, adding sustainability and employment outcomes. THE revises its methodology approximately every 4 years, and in 2019 it added SDG indicators through the launch of its Impact Rankings. U.S. News changed its methodology in 2023 following the Columbia data scandal, reducing peer assessment weight from 20% to 15%. ARWU has remained relatively stable since 2014, with minor annual adjustments to citation thresholds.
Q3: Do rankings influence university behavior?
Yes, and the effect is measurable. A 2022 study in Research Policy found that universities in the top 200 of the THE ranking increased their international faculty hiring by 18% and their industry partnership income by 22% over a five-year period, compared to institutions outside the top 200. The Chinese government’s Double First-Class University Plan explicitly uses ARWU metrics to allocate funding, with 42 universities receiving $25 billion in additional funding between 2017 and 2022. Rankings create powerful incentives that can both improve and distort institutional priorities.
References
- OECD. (2023). Education at a Glance 2023: OECD Indicators. Paris: OECD Publishing.
- Times Higher Education. (2023). World University Rankings Methodology 2023. London: THE.
- QS Quacquarelli Symonds. (2023). QS World University Rankings: Methodology 2024. London: QS.
- Shanghai Ranking Consultancy. (2023). Academic Ranking of World Universities Methodology 2023. Shanghai.
- U.S. News & World Report. (2023). Best Global Universities Rankings Methodology 2023–2024. Washington, DC.