“The indicators should be revisited along the lines of the recommendations and suggestions already provided by the Berlin Principles. The Berlin Principles place emphasis on league tables for universities that recognize the diversity of institutions and provide clear information about the indicators and target groups. The THES ranking fails to comply with the Berlin Principles as, for instance, there is clearly a lack of information surrounding the construction of the two expert-driven indicators.”
The key conclusions of this study are as follows:
The Academic Ranking of World Universities, carried out annually by Shanghai’s Jiao Tong University (commonly known as the ‘Shanghai ranking’), has become, beyond the intention of its developers, a reference for scholars and policy makers in the field of higher education. For example, Aghion and co-workers at the Bruegel think tank use the index – together with other data collected by Bruegel researchers – to analyse how to reform Europe’s universities, while French President Sarkozy has stressed the need for French universities to consolidate in order to improve their ranking under Jiao Tong. Given the political importance of this field, the preparation of a new university ranking system is being considered by the French ministry of education.
The question addressed in the present analysis is whether the Jiao Tong ranking serves the purposes it is used for, and whether its immediate European alternative, the British THES, can do better. Robustness analysis of the Jiao Tong and THES rankings carried out by JRC researchers, and of an ad hoc Jiao Tong-THES hybrid, shows that both measures fail when it comes to assessing Europe’s universities.
Jiao Tong is robust only in identifying the top performers, on either side of the Atlantic, and is quite unreliable in ordering all other institutions. Furthermore, Jiao Tong focuses only on the research performance of universities, and hence rests on the strong assumption that research is a universal proxy for education. THES is a step in the right direction in that it includes some measure of education quality, but it is otherwise fragile in its ranking, undeniably biased towards British institutions and somewhat inconsistent in the relation between subjective variables (from surveys) and objective data (e.g. citations).
The JRC analysis is based on 88 universities for which both the THES and Jiao Tong ranks were available. The European universities covered by the present study thus constitute only about 0.5% of the population of Europe’s universities. Yet the fact that we are unable to rank reliably even the best European universities (apart from the 5 at the top) is a strong call for a better system, the need for which is made acute by today’s policy focus on the reform of higher education. For most European students, teachers or researchers, not even the Shanghai ranking – taken at face value and leaving aside the reservations raised in the present study – would tell which university is best in their own country. This is a problem for Europe, which is committed to making its education more comparable, its students more mobile and its researchers part of a European Research Area.
Various attempts in EU countries to assess higher education performance are briefly reviewed in the present study, which also offers elements of analysis on which measurement problems could be addressed at the EU scale.
This report puts forward three findings and three recommendations. The three findings are:
1] While indicators and league tables are enough to start a discussion on higher education issues in Europe and to benchmark it worldwide, they are not sufficient to conclude it. As already widely discussed in the literature, the choice of indicators reflects the league table compilers’ opinion and the availability of internationally comparable data more than a consensus within the academic community. Both rankings rely heavily on bibliometric indicators and thus tend to be biased towards English-speaking institutions and institutions with a strong focus on the hard sciences, leaving aside the social sciences and humanities.
2] The THES and SJTU rankings should not be used to discuss the determinants of university performance or to deliver policy messages on educational issues. Indeed, for the majority of the universities we analysed, the THES or SJTU rank has proven impossible to determine with adequate statistical robustness. The assigned university rank largely depends on the methodological assumptions made in compiling the two rankings. For instance, we cannot conclude that Paris VI University performs significantly better than McGill University, even though the difference in positions suggests a disparity in quality or performance. This implies that no conclusive inference regarding the relative performance of the majority of the universities can be drawn from either ranking.
3] The average US university is not necessarily superior to the average European university, contrary to what most current perceptions might suggest. This finding comes from an analysis of the 27 European universities and 48 US universities that are ranked in the Top 100 of the SJTU and the Top 200 of the THES. The average US university performs better than the average European university in the number of articles in the Science Citation Index and Social Science Citation Index, in the number of highly cited researchers (SJTU indicators) and in the citations per faculty (THES indicator). Yet the average European university performs better than the average US university in the proportion of international staff and the proportion of international students. For the remaining seven indicators analysed (in particular the two indicators related to the number of alumni or staff winning Nobel Prizes and Fields Medals), the performance of the average European university is comparable to that of the average US university. Regarding homogeneity, the European universities analysed perform more homogeneously than their US counterparts.
We recommend that university ranking systems be improved as follows:
1] First, the indicators should be revisited along the lines of the recommendations and suggestions already provided by the Berlin Principles. The Berlin Principles place emphasis on league tables for universities that recognize the diversity of institutions and provide clear information about the indicators and target groups. The principles also provide recommendations on how data should be gathered and processed in a transparent way, and on how final rankings should be presented. The THES ranking fails to comply with the Berlin Principles as, for instance, there is clearly a lack of information surrounding the construction of the two expert-driven indicators.
2] Second, the compilation of university rankings should always be accompanied by a robustness analysis. We believe that this could constitute an additional recommendation to be added to the 16 existing Berlin Principles. The multi-modeling approach adopted in this report allowed us to show that the rank of most of the 88 institutions is highly dependent on the methodology chosen for the compilation of both rankings. In our study we selected numerous scenarios that represent distinct, diverse and at times contradictory approaches to aggregating information on university performance. The multi-modeling approach has already proven useful in the development and validation of several composite indicators (e.g., Environmental Performance Index, Composite Learning Index, Alcohol Policy Index, Knowledge Economy Index) and was also included in the JRC/OECD Handbook on Composite Indicators. A comparative advantage of the multi-modeling approach is that it can offer a representative picture of the classification of university performance. While university rankings cannot inform us about the real position of most universities, given the statistical uncertainty associated with the ranks, a multi-modeling approach like the one implemented in this report allows institutions to be ranked within a range bracket. The upshot is that reporting a rank range is probably better than assigning a specific rank that is not representative of the real performance of the university.
3] Third, the assessment of the universities’ performance based on the hybrid set of the twelve indicators used in the THES and SJTU rankings provides a more reliable average rank of the institutions. The two sensitivity measures we used showed that the impact of the methodological assumptions is much lower when using the set of twelve indicators than when using either the THES or the SJTU indicators alone. Given the diversity of the indicators, as confirmed by correlation analysis, and the fact that the number of statistical dimensions in the combined THES&SJTU framework is twice the number of statistical dimensions for either the THES or the SJTU (a result of factor analysis), more diverse aspects of universities are captured if all twelve indicators are considered. The association between the THES indicators on one side and the SJTU indicators on the other is positive and significant, yet only moderate (r = 0.58). This result shows the relatively low degree of overlap of information between the two sets and suggests that merging the twelve indicators may provide a more holistic picture of the universities’ performance.
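The idea behind the second recommendation, ranking each institution under many alternative methodological scenarios and reporting a rank bracket rather than a single position, can be illustrated with a minimal sketch. It is not the procedure used in the report: the scenario design here (random indicator weights, alternating linear and geometric aggregation) and the names `rank_scores` and `rank_ranges` are assumptions chosen only for illustration, and the indicator values are expected to be already normalised to the interval 0 to 1.

```python
import math
import random


def rank_scores(scores):
    """Map each name to its rank (1 = best) given a dict of composite scores."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def rank_ranges(indicators, n_scenarios=500, seed=0):
    """Return {name: (best_rank, worst_rank)} across simulated scenarios.

    indicators: dict mapping a university name to a list of normalised
    indicator values in [0, 1]. The scenarios below are hypothetical and
    are NOT the scenarios used in the report.
    """
    rng = random.Random(seed)
    n_indicators = len(next(iter(indicators.values())))
    observed = {name: [] for name in indicators}

    for scenario in range(n_scenarios):
        # Assumption: weights are drawn at random and rescaled to sum to 1.
        raw = [rng.random() for _ in range(n_indicators)]
        total = sum(raw)
        weights = [w / total for w in raw]

        if scenario % 2 == 0:
            # Linear (weighted arithmetic mean) aggregation.
            scores = {
                name: sum(w * v for w, v in zip(weights, values))
                for name, values in indicators.items()
            }
        else:
            # Geometric aggregation; values are floored to avoid a zero product.
            scores = {
                name: math.prod(max(v, 1e-9) ** w for w, v in zip(weights, values))
                for name, values in indicators.items()
            }

        for name, rank in rank_scores(scores).items():
            observed[name].append(rank)

    return {name: (min(ranks), max(ranks)) for name, ranks in observed.items()}
```

Called on a small table of normalised indicators, `rank_ranges` returns a best and worst rank per institution. An institution that leads on every indicator keeps the same rank in every scenario, while institutions with mixed profiles receive a wider bracket, which is the kind of uncertainty a single published rank hides.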
Even if all three previous recommendations are taken into consideration, one further issue remains: the high volatility of the ranks of more than half of the universities we analysed. We remind the reader that these universities are considered the “elite” of the thousands of universities world-wide. If the ranks of those universities are full of uncertainty, the ranks of the universities further down the classification ladder are likely to be even more uncertain. This high volatility calls for a revision of the set of indicators, either by enriching it with other dimensions that are crucial to assessing university performance or by revising some of the existing indicators in order to remove some of the bias present (e.g., eliminating bias in favour of old and/or big universities and/or hard sciences). This raises a legitimate question: when will the revision of the set of indicators reach a satisfactory level? Uncertainty and sensitivity analysis should be employed as a guide to determine when that point has been reached. We would argue that the stopping criterion for the revision is met when, upon acknowledging the methodological uncertainties that are intrinsic to the development of a ranking system, the space of inference of the ranks for the majority of the universities is narrow enough to justify a meaningful classification.
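One possible reading of this stopping criterion can be written down as a simple check. The sketch below is only an illustration: the report does not specify numerical thresholds, so the function name `revision_can_stop` and the parameters `max_width` (how wide a rank bracket may be and still count as "narrow enough") and `min_share` (what counts as "the majority") are assumptions.

```python
def revision_can_stop(rank_ranges, max_width, min_share=0.5):
    """Check a hypothetical stopping rule for the indicator revision.

    rank_ranges: dict mapping a university name to (best_rank, worst_rank),
    as obtained from an uncertainty analysis.
    max_width:   largest bracket width (worst - best) still considered narrow.
    min_share:   share of universities that must be exceeded (0.5 = majority).
    Both thresholds are illustrative assumptions, not values from the report.
    """
    narrow = sum(
        1 for best, worst in rank_ranges.values() if worst - best <= max_width
    )
    return narrow / len(rank_ranges) > min_share
```

With brackets such as `{"A": (1, 1), "B": (2, 3), "C": (2, 9), "D": (4, 12)}`, the rule is not met for a tolerated width of 2 positions, because only half of the institutions have narrow brackets, but it is met for a tolerated width of 7.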