Bibliometrically disciplined peer review: on using indicators in research evaluation

Peter van den Besselaar, Ulf Sandström
Evaluation of research uses peer review and bibliometrics, and the debate about their balance in research evaluation continues. Both approaches have supporters, and both are criticized. In this paper, we describe a case in which the use of bibliometrics in a panel-based evaluation of a mid-sized university was systematically tried out. The case suggests a useful way in which bibliometric indicators can be used to inform and improve peer review and panel-based evaluation. We call this ‘disciplined peer review’, where ‘disciplined’ is used in a constructive sense: bibliometrically disciplined peer review is more likely to avoid the subjectivity that often influences the outcomes of peer and panel review-based evaluation.

Recognition through performance and reputation

Peter van den Besselaar, Ulf Sandström, Charlie Mom
As the various disciplines have different forms of social and intellectual organization (Whitley 2000), scholars in various fields may depend less on their peers, and more on other audiences, for recognition and funding. Following Merton (1973), we distinguish between performance and reputation as ways of building up recognition. We show that there are indeed differences between the disciplines: in the life sciences and social sciences, the reputation-related indicators are dominant in predicting the score that grant applicants get from the panel, whereas in the natural sciences, the performance-related indicators dominate the panel scores. Furthermore, when comparing, within the life sciences, the grantees with the best-performing non-grantees, we show that the former score higher on the reputation indicators and the latter score better on the performance variables. This supports the finding that in the life sciences one probably gains recognition through reputation more than through individual performance. We suggest that this may not be optimal for the growth of knowledge.

The P-model: An Indicator that Accounts for Field Adjusted Production as well as Field Normalized Citation Impact

Erik Sandström, Ulf Sandström, Peter van den Besselaar
Any scientific study or evaluation of research quality and impact runs into two problems if more than one topic area is involved: (1) How to account for differences in (paper) production? (2) How to account for differences in citation impact, i.e. influence over subsequent literature? This paper aims to show that these questions can be answered with the help of two methods: the Field Adjusted Production (FAP) indicator and a percentile indicator that is designed to include the FAP. Used in combination, they express both paper production and impact in a single figure. The resulting score can be used for ranking universities, departments, and individuals. The paper first explains the background of the method, and then how to calculate the indicators belonging to the P-Model. It then gives some examples and discusses methods for validating the proposed indicator.


Peter van den Besselaar, Ulf Sandström
It is often argued that the presence of stakeholders in review panels may improve the selection of societally relevant research projects. In this paper, we investigate whether the composition of panels indeed matters. More precisely, when stakeholders are in the panel, does that result in a more positive evaluation of proposals of relevance to that stakeholder? We investigate this for the gender issues domain, and show that this is the case. When stakeholders are present, the relevant projects obtain a more positive evaluation and consequently a higher score. If these findings can be generalised, they are an important insight for the creation of pathways to, and conditions for, impact.

Measuring researcher independence using bibliometric data: A proposal for a new performance indicator

Peter van den Besselaar, Ulf Sandström
Bibliometric indicators are increasingly used to evaluate individual scientists, as is exemplified by the popularity of the many publication- and citation-based indicators used in evaluation. These indicators, however, cover at best some of the quality dimensions relevant for assessing a researcher: productivity and impact. Research quality has more dimensions than these two alone. As current bibliometric indicators do not cover various important quality dimensions, we here contribute to developing better indicators for those dimensions not yet addressed. One of the quality dimensions lacking valid indicators is an individual researcher’s independence. We propose indicators to measure different aspects of independence: two assessing whether a researcher has developed their own collaboration network, and two others assessing the level of thematic independence. Taken together, they form an independence indicator. We illustrate how these indicators distinguish between researchers who are equally productive and have a considerable impact. The independence indicator is a step forward in evaluating individual scholarly quality.

Studying grant decision-making: a linguistic analysis of review reports

Peter van den Besselaar, Ulf Sandström
Peer and panel review are the dominant forms of grant decision-making, despite their serious weaknesses as shown by many studies. This paper contributes to the understanding of the grant selection process through a linguistic analysis of the review reports. In this way we reconstruct several aspects of the evaluation and selection process: which dimensions of the proposal are discussed during the process and how, and what distinguishes the successful from the unsuccessful applications? We combine the linguistic findings with interviews with panel members and with bibliometric performance scores of applicants. The former give the context, and the latter help to interpret the linguistic findings. The analysis shows that the performance of the applicant and the content of the proposed study are assessed with the same categories, suggesting that the panelists do not actually distinguish between past performance and promising new research ideas. The analysis also suggests that the panels focus on rejecting applications by searching for weak points, and not on finding the high-risk/high-gain groundbreaking ideas that may be in the proposal. This may easily result in sub-optimal selections, in low predictive validity, and in bias.
Keywords: Peer review; Panel review; Research grants; Decision-making; Linguistics; LIWC; European Research Council (ERC)

Good research is evenly distributed across higher education institutions

Ulf Sandström
Many take for granted that research with weak impact is concentrated in certain small-scale universities and university colleges. This warrants closer examination, because the claim that Swedish research would improve substantially if activity were moved from the regional university colleges to the universities would then need to be supported by facts. If the large universities have pulled ahead and achieve better results than an earlier study showed (Sandström 2015), this should be explainable by research resources having been channelled to these institutions. But what does the picture actually look like? Have the large institutions pulled ahead, and have the small ones lost ground to the same degree?

Bibliometric report: The Swedish Environmental Protection Agency's wildlife research 2003–2014

Ulf Sandström
This bibliometric evaluation of wildlife research, funded by the Wildlife Management Fund through the Swedish Environmental Protection Agency (SEPA) during 2003–2014, highlights how the international publications of the funded research leaders and co-applicants developed from 2006 to 2014. The following questions have guided the evaluation: 1) Has the SEPA programme for wildlife research paid off in relation to the input of resources? 2) Have SEPA and its Wildlife Research Committee chosen the best available researchers for the projects? 3) Does SEPA’s funded wildlife research represent a reasonable project portfolio in an international perspective? 4) Does SEPA distribute research funds equally by gender? Nearly 95% of all resources have gone to sub-programmes devoted to large carnivores, general biology and social science/humanities. The areas that received most of the resources can therefore sustain dedicated researchers, most of whose publications have focused on the game programme; the other, smaller areas more or less fall outside. Within the aforementioned areas, game research has yielded good results. The bibliometric evaluation suggests that SEPA gets a good return on its resources in terms of number of articles and expected citation response from the larger research community. In particular, the programme for large carnivores has proved to be an investment with good productivity and substantial recognition from the international research community. During the programme period, citation strength increases significantly: the share of researchers with strong achievements, i.e. those included in the top 20% of Swedish researchers, rises from 40% to 60%.

A comparative analysis of the publication behaviour of MSCA fellows

Koen Jonkers, Ulf Sandström, Peter van den Besselaar
The Marie Skłodowska-Curie Action (MSCA) fellowship scheme aims, as a part of the European Framework Programmes, to promote scientific excellence, mobility and research collaboration in the European Research Area. Like most elements of the EU Framework Programmes, it also aims to widen capacity development throughout the EU in Member States with different levels of scientific development. This report analyses the mobility, publication and international co-publication behaviour of a group of European researchers who have taken part in the MSCA fellowship schemes. It compares researchers who received their PhD from organisations in two groups of countries, before and after being granted the fellowship. The first group of countries (from North-Western Europe: FPIC) receives a relatively large share of its research funding budget from the European Framework Programmes and a relatively low share from the European Structural and Investment Funds. The second group of countries (from Southern and Eastern Europe: ESIFIC) has a lower Framework Programme funding intensity, but a higher funding intensity from the European Structural and Investment Funds. The funding intensity levels associated with these broad programmes are taken as an indication of the level of scientific development; they correlate strongly with the average impact of the publications by researchers in these countries. Also relevant to this analysis is that the first group of countries tends to host more MSCA fellows than it sends, whereas the reverse holds for the second group. The analysis measures performance as the sum of the citation impact of a researcher's publications. Before the grant, one observes a difference between the performance of applicants from Southern and Eastern Europe (ESIFIC) on the one hand and those from North-Western Europe (FPIC) on the other.
Over time the median performance gap disappears: there is convergence in the median performance of researchers from the two country groups. However, due to a larger number of outliers (top performers) in North-Western European countries, a difference in average performance remains. When comparing MSCA applicants with those of other grant schemes, one finds that the MSCA applicants perform well before and after the grant, though, as expected, below the performance of researchers funded by the highly selective ERC junior grant, who tend to be more senior. The MSCA applicants show a marked improvement after the grant compared with before. This is in contrast to a similar national individual fellowship in an EU Member State. Post-grant performance is mainly correlated with pre-grant performance. One does not find a significant correlation with the quality of the research environment (as proxied by the citation impact of the host organisation). This is surprising because the quality of the host environment is an explicit selection criterion. Post-grant international collaboration behaviour is mainly correlated with pre-grant international collaboration: it appears that the well connected remain well connected after being funded. What we did find was that, after the grant, a considerable share of the increase in co-authored high-impact papers is co-published with researchers from North-Western Europe, which suggests that the MSCA mobility experience leads to productive research links. The potential for robust evaluations, either in the form of counterfactual analyses or randomised controlled experiments, should be taken into account at the planning and implementation phase of the Framework Programmes.
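
The observation that the median gap can close while the average gap persists is easy to check on a toy example. The sketch below is only an illustration of that statistical point: the scores and the group names are invented for this purpose and are not taken from the report.

```python
from statistics import mean, median

# Hypothetical citation-impact scores, invented purely for illustration.
# They are NOT data from the MSCA report.
group_with_outlier = [1.0, 2.0, 3.0, 4.0, 40.0]  # one top performer
group_without_outlier = [1.0, 2.0, 3.0, 4.0, 5.0]


def gap(stat, first, second):
    """Difference between two groups under a given summary statistic."""
    return stat(first) - stat(second)


# The median ignores the size of the extreme value, the mean does not.
median_gap = gap(median, group_with_outlier, group_without_outlier)
mean_gap = gap(mean, group_with_outlier, group_without_outlier)

print(f"median gap: {median_gap}, mean gap: {mean_gap}")
```

With these made-up numbers the two groups have the same median, yet the single outlier keeps the means apart, which is the pattern the abstract describes for the two country groups.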

Funding, evaluation, and the performance of national research systems (orig article)

Ulf Sandström, Peter van den Besselaar
Understanding the quality of science systems requires international comparative studies, which are difficult because of the lack of comparable data, especially about inputs in research. In this study, we deploy an approach based on reasonably comparable data that focuses on change instead of on levels of inputs and outputs, as this approach to a large extent eliminates the problem of measurement differences between countries. Relating input data to output data (top publications in Web of Science), we first show which national science systems are more efficient (where the performance increase is stronger than expected from the change in funding) and which are less efficient. We then discuss our findings using popular explanations of performance differences: differences in the level of competition, differences in the level of university autonomy, and differences in the level of academic freedom. Interestingly, the available data do not support the common explanations. Well-functioning systems are characterized by a well-developed ex post evaluation system combined with considerable institutional funding and low university autonomy (meaning a high autonomy of professionals). On the other hand, the less efficient systems have strong ex ante control, either through a high level of so-called competitive project funding, or through strong power of the university management.