Funding, evaluation, and the performance of national research systems

Ulf Sandström, Peter van den Besselaar Understanding the quality of science systems requires international comparative studies, which are difficult because comparable data are lacking, especially on research inputs. In this study, we deploy an approach based on reasonable comparative data that focuses on change in inputs and outputs instead of on their levels, as this approach to a large extent eliminates the problem of measurement differences between countries. Relating input data to output data (top publications in Web of Science), we first show which national science systems are more efficient (where the increase in performance is stronger than expected given the change in funding) and which are less efficient. We then discuss our findings using popular explanations of performance differences: differences in the level of competition, in the level of university autonomy, and in the level of academic freedom. Interestingly, the available data do not support the common explanations. Well-functioning systems are characterized by a well-developed ex post evaluation system combined with considerable institutional funding and low university autonomy (meaning a high autonomy of professionals). On the other hand, the less efficient systems have strong ex ante control, either through a high level of so-called competitive project funding or through strong power of the university management.
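
The argument for comparing change rather than levels can be illustrated with a small sketch. It assumes that a country-specific measurement difference acts as a constant multiplicative factor on reported funding, and it uses a simplified indicator (output growth minus funding growth) that is not taken from the study. All figures and function names are hypothetical.

```python
# Hypothetical figures for illustration only; not data from the study.

def relative_change(start, end):
    """Relative growth between two periods, e.g. 0.25 for +25%."""
    return (end - start) / start


def efficiency_gap(funding_start, funding_end, output_start, output_end):
    """Output growth minus funding growth (an assumed, simplified indicator)."""
    return (relative_change(output_start, output_end)
            - relative_change(funding_start, funding_end))


# A country reporting funding under its own definition of research spending.
gap = efficiency_gap(funding_start=100.0, funding_end=120.0,
                     output_start=400.0, output_end=520.0)

# The same country, with funding measured under a definition that counts
# 30% more in both periods (a constant multiplicative difference).
gap_rescaled = efficiency_gap(funding_start=130.0, funding_end=156.0,
                              output_start=400.0, output_end=520.0)

print(round(gap, 3), round(gap_rescaled, 3))
```

Under this assumption the level of reported funding differs between the two definitions, but the growth rate, and therefore the gap, is the same. A measurement difference that changes over time would not cancel out in this way.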

Counterintuitive effects of incentives?

Peter van den Besselaar, Ulf Sandström A recent paper in this journal compares the Norwegian model of using publication counts for university funding with a similar intervention in Australia in the mid-1990s. The authors argue that the Norwegian model (which takes the quality of publications into account) performs better than the Australian one (which ignored paper quality beyond the requirement of peer review). We argue that these conclusions contradict the evidence provided in the article, and therefore should be considered incorrect.

Influence of cognitive distance on grant decisions

Ulf Sandström, Peter van den Besselaar The selection of grant applications is generally based on peer and panel review, but as shown in many studies, the outcome of this process depends not only on scientific merit or excellence, but also on social factors and on the way the decision-making process is organized. A major criticism of the peer review process is that it is inherently conservative, with panel members inclined to select applications that are in line with their own theoretical perspective. In this paper we define 'cognitive distance' and operationalize it. We apply the concept and investigate whether it influences the probability of getting funded.

Vicious circles of gender bias, lower positions, and lower performance: Gender differences in scholarly productivity and impact

Ulf Sandström, Peter van den Besselaar It is often argued that female researchers publish on average less than male researchers do, but that male and female authored papers have an equal impact. In this paper we try to better understand this phenomenon by (i) comparing the share of male and female researchers within different productivity classes, and (ii) comparing productivity while controlling for a series of relevant covariates. The study is based on a disambiguated Swedish author dataset, consisting of 47,000 researchers and their WoS publications during the period 2008-2011, with citations until 2015. As the analysis shows, quantity does make a difference for impact, for male and female researchers alike, but women are vastly underrepresented in the group of most productive researchers. We discuss and test several possible explanations of this finding, using data on personal characteristics from several Swedish universities. Gender differences in age, authorship position, and academic rank explain a considerable part of the productivity differences.

Mellan politik och forskning: Byggforskningsrådet 1960-1992 [Between politics and research: the Swedish Council for Building Research 1960-1992]

Ulf Sandström This book describes the Swedish Council for Building Research (Byggforskningsrådet, BFR) as an organization and research funding body in relation to both research policy and general policy issues during the period 1960-1992. Six years after the book was published, BFR was dissolved when the state reorganized the Swedish research organization and ended the traditional sectoral research policy. In the present edition, certain sections judged to be of lesser interest have been removed and minor language adjustments have been made.

Perverse effects of output-based research funding?

Peter van den Besselaar, Ulf Sandström, Ulf Heyman More than ten years ago, Linda Butler (2003a) published a well-cited article claiming that Australian science policy in the early 1990s made a mistake by introducing output-based funding. According to Butler, the policy stimulated researchers to publish more papers but of lower quality, resulting in a lower total impact of Australian research compared to other countries. We redo and extend the analysis using longer time series, and show that Butler’s main conclusions are not correct. We conclude in this paper (i) that the currently available data reject Butler’s claim that “journal publication productivity has increased significantly… but its impact has declined”, and (ii) that such evidence is hard to find even with a reconstruction of her data. On the contrary, after implementing evaluation systems and performance-based funding, Australia not only improved its share of research output but also increased research quality, implying that total impact was greatly increased. Our findings show that if output-based research funding has an effect on research quality, it is positive and not negative. This finding has implications for the discussions about research evaluation and about assumed perverse effects of incentives, as the Australian case plays a major role in those debates.

Do observations have any role in science policy studies? A reply

Peter van den Besselaar, Ulf Heyman, Ulf Sandström In Van den Besselaar et al. (2017) we tested the claim of Linda Butler (2003) that funding systems based on output counts have a negative effect on impact as well as quality. Using new data and improved indicators, we indeed reject Butler's claim. The impact of Australian research improved after the introduction of such a system, and did not decline as Butler states. In their comments on our findings, Linda Butler, Jochen Gläser, Kaare Aagaard & Jesper Schneider, Ben Martin, and Diana Hicks put forward many arguments, but do not dispute our basic finding: the citation impact of Australian research went up immediately after the output-based performance system was introduced. It is important to test Butler's findings about Australia, as these findings are part of the accepted knowledge in the field, heavily cited, and often used in policy reports, but hardly confirmed in other studies. We found that Butler's conclusions are wrong, and that many of the policy implications based on them are simply unfounded. In our study, we used better indicators and a causality concept similar to that of our opponents. Our findings are also independent of the exact timing of the policy intervention. Furthermore, our commenters have not addressed our main conclusions at all, and some even claim that observations do not really matter in the social sciences. We find this position problematic: why would the taxpayer fund science policy studies if they are merely about opinions? Let’s take science seriously, including our own field.

Quantity and/or Quality? The Importance of Publishing Many Papers

Ulf Sandström, Peter van den Besselaar Do highly productive researchers have a significantly higher probability of producing top cited papers? Or do highly productive researchers mainly produce a sea of irrelevant papers; in other words, do we find a diminishing marginal result from productivity? The answer to these questions is important, as it may help to answer the question of whether the increased competition, the increased use of indicators for research evaluation, and the focus on accountability have perverse effects or not. We use a Swedish author-disambiguated dataset consisting of 48,000 researchers and their WoS publications during the period 2008-2011, with citations until 2014, to investigate the relation between productivity and the production of highly cited papers. As the analysis shows, quantity does make a difference.

Towards Field Adjusted Production: estimating research productivity from a zero-truncated distribution

Timo Koski, Erik Sandström, Ulf Sandström Measures of research productivity (e.g. peer reviewed papers per researcher) are a fundamental part of bibliometric studies, but are often restricted by the properties of the data available. This paper addresses that fundamental issue and presents a detailed method for estimating productivity (peer reviewed papers per researcher) based on data available in bibliographic databases (e.g. Web of Science and Scopus). The method can, for example, be used to estimate average productivity in different fields, and such field reference values can be used to produce field adjusted production values. Being able to produce such field adjusted production values could dramatically increase the relevance of bibliometric rankings and other bibliometric performance indicators. The results indicate that the estimations are reasonably stable given a sufficiently large data set. Keywords: Waring distribution, Productivity, Citation Analysis, Ranking, Research Policy
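
The general idea of estimating productivity from zero-truncated data can be sketched as follows. Bibliographic databases only list researchers with at least one paper, so the observed mean overstates productivity per researcher. The keywords indicate that the paper works with a Waring distribution; the sketch below swaps in a zero-truncated Poisson model purely because it is easy to show, so it is not the authors' method. The paper counts and function names are hypothetical.

```python
import math


def truncated_mean(lam):
    """Mean of a zero-truncated Poisson with rate lam: lam / (1 - exp(-lam))."""
    return lam / -math.expm1(-lam)


def estimate_rate(observed_mean, tol=1e-10):
    """Solve truncated_mean(lam) == observed_mean by bisection.

    observed_mean is the mean paper count among authors with at least one
    paper, so it must be greater than 1.
    """
    if observed_mean <= 1.0:
        raise ValueError("observed mean must exceed 1 for a zero-truncated sample")
    # truncated_mean(lam) is increasing and always larger than lam,
    # so the root lies between 0 and observed_mean.
    lo, hi = 1e-9, observed_mean
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if truncated_mean(mid) < observed_mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical paper counts for authors who appear in the database.
counts = [1, 1, 1, 2, 1, 3, 1, 2, 1, 4]
observed = sum(counts) / len(counts)

# Estimated papers per researcher, including those with zero papers
# (Poisson assumption, used here only as a stand-in for the Waring model).
rate = estimate_rate(observed)

# Estimated share of researchers who are invisible in the database.
share_unobserved = math.exp(-rate)

print(observed, round(rate, 3), round(share_unobserved, 3))
```

The estimated rate is lower than the observed mean because it accounts for researchers with no papers in the period. A field reference value obtained in this way could then be used as the denominator when computing field adjusted production.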

What is the Required Level of Data Cleaning? A Research Evaluation Case

Peter van den Besselaar, Ulf Sandström Bibliometric methods depend heavily on the quality of data, and cleaning and disambiguating data are very time-consuming. Therefore, considerable effort is devoted to the development of better and faster tools for disambiguating data (e.g., Gurney et al. 2012). Parallel to this, one may ask to what extent data cleaning is needed, given the intended use of the data. To what extent is there a trade-off between the type of questions asked and the level of cleaning and disambiguating required? When evaluating individuals, a very high level of data cleaning is required, but for other types of research questions, one may accept certain levels of error, as long as these errors do not correlate with the variables under study. In this paper, we first present an earlier case study that handled the data in a rather crude way, as it was expected that the unavoidable error would even out. We then carry out a sophisticated cleaning and disambiguation of the same dataset, and repeat the same analysis as before. We compare the results and discuss conclusions about the required level of data cleaning. Keywords: Coupling data sets, Data cleaning, Disambiguation, Data error.