
A measure for the impact of research

Alejandro M. Aragón
School of Architecture, Civil and Environmental Engineering (ENAC), École polytechnique fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland.

Subject areas: Statistics, Computational Science, Applied Mathematics, Scientific Data

Received 11 March 2013; Accepted 27 March 2013; Published 11 April 2013

Correspondence and requests for materials should be addressed to A.M.A. (alejandro.aragon@epfl.ch; alejandro.aragon@fulbrightmail.org)

The last few years have seen the proliferation of measures that quantify the scientific output of researchers. Yet, most of these measures focus on productivity, thus fostering the "publish or perish" paradigm. This article proposes a measure that aims at quantifying the impact of research while de-emphasizing productivity, thus providing scientists with an alternative, and conceivably fairer, evaluation of their work. The measure builds from a published manuscript, the literature's most basic building block. The impact of an article is defined as the number of lead authors that have been influenced by it. Thus, the measure aims at quantifying the manuscript's reach, putting the emphasis on scientists rather than on raw citations. The measure is then extrapolated to researchers and institutions.

The exponentially increasing number of publications [1] makes it increasingly hard for researchers to keep up with the literature. The problem of paper inflation presents newcomers with the even more challenging exercise of finding those works that have made significant contributions, knowledge that researchers accumulate over years of experience. Part of the problem, noted almost half a century ago [2], stems from a structure of science that favors productivity. The current system puts pressure on researchers to publish scientific articles, for this is their only way to justify the funding they receive. The publication of results, states Margolis [2], "is probably the main means of accomplishing the almost impossible task of accounting for time and money spent on research. Inevitably, this puts a premium on quantity at the expense of quality, and, as with any other type of inflation, the problem worsens: the more papers are written, the less they account for and the greater is the pressure to publish more." Thus, the publish-or-perish paradigm that besets scientists has, for the most part, inevitable implications for the quality of the research published. At worst, many authors opt for publishing the same data, or even for the minimum publishable unit (MPU), in order to increase the apparent value of their work [3]. In the latter case, research findings are divided into a series of articles instead of a single meaningful manuscript. This modus operandi has further implications, for the slight contribution to science of an MPU presents journal editors with the very subjective task of discerning whether it even deserves publication. The problem of paper inflation may eventually place a tremendous burden on the entire scientific community simply to maintain the infrastructure for the publication and dissemination of scientific works.

The problem has only been boosted by the appearance of recent quantitative measures that favor this productivity ideology. "A scientist", states Hirsch [4], "has index h if h of his or her N papers have at least h citations each and the other (N − h) papers have ≤ h citations each". The h index has been widely adopted since its introduction in 2005 by major online databases [5], as it improves over former measures for the quantification of a researcher's performance. Its simple computation is perhaps responsible for its wide adoption, and Hirsch suggests that the index could even be used by decision makers for the granting of funding, tenure, promotion, and awards [4,6].
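Hirsch's definition above lends itself to a compact computation: sort the citation counts in descending order and find the largest rank h such that the paper at that rank still has at least h citations. A minimal sketch (the function name is illustrative, not from the paper):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    sorted_counts = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(sorted_counts, start=1):
        if count >= rank:
            h = rank  # the paper at this rank still has >= rank citations
        else:
            break
    return h

# A researcher with papers cited 10, 8, 5, 4, 3 times has h = 4:
# four papers have at least 4 citations each, the fifth has only 3.
print(h_index([10, 8, 5, 4, 3]))  # → 4
```

The simplicity of this loop is precisely the "simplistic computation" the text credits for the index's wide adoption.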
While researchers are concerned about important decisions being made solely on the basis of such measures [7], the application of the index has even been extended to the productivity assessment of entire research groups, institutions, and countries [8]. It has also been shown that the index is able to predict future scientific achievement [9]. On the downside, the measure carries many drawbacks, as is apparent from the many variants that have emerged in the literature. One major criticism of the h index is its inability to compare scientists from different fields, for citation practices vary significantly per discipline. Radicchi et al. [10] have proposed a variant of the h index that takes this variability into account for a fairer comparison across fields, at the expense of a more involved calculation. In a recent publication [11], Bornmann shows that most of the other 37 variants of the h index add no significant information. Another problem with the h index that has been observed is that, contrary to what its proponents may claim, it is not hard to manipulate its value in one's favor [12,13]. Lehmann et al. [14] also show that, while the h index "attempts to strike a balance between productivity and quality", the mean number of citations per paper is a better measure of research quality.

Measures that are based on citations will carry over their intrinsic problems [15,16]. Seglen states that "citations represent a measure of utility rather than of quality", and gives a succinct summary of their main problems [17]. In spite of current efforts that aim at measuring the outcome of scientific research not solely based on publications and citations [18], the latter remain for now the only means to quantify the impact of a scientific article. According to Martin [19], "the impact of a publication describes its actual influence on surrounding research activities at a given time. While this will depend partly on its importance, it may also be affected by such factors as the location of the author, and the prestige, language, and availability, of the publishing journal." In a subsequent publication [20], he states that "citation counts are an indicator more of impact than of quality or importance". Thus, citations in this article are implicitly assumed to measure impact, so as to avoid criticism over their relation to both research importance and quality [21].

This article provides an alternative measure of scientific achievement based on citations, but their negative effects are reduced to a minimum, thus yielding a measure that cannot be easily manipulated. The proposition does not aim at replacing current measures of productivity, but at complementing them in order to provide the research community with an alternative evaluation of its scientific production. To measure the quality of scientific output, Bornmann and Daniel claim [8], "it would therefore be sufficient to use just two indices: one that measures productivity and one that measures impact." With the suggested measure it may be possible for a scientist to have considerable impact even when publishing a single article, contrary to what the h index and similar measures would suggest. As mentioned earlier, the measure is defined first for a scientific manuscript in an attempt to quantify its reach, and hence scientists are at the very core of its definition. Similarly, the impact of a scientist aims at determining the number of researchers that have been influenced by his or her work. The measure is further extrapolated so that the impact of an institution comprises the impact of its body of researchers. The impact measures for manuscripts, scientists, and institutions, whose formal definitions are given in the Methods section, are referred to henceforth as W, i, and I, respectively.

SCIENTIFIC REPORTS | 3 : 1649 | DOI: 10.1038/srep01649 | www.nature.com/scientificreports

Results

For the computation of scientist impact, citation records are taken from the SciVerse Scopus database. Considering that this database lists records since 1996, impact values for scientists who published earlier represent partial estimates. In addition, the number of citing documents is also obtained for some authors from the Web of Science citation index provided by Thomson Reuters.

The strengths of the proposed impact measure are evident when applied to eminent scientists who have not published considerably, but who have nevertheless produced substantial contributions to science. For instance, Kary B. Mullis won the Nobel prize in chemistry in 1993 for his contributions within the field of DNA-chemistry research. From a total of 8063 citing documents obtained from the Scopus database over a period of 18 years, it is found that 6435 of those records (roughly 80%) have a different first author who does not share authorship with Mullis. Consequently, with a modest h index of 15, Mullis has had an impact i = 6435 over that period of time. Yet, as mentioned before, this is a partial estimate and the true impact is expected to be much higher. The Web of Science lists a total of 24750 citing documents for this author, which is approximately 3 times the number of records used to compute i. To put the impact of Mullis into perspective, seven authors taken from those 8063 citing documents were found to have the same h index. Yet, their impact values were {346, 404, 553, 555, 561, 680, 1284}, where the highest value is scarcely 20% of that of Mullis. Another example is the late Richard P. Feynman (h = 37), who won the Nobel prize in physics in 1965 for his work on quantum electrodynamics. With a total of 8123 citing documents obtained from the Scopus database, an impact i = 5175 is assigned to Feynman. In this case only about 64% of the citing documents counted towards his impact. As before, the Web of Science lists a total of 19629 citing documents, so the impact value above is a partial estimate.

Figures 1 and 2 show cumulative impact results, in semi-log plots, for distinguished scholars in physics and mathematics, respectively. Nobel laureates since 2007 are displayed in Figure 1 (except those from 2009, due to lack of data in the database), where impact values were obtained with data up to the year of their corresponding prize as a means to survey only their research impact. The figure shows a clear trend of impact increase over the years, with the sudden increase in the first portion of the curves suggesting the lack of data prior to 1996. Remarkable are the curves of professors Andre Geim and Konstantin Novoselov, whose work on the material graphene has had an enormous impact over the last few years. The impact of Fields medalists does not seem to be largely affected by the receipt of the prize, and thus their curves in Figure 2 are computed from data up to 2012. From this figure the impact of Prof. Terence Tao is also noticeable, showing an order-of-magnitude difference with respect to his contemporaneous mathematicians.

Figure 1 | Cumulative impact values for Nobel laureates in physics since 2007.

A department of the University of Illinois at Urbana-Champaign, the author's alma mater, is now considered for the survey of an institution's impact. More precisely, the Aerospace Engineering Department is examined. With a total of 23 actively working professors, the department's impact is I = 7592. This value takes into account the impact overlap that exists among the institution's scholars, i.e., the fact that they can have an impact on the same researchers. For the department the impact overlap is Ω ≈ 18%. As long as there is a non-zero value of impact overlap between the scientists of an institution, the impact value will be smaller than the sum of their impacts, which in this case amounts to Σ_k i_k = 9439. The department has 6 assistant, 3 associate, and 14 full professors, and thus the same analysis can be made regarding their hierarchical positions within the institution. The impact values of assistant, associate, and full professors are I_a = 398, I_A = 443, and I_F = 7018, respectively. Their corresponding impact overlap values are Ω_a ≈ 1.3%, Ω_A ≈ 0.7%, and Ω_F ≈ 17%. For the formal definitions of the quantities used, the reader is referred to the Methods section.

Discussion

Instead of the total number of citations, which has traditionally been used as a measure of the impact of an article [22], the proposed measure W aims at discerning the genuine number of people the paper has had an impact upon. In other words, W aims at measuring the manuscript's reach. Implicit in the definition of impact is the most basic assumption of citation analysis, that "references cited by an author are a roughly valid indicator of influence on his work", in the words of MacRoberts [16]. Yet, citations are used in a way that attempts to mitigate any deviations from this assumption.

The impact of a manuscript W is defined in a way that excludes self-citations of any kind. Self-citations, states Schreiber [12], "do not reflect the impact of the publication and therefore ideally the self-citations should not be included in any measure which attempts to estimate the visibility or impact of a scientist's research." Since only first authors are taken into account, W establishes a lower bound on the actual number of scientists that are influenced by the article. One could argue that an article's impact should take into consideration not only first authors but the entire authorship of citing papers, but in that case the measure could easily be over-bloated. There are fields where an article's authorship contains even thousands of names, so their inclusion would be not only superfluous but also meaningless. More often than not, first authors write the papers, and are responsible for giving credit to those who have influenced their work.

The computation of impact values for manuscripts may help cope with the aforementioned problem of distinguishing those that have influenced the most people in their corresponding field. The impact of a manuscript is monotonically increasing in nature, but its computation over a time period could give an indication of the diffusion of the article in its field. Additionally, it may be expected that the impact follows the same rate of growth as that of the scientific literature, which has been shown to grow exponentially [1]. Thus, non-contemporaneous manuscripts of the same relative importance may have very different values of W after t years. Still, this does not void its definition, for the newer article would still reach more people. Also, even though it is expected that the impact value is mostly obtained from positive citations, the measure carries no information about discrediting citations.
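The filtering applied to the Mullis and Feynman records above can be made concrete with a short sketch. The record layout, field order, and all names are illustrative assumptions, not from the paper: each citing record carries its author list, a record is discarded when it shares any author with the cited manuscript, and the first author of each surviving record is added to the influenced set.

```python
def scientist_impact(own_author_sets, citing_records):
    """Impact i: number of distinct first authors of citing papers that
    share no author with the cited manuscript (self-citations excluded).

    own_author_sets: one author set per paper of the scientist
    citing_records:  list of (cited_paper_index, citing_author_list)
    """
    influenced = set()
    for cited_idx, citing_authors in citing_records:
        # Exclude any citing paper sharing an author with the cited work.
        if set(citing_authors) & own_author_sets[cited_idx]:
            continue
        influenced.add(citing_authors[0])  # first (lead) author only
    return len(influenced)

# Toy example: one paper by (Mullis, Smith); three citing papers.
papers = [{"Mullis", "Smith"}]
citations = [
    (0, ["Jones", "Lee"]),    # counts: no shared author; lead is Jones
    (0, ["Smith", "Brown"]),  # excluded: Smith co-authored the paper
    (0, ["Jones", "Kim"]),    # Jones already counted (sets deduplicate)
]
print(scientist_impact(papers, citations))  # → 1
```

This is why only roughly 80% of Mullis's citing documents, and 64% of Feynman's, contributed to their impact: the remainder shared authorship with the cited work.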
And yet, even negative citations have influenced their authors, and thus this lack of information is immaterial to the definition of impact.

The impact of a manuscript bears no information about the reputation of the journal where it was published. Some may argue that the manuscript impact should be multiplied by a factor that takes this into consideration in order to separate the wheat from the chaff. Yet, in the view of the author this is not necessary, and it may even be harmful. The proposition would not only become elitist, but there is also no current way to determine a fair value for the weight of a journal. Impact factors and similar measures should by no means be used as weights, as they are not representative of all manuscripts in a journal. It has been established that usually a very small number of publications is responsible for the high impact factors of some journals, whereas the majority of manuscripts receive few citations [17,23]. Conversely, the prestige of the journal where the article is published is already implicit in the very definition of impact, insofar as publishing in a reputable journal may provide higher visibility. As a result, it may be expected that an article published in a prestigious journal would reach more people, and would therefore have a higher impact value.

Figure 2 | Cumulative impact values for Fields medalists since 2006.

Regarding the impact of a scientist i, one immediately remarks that it is not obtained only from articles where the scholar is the lead author. The proposed asymmetry stems from the way scientific articles are written. Even though it is the first author who gets the most credit (except in fields that list authors alphabetically), the writing of a scientific paper is not a solo enterprise, and it usually comprises numerous contributions among those in the authorship. Even in the case that it is the first author who writes the manuscript, and who carries out most (if not all) of the work needed to obtain its results, it is not uncommon that the main idea that originated the paper in the first place came from someone else in the authorship (usually the student's advisor or the manager of a project). Since it is impossible to discern credit from the authorship, the impact must account equally for all authors. Broad [3] comments on a study where the scientists' judgements about their contributions in the research team summed up to a total of 300%. Besides, failing to give equal credit to all authors could prompt researchers of higher rank to take over first authorship, an unfortunate situation for the student. Defined this way, the impact can also be applied to those fields that use the Hardy-Littlewood rule to list authors. In spite of this, impact values in these fields are expected to be lower than if authors were not listed in alphabetical order.

The proposed measure i establishes a lower bound on the direct impact of a scientist. However, if scientist B cites an idea from scientist A, the latter gets no credit from a scientist C who cites B for the idea. This is still congruent with the goal of establishing a lower bound, as scientist C never had to be exposed to the work of scientist A. Still, the measure could be combined with, e.g., network models or with the modern PageRank citation ranking algorithm [24] to provide further insight into both direct and indirect impact. But the additional information may come at the expense of losing the impact's direct interpretation, and its computation would no longer be simple.

Another remark that can be made about a scientist's impact i is that addition is not used to include citing first authors already taken into account in previous works, and with valid argument. In the view of the author, when addition is used as a means to quantify impact, the resulting measure can be abused, as it promotes quantity over quality. Therefore, addition should only be used when measuring productivity, which is not the objective of the proposed measure.
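The union-based bookkeeping just described, union rather than addition, can be sketched in a few lines (toy data; all names are invented): first authors who already cited an earlier paper of the scientist do not increase i again, because set union collapses duplicates.

```python
# Lead-author sets of three papers by the same scientist: the union
# collapses first authors who cite more than one of the papers.
lead_authors_per_paper = [
    {"Jones", "Lee", "Kim"},   # paper 1
    {"Lee", "Patel"},          # paper 2 (Lee already counted)
    {"Kim", "Garcia"},         # paper 3 (Kim already counted)
]

union = set().union(*lead_authors_per_paper)
impact = len(union)                                # i = |union| = 5
naive_sum = sum(map(len, lead_authors_per_paper))  # addition would give 7

print(impact, naive_sum)  # → 5 7

# The average impact per article mentioned in the Methods section is i/n.
average_per_article = impact / len(lead_authors_per_paper)
```

Splitting one's results across more papers changes the naive sum but not the union, which is the sense in which the measure promotes quality over quantity.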
Although it could be argued that the impact i is unfair to scientists who inspire the same scientific community with new ideas, there are several advantages that more than compensate for this drawback. First, there is the direct interpretation of the impact i, for it gives a realistic quantity for the number of people influenced by the research of a scientist. Second, an immediate consequence of the first point is that, when the proposed measure is applied to eminent scientists, the impact values can give a rough indication of the sizes of their fields of research. Third, its computation is straightforward, as it only requires information about the authorship of citing papers. Fourth, and most important, the measure promotes quality over quantity. Authors gain nothing by either dividing their work or by publishing the same data over and over if they are to be cited by the same group of researchers. On the contrary, it is likely that a single meaningful work would have more visibility than an equivalent series of smaller manuscripts. Longer articles, states Laband [25], "receive more citations than shorter ones" (see also Bornmann and Daniel [22], and references therein). Fifth, the impact encourages innovation. If researchers do not increase their impact with consecutive works, it means that either their articles are not being cited, or that they are cited by the same scientists. On the contrary, a continuous growth of i may suggest that more and more scientists are engaging with their work. In order to increase their impact, scientists may look into exploring other areas within or even outside their field of expertise, encouraging multidisciplinary research and collaborations. Sixth, the impact may promote the evaluation of researchers not solely based on productivity. The impact is introduced not to compete with the h index or its variants, but to complement them. Seventh, the impact cannot be manipulated easily to one's advantage, as its very definition excludes self-citations of any kind. Furthermore, networks of researchers that cross-cite their works in order to increase their citation count [15,21], a practice known as cronyism, would have little effect on i. If nothing else, these networks would promote the appearance of new first authors. Also, it has been revealed that there are scientists who are eager to review articles whose citations would push towards the increase of their h index. Needless to say, the citation machinery, referring to the addition of citations for "calling the attention or gaining the favor of editors, referees or colleagues" [26,22], cannot be avoided (see also [27]). But this is so for any type of measure based on citations. Finally, at worst the impact measure may have major implications for researchers who try to forge a career out of unethical work. For example, plagiarizing articles and publishing them in little-known journals would have little to no effect on the impact of a researcher [27].
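Extending the same set bookkeeping from individual scientists to an institution, formalized as Eqns. (1) and (2) in the Methods section, can be sketched as follows. The data are invented, and reporting the overlap relative to I is the author's reading of the garbled original, chosen because it is consistent with the percentages quoted in the Results:

```python
from itertools import combinations

# Combined lead-author sets for an institution's scientists (toy data).
scientists = [
    {"Jones", "Lee", "Kim"},      # scientist 1
    {"Lee", "Patel", "Garcia"},   # scientist 2
    {"Kim", "Garcia", "Novak"},   # scientist 3
]

# Eqn. (1): institution impact is the size of the union; no addition,
# so reaching the same researcher twice counts once.
I = len(set().union(*scientists))

# Eqn. (2): overlap built from lead authors shared by at least two
# scientists, reported here as a fraction of I (assumed normalization).
shared = set()
for a, b in combinations(scientists, 2):
    shared |= a & b
omega = len(shared) / I

print(I)      # → 6
print(omega)  # Lee, Kim, and Garcia are shared, so 3/6 = 0.5
```

A hiring committee comparing candidates could evaluate the increase in I that each prospective faculty member would bring, exactly the use suggested in the Discussion.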
With respect to the impact of an institution I, departments could look into hiring faculty who would explore areas different from those already present. This is already implicit in hiring committees, as it makes no sense to recruit faculty whose work would overlap completely with that of existing professors, and who would therefore reach the same research community. By means of Eqn. (1), however, research institutions have a way of quantifying the added impact of prospective faculty. Institutions can also measure their impact overlap, given by Eqn. (2). The closer this value is to zero, the more independent the fields of study among the institution's scientists. Yet, it is supposed that some degree of overlap is not only desirable but also inevitable, for its direct implication on the collaborations among the scientists within the institution.

In this day and age, all kinds of measures abound and will live on whether we like it or not. In spite of that, it is only fair to quantify scientific output not solely with measures that favor productivity. If the research of a scientist has a true impact, it should be feasible to measure it, even if that research is contained in a single publication. The suggested methodology to measure impact goes against the publish-or-perish dogma of modern science, putting forward an alternative ideology to quantify scientific achievement. If taken seriously, and scientists do take measures seriously [7], the proposed measure may help cope with some of the problems caused by the modern structure of science.

Methods

Impact of a scientific manuscript. In the following definitions, capital Greek letters are used to represent sets. The proposed measure builds from the impact of a single scientific manuscript m, whose author set is denoted a(m). This set may contain a single author, in which case it is a singleton. Let P = {p_1, p_2, …, p_m} be a set of m citing articles that do not have an author in a(m), expressed mathematically by (∪_{i=1}^{m} a(p_i)) ∩ a(m) = ∅. In other words, none of the authors of articles in P can be found to be an author of m. Furthermore, let Q_i ≡ Q(p_i) ∈ a(p_i) represent the lead or first author of the ith citing paper. Thus, the set of all lead-author scientists that cite m is given by Ψ = ∪_{i=1}^{m} Q(p_i). Bear in mind that sets do not contain duplicates. The impact of a manuscript is defined as W ≡ |Ψ|, i.e., the cardinality of the set Ψ. Note that W belongs to the set of natural numbers, i.e., W ∈ ℕ.

Impact of a scientist. Let Λ = ∪_{j=1}^{n} Ψ_j denote the combined lead-author set over the n manuscripts written by the scientist, where Ψ_j refers to the lead-author set (as defined above) for the jth article. The impact of a scientist i is then defined as the cardinality of this set, i.e., i ≡ |Λ|, i ∈ ℕ. In words, the impact of a scientist comprises the number of all first authors who have cited any of the n published works, regardless of the scientist's authorship position. Note that the impact of a scientist can be determined by using the inclusion-exclusion principle, and that addition is nowhere used. The scientist's average impact per article is i/n.

Impact of an institution. By extension, the impact of an institution I is determined by the impact of its body of scientists. Given an institution with s scientists, its impact is thus defined using the inclusion-exclusion principle as

    I = |∪_{k=1}^{s} Λ_k|,    (1)

where Λ_k denotes the combined lead-author set of scientist k. Again, no summation is involved in the definition. The institution's average impact per researcher is then given by I/s. An institution can measure its impact overlap as

    Ω = |∪_{i≠j} (Λ_i ∩ Λ_j)| / I.    (2)

Algorithm. For the numerical calculations of research impact, it is assumed that there are no homographs, i.e., different authors sharing the same written name. This assumption may result in lower impact values, for those first-author scientists who write their name exactly the same way are counted as one. Yet, some efforts are underway for the creation of unique researcher identifiers that could easily remove this obstacle [18,28]. In spite of this, the determination of scientific impact is straightforward. Authors who write their names slightly differently are detected using the Levenshtein distance, an algorithm that finds dissimilarities between two sequences of characters. When two author names are recognized as slightly different spellings, their name initials are then compared. If up to this point no assurance about the uniqueness of the author can be made, one of the authors is marked as dubious. For the results presented in Figures 1 and 2, dubious authors are but a small percentage of the total number: 1.9% – 3.1% for physicists and 0% – 2.6% for mathematicians.
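The name-matching step described in the Algorithm paragraph can be sketched with a standard dynamic-programming Levenshtein distance. The distance threshold and the initials check below are illustrative assumptions, not values taken from the paper:

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def same_author(name_a, name_b, initials_a, initials_b, threshold=2):
    """Heuristic sketch of the Algorithm paragraph: identical names match;
    near-identical names are disambiguated by initials; anything further
    apart is treated as a different (or dubious) author."""
    d = levenshtein(name_a, name_b)
    if d == 0:
        return True
    if d <= threshold:
        return initials_a == initials_b  # fall back to comparing initials
    return False

print(same_author("Aragon", "Aragón", "A.M.", "A.M."))  # → True
```

A production version would also need the "dubious author" marking described above; here a failed match simply returns False.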
References

1. Larsen, P. O. & von Ins, M. The rate of growth in scientific publication and the decline in coverage provided by Science Citation Index. Scientometrics 84, 575–603 (2010).
2. Margolis, J. Citation indexing and evaluation of scientific papers. Science 155, 1213–1219 (1967).
3. Broad, W. J. The publishing game: getting more for less. Science 211, 1137–1139 (1981).
4. Hirsch, J. E. An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences 102, 16569–16572 (2005).
5. Noorden, R. V. Metrics: A profusion of measures. Nature 465, 864–866 (2010).
6. Hirsch, J. E. An index to quantify an individual's scientific research output that takes into account the effect of multiple coauthorship. Scientometrics 85, 741–754 (2010).
7. Abbott, A. et al. Metrics: Do metrics matter? Nature 465, 860–862 (2010).
8. Bornmann, L. & Daniel, H.-D. The state of h index research. Is the h index the ideal way to measure research performance? EMBO Reports 10, 2–6 (2009).
9. Hirsch, J. E. Does the h index have predictive power? Proceedings of the National Academy of Sciences 104, 19193–19198 (2007).
10. Radicchi, F., Fortunato, S. & Castellano, C. Universality of citation distributions: Toward an objective measure of scientific impact. Proceedings of the National Academy of Sciences 105, 17268–17272 (2008).
11. Bornmann, L., Mutz, R., Hug, S. E. & Daniel, H.-D. A multilevel meta-analysis of studies reporting correlations between the h index and 37 different h index variants. Journal of Informetrics 5, 346–359 (2011).
12. Schreiber, M. A case study of the Hirsch index for 26 non-prominent physicists. Annalen der Physik 16, 640–652 (2007).
13. Bartneck, C. & Kokkelmans, S. Detecting h-index manipulation through self-citation analysis. Scientometrics 87, 85–98 (2011).
14. Lehmann, S., Jackson, A. D. & Lautrup, B. E. Measures for measures. Nature 444, 1003–1004 (2006).
15. Thorne, F. C. The citation index: Another case of spurious validity. Journal of Clinical Psychology 33, 1157–1161 (1977).
16. MacRoberts, M. H. & MacRoberts, B. R. Problems of citation analysis: A critical review. Journal of the American Society for Information Science 40, 342–349 (1989).
17. Seglen, P. O. The skewness of science. Journal of the American Society for Information Science 43, 628–638 (1992).
18. Lane, J. Let's make science metrics more scientific. Nature 464, 488–489 (2010).
19. Martin, B. R. & Irvine, J. Assessing basic research: Some partial indicators of scientific progress in radio astronomy. Research Policy 12, 61–90 (1983).
20. Martin, B. R. The use of multiple indicators in the assessment of basic research. Scientometrics 36, 343–362 (1996).
21. Phelan, T. J. A compendium of issues for citation analysis. Scientometrics 45, 117–136 (1999).
22. Bornmann, L. & Daniel, H.-D. What do citation counts measure? A review of studies on citing behavior. Journal of Documentation 64, 45–80 (2008).
23. Campbell, P. Escape from the impact factor. Ethics in Science and Environmental Politics 8, 5–7 (2008).
24. Page, L., Brin, S., Motwani, R. & Winograd, T. The PageRank citation ranking: Bringing order to the web. Technical Report 1999–66, Stanford InfoLab (1999).
25. Laband, D. N. Is there value-added from the review process in economics?: Preliminary evidence from authors. The Quarterly Journal of Economics 105, 341–352 (1990).
26. Vinkler, P. A quasi-quantitative citation model. Scientometrics 12, 47–72 (1987).
27. Broad, W. J. Would-be academician pirates papers. Science 208, 1438–1440 (1980).
28. Editorial. Credit where credit is due. Nature 462, 825 (2009).

Acknowledgements
The author would like to thank Dr. Vladislav Yastrebov and Prof. Jean-François Molinari for their suggestions on the article.

Additional information
Competing financial interests: The authors declare no competing financial interests.
License: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/
How to cite this article: Aragón, A.M. A measure for the impact of research. Sci. Rep. 3, 1649; DOI:10.1038/srep01649 (2013).

Scientific Reports, Volume 3 (1) – Apr 11, 2013

Publisher: Springer Journals. Copyright © 2013 by The Author(s). eISSN 2045-2322. DOI: 10.1038/srep01649.

Abstract

Alejandro M. Arago´n SUBJECT AREAS: School of Architecture, Civil and Environmental Engineering (ENAC), Ecole polytechnique fe ´de ´rale de Lausanne (EPFL), CH-1015 STATISTICS Lausanne, Switzerland. COMPUTATIONAL SCIENCE APPLIED MATHEMATICS The last few years have seen the proliferation of measures that quantify the scientific output of researchers. SCIENTIFIC DATA Yet, most of these measures focus on productivity, thus fostering the ‘‘publish or perish’’ paradigm. This article proposes a measure that aims at quantifying the impact of research de-emphasizing productivity, thus providing scientists an alternative, conceivably fairer, evaluation of their work. The measure builds Received from a published manuscript, the literature’s most basic building block. The impact of an article is defined as the number of lead authors that have been influenced by it. Thus, the measure aims at quantifying the 11 March 2013 manuscript’s reach, putting emphasis on scientists rather than on raw citations. The measure is then Accepted extrapolated to researchers and institutions. 27 March 2013 Published 1 he exponentially increasing number of publications , makes it increasingly hard for researchers to keep up 11 April 2013 with the literature. The problem of paper inflation presents newcomers the even more challenging exercise of finding those works that have made significant contributions, knowledge that researchers accumulate over years of experience. Part of the problem, noted almost half a century ago , stems from a structure of science that favors productivity. The current system puts pressure on researchers for the publication of scientific articles, for Correspondence and this is their only way to justify the received funding. The publication of results, states Margolis , ‘‘is probably the requests for materials main means of accomplishing the almost impossible task of accounting for time and money spent on research. 
Inevitably, this puts a premium on quantity at the expense of quality, and, as with any other type of inflation, the problem worsens: the more papers are written, the less they account for and the greater is the pressure to publish more." Thus, the publish or perish paradigm that besets scientists has, for the most part, inevitable implications for the quality of the research published. At worst, many authors opt for publishing the same data, or even for the minimum publishable unit (MPU), in order to increase the apparent value of their work [3]. In the latter case, research findings are divided into a series of articles instead of producing a single meaningful manuscript. This modus operandi has further implications, for the slight contribution to science of an MPU presents editors of journals with the very subjective task of discerning whether it even deserves publication. The problem of paper inflation may eventually place a tremendous burden on the entire scientific community simply to maintain the infrastructure for the publication and dissemination of scientific works.

The problem has only been boosted by the appearance of recent quantitative measures that favor this productivity ideology. "A scientist", states Hirsch [4], "has index h if h of his or her N papers have at least h citations each and the other (N − h) papers have ≤ h citations each". The h index has been widely adopted since its introduction in 2005 by major online databases [5], as it improves over former measures for the quantification of a researcher's performance. Its simple computation is perhaps responsible for its wide adoption, and Hirsch suggests that the index could even be used by decision makers for the granting of funding, tenure, promotion, and awards [4,6].
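Hirsch's quoted definition translates directly into a short computation over a list of per-paper citation counts. The following is a minimal sketch (the function name is illustrative):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    # Sort descending; h is the deepest rank whose citation count still
    # reaches that rank.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Five papers with these counts give h = 3: three papers have at least
# 3 citations each, but not four papers with at least 4.
print(h_index([10, 8, 5, 3, 2]))  # 3
```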
While researchers are concerned about important decisions being made solely based on such measures [7], the application of the index has even been extended to the productivity assessment of entire research groups, institutions and countries [8]. It has also been shown that the index is able to predict future scientific achievement [9]. On the downside, the measure carries with it many drawbacks, as appears from the many variants that have emerged in the literature. One major criticism of the h index is its inability to compare scientists from different fields, for citation practices vary significantly per discipline. Radicchi et al. [10] have proposed a variant of the h index that takes this variability into account for a fairer comparison across fields, at the expense of a more involved calculation. In a recent publication [11], Bornmann shows that most of the other 37 variants of the h index add no significant information. Another problem with the h index that has been observed is that, contrary to what its proponents may claim, it is not hard to manipulate its value in one's favor [12,13]. Lehmann et al. [14] also show that, while the h index "attempts to strike a balance between productivity and quality", the mean number of citations per paper is a better measure of research quality.

Measures that are based on citations will carry over their intrinsic problems [15,16]. Seglen states that "citations represent a measure of utility rather than of quality", and gives a succinct summary of their main problems [17]. In spite of current efforts that aim at measuring the outcome of scientific research not solely based on publications and citations [18], the latter remain for now the only means to quantify the impact of a scientific article. According to Martin [19], "the impact of a publication describes its actual influence on surrounding research activities at a given time. While this will depend partly on its importance, it may also be affected by such factors as the location of the author, and the prestige, language, and availability, of the publishing journal." In a subsequent publication [20], he states that "citation counts are an indicator more of impact than of quality or importance". Thus, citations in this article are implicitly assumed to measure impact, so as to avoid criticism over their relation to both research importance and quality.

This article provides an alternative measure of scientific achievement based on citations, but their negative effects are reduced to a minimum, thus yielding a measure that cannot be easily manipulated. The proposition does not aim at replacing current measures of productivity, but at complementing them in order to provide the research community with an alternative evaluation of its scientific production. To measure the quality of scientific output, Bornmann and Daniel claim [8], "it would therefore be sufficient to use just two indices: one that measures productivity and one that measures impact." With the suggested measure it may be possible for a scientist to have considerable impact even if publishing a single article, contrary to what the h index and similar measures would suggest. As mentioned earlier, the measure is defined first for a scientific manuscript in an attempt to quantify its reach, and hence scientists are at the very core of its definition. Similarly, the impact of a scientist aims at determining the number of researchers that have been influenced by his or her work. The measure is further extrapolated so that the impact of an institution comprises the impact of its body of researchers. The impact measures for manuscripts, scientists, and institutions, whose formal definitions are given in the Methods section, are referred to henceforth as Ψ, i, and I, respectively.

Results

For the computation of scientist impact, citation records are taken from the SciVerse Scopus database. Considering that this database lists records since 1996, impact values for scientists who have published earlier represent partial estimates. In addition, the number of citing documents is also obtained for some authors from the Web of Science citation index provided by Thomson Reuters.

The strengths of the proposed impact measure are evident when applied to eminent scientists who have not published considerably, but who have nevertheless produced substantial contributions to science. For instance, Kary B. Mullis won the Nobel prize in chemistry in 1993 due to his contributions within the field of DNA-chemistry research. From a total of 8063 citing documents obtained from the Scopus database over a period of 18 years, it is found that 6435 of those records (roughly 80%) have a different first author who does not share authorship with Mullis. Consequently, with a modest h index of 15, Mullis has had an impact i = 6435 over that period of time. Yet, as mentioned before, this is a partial estimate and the true impact is expected to be much higher. The Web of Science lists a total of 24750 citing documents for this author, which is approximately 3 times the number of records used to compute i. To put the impact of Mullis into perspective, seven authors taken from those 8063 citing documents were found to have the same h index. Yet, their impact values were {346, 404, 553, 555, 561, 680, 1284}, where the highest value is scarcely 20% of that of Mullis. Another example is the late Richard P. Feynman (h = 37), who won the Nobel prize in physics in 1965 for his work on quantum electrodynamics. With a total of 8123 citing documents obtained from the Scopus database, an impact i = 5175 is assigned to Feynman. In this case only about 64% of the citing documents counted towards his impact. As before, the Web of Science lists a total of 19629 citing documents, so the impact value above is a partial estimate.

Figures 1 and 2 show cumulative impact results, in semi-log plots, for distinguished scholars in physics and mathematics, respectively. Nobel laureates since 2007 are displayed in Figure 1 (except those from 2009, due to lack of data in the database), where impact values were obtained with data up to the year of their corresponding prize as a means to survey only their research impact. The figure shows a clear trend of impact increase over the years, with the sudden increase in the first portion of the curves suggesting the lack of data prior to 1996. Remarkable are the curves of professors Andre Geim and Konstantin Novoselov, whose work on the graphene material has had an enormous impact over the last few years. The impact of Fields medalists does not seem to be largely affected by the receipt of the prize, and thus their curves in Figure 2 are computed from data up to 2012. From this figure it is also noticeable the impact of Prof. Terence Tao, who has an order of magnitude difference with respect to his contemporaneous mathematicians.

Figure 1 | Cumulative impact values for Nobel laureates in physics since 2007.

Figure 2 | Cumulative impact values for Fields medalists since 2006.

A department of the University of Illinois at Urbana-Champaign, the author's alma mater, is considered now for the survey of an institution's impact. More precisely, the Aerospace Engineering Department is examined. With a total of 23 actively working professors, the department's impact is I = 7592. This value takes into account the impact overlap that exists among the institution's scholars, i.e., the fact that they can have an impact on the same researchers. For the department the impact overlap is Ω ≈ 18%. As long as there is a non-zero value of impact overlap between the scientists of an institution, the impact value will be smaller than the sum of their impacts, which in this case amounts to Σ_k i_k = 9439. The department has 6 assistant, 3 associate, and 14 full professors, and thus the same analysis can be made regarding their hierarchical positions within the institution. The impacts of assistant, associate, and full professors are I_a = 398, I_A = 443, and I_F = 7018. Their corresponding impact overlap values are Ω_a ≈ 1.3%, Ω_A ≈ 0.7%, and Ω_F ≈ 17%. For the formal definitions of the quantities used, the reader is referred to the Methods section.

Discussion

Instead of the total number of citations, which has been traditionally used as a measure of the impact of an article, the proposed measure Ψ aims at discerning the genuine number of people the paper has had an impact upon. In other words, Ψ aims at measuring the manuscript's reach. Implicit in the definition of impact is the most basic assumption of citation analysis, that "references cited by an author are a roughly valid indicator of influence on his work", in the words of MacRoberts [16]. Yet, citations are used in a way that attempts the mitigation of any deviations from this assumption.

The impact of a manuscript Ψ is defined in a way that excludes self-citations of any kind. Self-citations, states Schreiber [12], "do not reflect the impact of the publication and therefore ideally the self-citations should not be included in any measure which attempts to estimate the visibility or impact of a scientist's research." Since only first authors are taken into account, Ψ establishes a lower bound on the actual number of scientists that are influenced by the article. One could argue that an article's impact should take into consideration not only first authors but also the entire authorship of citing papers, but in that case the measure could easily be over-bloated. There are fields where an article's authorship contains even thousands of names, so their inclusion would be not only superfluous but also meaningless. More often than not, first authors write the papers, and are responsible for giving credit to those who have influenced their work.

The computation of impact values for manuscripts may help cope with the aforementioned problem of distinguishing those that have influenced the most people in their corresponding field. The impact of a manuscript has a monotonically increasing nature, but its computation over a time period could give an indication of the diffusion of the article in its field. Additionally, it may be expected that the impact follows the same rate of growth as that of the scientific literature, which has been shown to grow exponentially [1]. Thus, non-contemporaneous manuscripts of the same relative importance may have very different values of Ψ after t years. Still, this does not void its definition, for the newer article would still reach more people. Also, even though it is expected that the impact value is mostly obtained from positive citations, the measure carries no information about discrediting citations. And yet, even negative citations have influenced their authors, and thus this lack of information is immaterial to the definition of impact.

The impact of a manuscript bears no information about the reputation of the journal of its publication. Some may argue that the manuscript impact should be multiplied by a factor that takes this fact into consideration in order to separate the wheat from the chaff. Yet, in the view of the author this is not necessary, and it may even be harmful. The proposition would not only become elitist, but there is also no current way to determine a fair value for the weight of a journal. Impact factors and similar measures should by no means be used as weights, as they are not representative of all manuscripts in a journal. It has been established that usually a very small number of publications are responsible for the high impact factors of some journals, whereas the majority of manuscripts receive few citations [17,23]. Conversely, the prestige of the journal where the article is published is already implicit in the very definition of impact, insofar as publishing in a reputable journal may provide higher visibility. As a result, it may be expected that an article published in a prestigious journal would reach more people, and would therefore have a higher impact value.

Regarding the impact of a scientist i, one immediately remarks that it is not only obtained from articles where the scholar is the lead author. The proposed asymmetry stems from the way scientific articles are written. Even though it is the first author who gets the most credit (except in fields that list authors alphabetically), the writing of a scientific paper is not a solo enterprise, and it usually comprises numerous contributions among those in the authorship. Even in the case that it is the first author who writes the manuscript, and who carries out most (if not all) of the work needed to obtain its results, it is not uncommon that the main idea that originates the paper in the first place comes from someone else in the authorship (usually the student's advisor or the manager of a project). Since it is impossible to discern credit from the authorship, the impact must account equally for all authors. Broad [3] comments on a study where the scientists' judgements about their contributions in the research team summed up to a total of 300%. Besides, failing to give equal credit to all authors could prompt researchers of higher rank to take over first authorship, an unfortunate situation for the student. Defined this way, the impact can also be applied to those fields that use the Hardy-Littlewood rule to list authors. In spite of this, impact values in these fields are expected to be lower than if authors were not listed in alphabetical order.

The proposed measure i establishes a lower bound on the direct impact of a scientist. However, if scientist B cites an idea from scientist A, the latter gets no credit from a scientist C who cites B for the idea. This is still congruent with the goal of establishing a lower bound, as scientist C never had to be exposed to the work of scientist A. Still, the measure could be combined with, e.g., network models or with the modern PageRank citation ranking algorithm [24] to provide further insight on both direct and indirect impact. But the additional information may come at the expense of losing the impact's direct interpretation, and its computation would no longer be simple.

Another remark that can be made about a scientist's impact i is that addition is not used to include citing first authors already taken into account in previous works, and with valid argument. In the view of the author, when addition is used as a means to quantify impact, the given measure can be abused, as it promotes quantity over quality. Therefore, addition should only be used when measuring productivity, which is not the objective of the proposed measure. Although it could be argued that the impact i is unfair to scientists who inspire the same scientific community with new ideas, there are several advantages that more than compensate for this drawback. First, the direct interpretation of the impact i, for it gives a realistic quantity of the number of people influenced by the research of a scientist. Second, an immediate consequence of the first point is that when the proposed measure is applied to eminent scientists, the impact values can give a rough indication of the sizes of their fields of research. Third, its computation is straightforward, as it only requires information about the authorship of citing papers. Fourth, and most important, the measure promotes quality over quantity. Authors gain nothing by either dividing their work, or by publishing the same data over and over, if they are to be cited by the same group of researchers. On the contrary, it is likely that a single meaningful work would have more visibility than an equivalent series of smaller manuscripts. Longer articles, states Laband [25], "receive more citations than shorter ones" (see also Bornmann and Daniel [22], and references therein). Fifth, the impact encourages innovation. If researchers do not increase their impact with consecutive works, it means that either their articles are not being cited, or that they are cited by the same scientists. On the contrary, a continuous growth of i may suggest that more and more scientists are engaged in their work. In order to increase their impact, scientists may look into exploring other areas within or even outside their field of expertise, encouraging multidisciplinary research and collaborations. Sixth, the impact may promote the evaluation of researchers not solely based on productivity. The impact is introduced not to compete with the h index or its variants, but to complement them. Seventh, the impact cannot be manipulated easily to one's advantage, as its very definition excludes self-citations of any kind. Furthermore, networks of researchers that cross-cite their works in order to increase their citation count [15,21], a practice known as cronyism, would have little effect on i. If nothing else, these networks would promote the appearance of new first authors. Also, it has been revealed that there are scientists who are eager to review articles whose citations would push towards the increase of their h index. Needless to say, the citation machinery, referring to the addition of citations for "calling the attention or gaining the favor of editors, referees or colleagues" [26,22], cannot be avoided. But this is so for any type of measure based on citations. Finally, at worst the impact measure may have major implications for researchers who try to forge a career out of non-ethical work. For example, the plagiarism of articles and their publication in little-known journals would have little to no effect on the impact of a researcher [27].

With respect to the impact of an institution I, departments would look into hiring faculty that would explore a different area from those already present. This is already implicit in hiring committees, as it makes no sense to recruit faculty whose work would overlap completely with that of existing professors, and who would therefore reach the same research community. By means of Eqn. (1), however, research institutions have a way of quantifying the added impact of prospective faculty. The institutions can also measure their impact overlap, given by Eqn. (2). The closer this value is to zero, the more independent the fields of study among the institution's scientists. Yet, it is supposed that some degree of overlap is not only desirable, but also inevitable, for its direct implication on the collaborations among the scientists within the institution.

In this day and age, all kinds of measures abound and will live on whether we like it or not. In spite of that, it is only fair to quantify scientific output not solely with measures that favor productivity. If the research of a scientist has a true impact, it should be feasible to measure it, even if that research is contained in a single publication. The suggested methodology to measure impact goes against the publish or perish dogma of modern science, putting forward an alternative ideology to quantify scientific achievement. If taken seriously, and scientists do take measures seriously [7], the proposed measure may help cope with some of the problems caused by the modern structure of science.

Methods

Impact of a scientific manuscript. In the following definitions, bold and non-bold greek letters are used to represent sets, and elements of sets, respectively. The proposed measure builds from the impact of a single scientific manuscript m, whose author set is denoted α(m). This set may contain a single author, in which case the set is a singleton. Let P = {p_1, p_2, …, p_m} be a set of m citing articles that do not have an author in α(m), expressed mathematically by ⋃_{i=1}^{m} (α(p_i) ∩ α(m)) = ∅. In other words, none of the authors of articles in P can be found to be an author of m. Furthermore, let θ_i ≡ θ(p_i) ∈ α(p_i) represent the lead or first author of the ith citing paper. Thus the set of all lead author scientists that cite m is given by Ψ = ⋃_{i=1}^{m} θ(p_i). Bear in mind that sets do not contain duplicates. The impact of a manuscript is defined as Ψ ≡ |Ψ|, i.e., the cardinality of the set Ψ. Note that Ψ belongs to the set of natural numbers, i.e., Ψ ∈ ℕ.

Impact of a scientist. Let ι = ⋃_{j=1}^{n} Ψ_j denote the combined lead author set over the n manuscripts written by the scientist, where Ψ_j refers to the lead author set (as defined above) for the jth article. The impact of a scientist i is then defined as the cardinality of this set, i.e., i ≡ |ι|, i ∈ ℕ. In words, the impact of a scientist comprises the number of all first authors who have cited any of the n published works, regardless of the scientist's authorship position. Note that the impact of a scientist can be determined by using the inclusion-exclusion principle, and that addition is nowhere used. The scientist's average impact per article is i/n.

Impact of an institution. By extension, the impact of an institution I is determined by the impact of its body of scientists. Given an institution with s scientists, its impact is thus defined using the inclusion-exclusion principle as

I = |⋃_{k=1}^{s} ι_k|,  (1)

where ι_k denotes the combined lead author set of scientist k. Again, no summation is involved in the definition. The institution's average impact per researcher is then given by I/s. An institution can measure its impact overlap as

Ω = |⋃_{i≠j} (ι_i ∩ ι_j)| / |⋃_{k=1}^{s} ι_k|.  (2)

Algorithm. For the numerical calculations of research impact, it is assumed that there are no homographs, i.e., different authors sharing the same written name. This assumption may result in lower impact values, for those first author scientists who write their name exactly the same way are counted as one. Yet, some efforts are underway for the creation of unique researcher identifiers that could easily remove this obstacle [18,28]. In spite of this, the determination of scientific impact is straightforward. Authors who write their names slightly differently are detected using the Levenshtein distance, an algorithm that finds dissimilarities between two sequences of characters. When it is recognized that two authors are spelled slightly differently, name initials are then compared. If up to this point no assurance about the uniqueness of the author can be made, one of the authors is marked as dubious. For the results presented in Figures 1 and 2, dubious authors are but a small percentage of the total number: 1.9% – 3.1% for physicists and 0% – 2.6% for mathematicians.

1. Larsen, P. O. & von Ins, M. The rate of growth in scientific publication and the decline in coverage provided by Science Citation Index. Scientometrics 84, 575–603 (2010).
2. Margolis, J. Citation indexing and evaluation of scientific papers. Science 155, 1213–1219 (1967).
3. Broad, W. J. The publishing game: getting more for less. Science 211, 1137–1139 (1981).
4. Hirsch, J. E. An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences 102, 16569–16572 (2005).
5. Noorden, R. V. Metrics: A profusion of measures. Nature 465, 864–866 (2010).
6. Hirsch, J. E. An index to quantify an individual's scientific research output that takes into account the effect of multiple coauthorship. Scientometrics 85, 741–754 (2010).
7. Abbott, A. et al. Metrics: Do metrics matter? Nature 465, 860–862 (2010).
8. Bornmann, L. & Daniel, H.-D. The state of h index research. Is the h index the ideal way to measure research performance? EMBO Reports 10, 2–6 (2009).
9. Hirsch, J. E. Does the h index have predictive power? Proceedings of the National Academy of Sciences 104, 19193–19198 (2007).
10. Radicchi, F., Fortunato, S. & Castellano, C. Universality of citation distributions: Toward an objective measure of scientific impact. Proceedings of the National Academy of Sciences 105, 17268–17272 (2008).
11. Bornmann, L., Mutz, R., Hug, S. E. & Daniel, H.-D. A multilevel meta-analysis of studies reporting correlations between the h index and 37 different h index variants. Journal of Informetrics 5, 346–359 (2011).
12. Schreiber, M. A case study of the Hirsch index for 26 non-prominent physicists. Annalen der Physik 16, 640–652 (2007).
13. Bartneck, C. & Kokkelmans, S. Detecting h-index manipulation through self-citation analysis. Scientometrics 87, 85–98 (2011).
14. Lehmann, S., Jackson, A. D. & Lautrup, B. E. Measures for measures. Nature 444, 1003–1004 (2006).
15. Thorne, F. C. The citation index: Another case of spurious validity. Journal of Clinical Psychology 33, 1157–1161 (1977).
16. MacRoberts, M. H. & MacRoberts, B. R. Problems of citation analysis: A critical review. Journal of the American Society for Information Science 40, 342–349 (1989).
17. Seglen, P. O. The skewness of science. Journal of the American Society for Information Science 43, 628–638 (1992).
18. Lane, J. Let's make science metrics more scientific. Nature 464, 488–489 (2010).
19. Martin, B. R. & Irvine, J. Assessing basic research: Some partial indicators of scientific progress in radio astronomy. Research Policy 12, 61–90 (1983).
20. Martin, B. R. The use of multiple indicators in the assessment of basic research. Scientometrics 36, 343–362 (1996).
21. Phelan, T. J. A compendium of issues for citation analysis. Scientometrics 45, 117–136 (1999).
22. Bornmann, L. & Daniel, H.-D. What do citation counts measure? A review of studies on citing behavior. Journal of Documentation 64, 45–80 (2008).
23. Campbell, P. Escape from the impact factor. Ethics in Science and Environmental Politics 8, 5–7 (2008).
24. Page, L., Brin, S., Motwani, R. & Winograd, T. The PageRank citation ranking: Bringing order to the web. Technical Report 1999-66, Stanford InfoLab (1999).
25. Laband, D. N. Is there value-added from the review process in economics?: Preliminary evidence from authors. The Quarterly Journal of Economics 105, 341–352 (1990).
26. Vinkler, P. A quasi-quantitative citation model. Scientometrics 12, 47–72 (1987).
27. Broad, W. J. Would-be academician pirates papers. Science 208, 1438–1440 (1980).
28. Editorial. Credit where credit is due. Nature 462, 825 (2009).

Acknowledgements

The author would like to thank Dr. Vladislav Yastrebov and Prof. Jean-François Molinari for their suggestions on the article.

Additional information

Competing financial interests: The authors declare no competing financial interests.

License: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/

How to cite this article: Aragón, A. M. A measure for the impact of research. Sci. Rep. 3, 1649; DOI:10.1038/srep01649 (2013).
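The set definitions for Ψ and i in the Methods section reduce to plain set operations. A minimal sketch, assuming citation records are available as ordered lists of author names (the record layout and the names below are hypothetical, not taken from the paper):

```python
def manuscript_impact_set(authors, citing_papers):
    """Lead authors of citing papers that share no author with the manuscript."""
    authors = set(authors)
    psi = set()
    for citing_authors in citing_papers:  # each entry: ordered author-name list
        # Excludes self-citations of any kind: any shared author disqualifies.
        if authors.isdisjoint(citing_authors):
            psi.add(citing_authors[0])  # only the lead (first) author counts
    return psi

def scientist_impact(papers):
    """i = |union of lead-author sets over the scientist's n manuscripts|."""
    iota = set()
    for authors, citing in papers:
        iota |= manuscript_impact_set(authors, citing)  # union, never addition
    return len(iota)  # a repeat citer counts once across all manuscripts

# Toy example: two manuscripts with overlapping citing first authors.
m1 = (["A. Aragon"], [["X. Chen", "Y. Wu"], ["A. Aragon", "Z. Li"], ["P. Roy"]])
m2 = (["A. Aragon", "B. Co"], [["X. Chen"], ["Q. Kim", "B. Co"]])
print(scientist_impact([m1, m2]))  # 2: X. Chen counted once, self-citations dropped
```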
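Eqns. (1) and (2) likewise amount to unions and pairwise intersections of the scientists' lead-author sets. A sketch with made-up sets; normalizing Ω by the institution's reached researchers is an assumption of this illustration:

```python
from itertools import combinations

def institution_impact(lead_author_sets):
    """I = |union of iota_k|; the set union handles inclusion-exclusion."""
    return len(set().union(*lead_author_sets))

def impact_overlap(lead_author_sets):
    """Omega: fraction of reached researchers reached by more than one
    of the institution's scientists (normalization assumed here)."""
    union = set().union(*lead_author_sets)
    shared = set()
    for a, b in combinations(lead_author_sets, 2):
        shared |= a & b  # researchers influenced by at least two scientists
    return len(shared) / len(union) if union else 0.0

# Three scientists; only researcher r3 is reached by two of them.
iotas = [{"r1", "r2", "r3"}, {"r3", "r4"}, {"r5"}]
print(institution_impact(iotas))  # 5, smaller than the sum of impacts (6)
print(impact_overlap(iotas))      # 0.2
```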
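The name-matching step described in the Algorithm paragraph can be sketched with the standard dynamic-programming Levenshtein distance; the merge threshold and the exact initials comparison below are illustrative choices, not the paper's parameters:

```python
def levenshtein(s, t):
    """Edit distance between two strings, O(len(s) * len(t)) dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def same_author(a, b, max_dist=2):
    """Heuristic merge: near-identical surnames, then compare initials.
    Unresolved cases would be marked dubious rather than merged."""
    if levenshtein(a["surname"], b["surname"]) > max_dist:
        return False
    return a["initials"] == b["initials"]

print(levenshtein("kitten", "sitting"))  # 3
a = {"surname": "Aragon", "initials": "A.M."}
b = {"surname": "Aragón", "initials": "A.M."}
print(same_author(a, b))  # True: surname distance 1, same initials
```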
