Year: 2020 | Volume: 13 | Issue: 4 | Page: 295-297
Jargon and metrics for evaluation: Are they valid or instead promote questionable practices?
Department of Community Medicine, Dr. DY Patil Medical College, Hospital and Research Centre, Dr. DY Patil Vidyapeeth, Pune, Maharashtra, India
Date of Submission: 05-Feb-2020
Date of Decision: 15-Mar-2020
Date of Acceptance: 26-Mar-2020
Date of Web Publication: 20-Jul-2020
Source of Support: None, Conflict of Interest: None
How to cite this article:
Banerjee A. Jargon and metrics for evaluation: Are they valid or instead promote questionable practices? Med J DY Patil Vidyapeeth 2020;13:295-7
How to cite this URL:
Banerjee A. Jargon and metrics for evaluation: Are they valid or instead promote questionable practices? Med J DY Patil Vidyapeeth [serial online] 2020 [cited 2020 Sep 27];13:295-7. Available from: http://www.mjdrdypv.org/text.asp?2020/13/4/295/290178
A recent feature in India's leading national daily, The Times of India (TOI), reported that India's “best” panchayats (a form of local self-government for villages) were the worst! They “scored” full marks on several parameters in an official survey monitoring the implementation of government programs related to agriculture, roads, schools, housing, poverty, drinking water, electricity, etc. On the ground, the reality was different.
Molugampoondi gram panchayat topped the national rankings by scoring full marks for roads, education, health, and sanitation. A visit by a TOI journalist revealed garbage-strewn and potholed paths. Public toilets were unusable and covered in vegetation. A local resident expressed surprise that his panchayat had topped the nationwide rankings, given the poor state of roads and other infrastructure in his village.
Be it evaluation of international development work or evaluation of academic faculty and institutions, modern tools based on jargon and metrics, while ostensibly objective, may raise concerns about validity. Quantity may get priority, whereas quality may be overlooked. Metrics generated by counting the number of roads, schools, health centers, and public toilets may not reveal the ground realities of development work. The local people will know whether the roads are usable, whether the schools are functioning, whether the health centers are well equipped, and whether public toilets are well maintained and functioning. Jargon such as “outcome” and “impact” in development work may also convey different meanings in different contexts and can be misleading when the metrics are flawed and lack clarity.
In the same manner, appraisal of the research of academic faculty and institutions using jargon and metrics can be full of perils and pitfalls. Before the advent of jargon and metrics, deviation from research integrity, driven by the “publish or perish” culture, was confined to three categories of misconduct: fabrication, falsification, and plagiarism. These are the traditional faces of research fraud. However, jargon like “impact” and various “metrics” such as the journal impact factor (JIF), h-index, and i10-index promote various “questionable research practices,” which may be lesser evils but, being more widespread, have the potential to undermine the quality of scientific publications and people's trust in them. With higher stakes, “publish or perish” is giving way to “impact or perish.”
For example, one of the most misused metrics is the JIF. This metric was designed to help librarians choose the most popular journals based on citations, not to measure the worth of individual articles or researchers. To quote Sir Mark Walport, “... impact factors are a rather lazy surrogate. We all know that papers are published in very best journals that are never cited by anyone ever again. Equally, papers are published in journals that are viewed as less prestigious, which have a very large impact. We would also argue that there is no substitute for reading the publication and finding out what it says, rather than either reading the title of the paper or the title of the journal.” In spite of such concerns, JIFs continue to be used to appraise the research output of academic faculty and institutions in many countries; in some, publication in a journal with an impact factor below 5 is not counted at all.
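The point that a journal-level average says little about any single paper can be illustrated with a small sketch. It assumes a simplified two-year JIF (citations received in one year by items published in the previous two years, divided by the number of those items, treating every item as citable). The citation counts and the function name `journal_impact_factor` are hypothetical, chosen only for illustration, and are not data from any real journal.

```python
from statistics import median


def journal_impact_factor(citations_per_item):
    """Simplified JIF-style ratio: total citations divided by number of items.

    Assumption: every item counts as "citable"; real JIF calculations
    distinguish citable items from other content.
    """
    if not citations_per_item:
        raise ValueError("need at least one citable item")
    return sum(citations_per_item) / len(citations_per_item)


# Hypothetical citation counts for ten papers in one journal (illustrative only).
hypothetical_citations = [0, 0, 0, 1, 1, 2, 2, 3, 11, 80]

jif = journal_impact_factor(hypothetical_citations)

# The median describes the typical paper better than the mean does
# when a few highly cited papers dominate the total.
typical = median(hypothetical_citations)

# Number of papers cited less often than the journal-level average.
below_average = sum(c < jif for c in hypothetical_citations)

print(f"JIF-style average: {jif}")
print(f"Median citations per paper: {typical}")
print(f"Papers below the average: {below_average} of {len(hypothetical_citations)}")
```

In this made-up example, one heavily cited paper lifts the average well above what most papers in the journal receive, so the journal's figure is a poor proxy for the citations of any individual paper.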
This misuse of the JIF to assess faculty has converted journals to mints and research papers to currency. Moreover, it has distorted the meaning of “impact.” Impact should convey an “effect” that comes after the cause, i.e., the citations generated after publication of a paper. The JIF, on the other hand, confers a valuation on the paper before it gathers any citations, if any, or even before anyone reads that particular paper, if ever. Although the JIF is designed to evaluate journals and not individual papers, for all practical purposes it has come to represent the face value of the paper. Academic institutions recognize this face value irrespective of the actual worth of the paper. Papers in journals with a high JIF are like money in the bank.
The concept of publications as currency has, in some instances, spawned a black market for scientific papers. In this bazaar, gift authorships are passé, replaced by purchased authorships. Other questionable practices include rigged peer reviews, in which authors have created fake email addresses and posed as external experts to review their own papers. Retraction Watch, a watchdog blog on publication ethics, has reported more than 600 cases of rigged peer review.
Besides these blatant breaches of publication ethics, the “impact or perish” ethos has precipitated other types of misdemeanors. These include promoting journal self-citation or forming citation cartels between chains of journals.
Lesser breaches of integrity at the level of the individual researcher are much more common and collectively affect the validity of, and trust in, scientific research. These include cherry-picking in reporting, P-hacking by data dredging, and hypothesizing post hoc, that is, after the results are known, instead of a priori. All these measures are intended to make the study more interesting and statistically significant, improving its chances of being accepted by a journal with a high impact factor and thereby securing tenure and promotion for the researcher.
But should all the blame for questionable research practices fall on the individual researcher? If one examines the academic environment, the “web of causation” for such practices becomes evident. University rankings, at both the national and global levels, count faculty publications and citations. Increasingly, universities are hiring consultants in an attempt to improve these metrics. Higher rankings draw more grants, prestige, students, alumni donations, and revenue potential. With such high stakes, the pressure is transmitted to faculty to “publish or perish” or, more ambitiously, “impact or perish.” Academic institutions, under compulsion, drive their faculty to increase research “outputs” to enhance ratings by accreditation bodies. A vicious cycle has set in with the advent of jargon and metrics to rate faculty and institutions.
This is the lay of the land. The present trend of using jargon and metrics (shall we call it “jargonometrics”?) to appraise professional worth is changing perspectives. “Concentrate on your research. If your research is good, no one will care if you can teach. After all, when was the last time someone got tenure for being a good teacher?” Young faculty starting their careers commonly get this advice from their mentors.
These developments may precipitate a paradox. The highest rated faculty and the highest rated institutions may turn out to be the worst, similar to the paradox precipitated by rating villages with quantitative metrics on various parameters. The proof of the pudding is in the eating. The real quality of villages is known to their inhabitants and cannot be captured by quantitative metrics. In the same way, students can identify the teachers who ignite and inspire and those who are remedies for insomnia.
Problems of evaluation exist both in war and peace. As mentioned in a book on modern warfare, “No combat-ready unit has ever passed inspection: No inspection-ready unit has ever passed combat.”
Academic institutions undergo many inspections for various forms of accreditation and for ranking at national and international levels. Consequently, pressure to publish may be felt by academic faculty. We hope that the obsession with “jargonometrics” to ensure inspection-readiness does not detract from the prime purpose of these institutions, which is to teach and inspire young people.
References
Belcher B, Palenberg M. Outcomes and impacts of development interventions: Towards conceptual clarity. Am J Eval 2018;39:478-95.
Hatch A. To fix research assessment, swap slogans for definitions. Nature 2019;576:9.
Biagioli M, Lippman A. Introduction: Metrics and the new ecologies of academic misconduct. In: Biagioli M, Lippman A, editors. Gaming the Metrics. Misconduct and Manipulation in Academic Research. Cambridge: The MIT Press; 2020. p. 1-23.
House of Commons Science and Technology Committee. Peer Review in Scientific Publications. Eighth Report of Session 2010-12. HC 856. London: Authority of House of Commons, the Stationery Office Limited; 2011.
Alberts B. Impact factor distortions. Science 2013;340:787.
Hvistendahl M. China's publication bazaar. Science 2013;342:1035-9.
Dunnigan J. The primary law of warfare: Murphy's. In: How to make War? A Comprehensive Guide to Modern Warfare for the Post Cold War Era. 3rd ed. New York: William Morrow and Company Inc.; 1993. p. 338-49.