Why Human Judgment Must Be Primary Over Metrics (Academic Edition)

So, you’re all probably aware of the replication and fraud crisis in the scientific community. Psychology and the social sciences have been hit hardest, but the physical sciences have not been immune.

Retractions have risen sharply in recent years for two main reasons: first, sleuthing, largely by volunteers who comb academic literature for anomalies, and, second, major publishers’ (belated) recognition that their business models have made them susceptible to paper mills – scientific chop shops that sell everything from authorships to entire manuscripts to researchers who need to publish lest they perish.

These researchers are required – sometimes in stark terms – to publish papers in order to earn and keep jobs or to be promoted. The governments of some countries have even offered cash bonuses for publishing in certain journals. Any surprise, then, that some scientists cheat? (my emphasis)

And these are not merely academic matters. Particularly when it comes to medical research, fakery hurts real people. Take the example of Joachim Boldt – the German anesthesiologist who, with 186 retractions, now sits atop the Retraction Watch leader board of scientists with the most pulled papers.

The key paragraph is the second one: academics are judged on how many papers they have and how many citations those papers receive. Getting hired and getting tenure are based on those numbers. Since it’s hard to get a real full-time academic job these days, let alone get tenure, there’s a LOT at stake for academics. Publish or perish.

This isn’t how such decisions were always made, however. At one point, human judgment was given much greater sway. Hiring committees read the research, looked at the teaching, and talked to the academic. Some academics published only a few papers, but they were good papers; others were considered to have potential.

Such a system was subject to standard human abuse: hiring people who were liked, in effect. So an independent measure of academic excellence was sought, and what was settled on was citations: if your research was important, presumably other academics would refer to it.

But any metric used to make monetary decisions is quickly gamed. If you must have those citations, many people will cut corners to get them. After spending 10 years to earn a Ph.D., the idea of being part of the large majority who either get no job at all or become adjunct profs, badly paid and badly treated, isn’t palatable.

For a long time this went on and cutting corners worked: the people inside the system were those who had benefited from it, after all. Everyone knew it was occurring but the incentives to prove it were lacking. Then some outsiders started looking, people funded with outside money, and they found a ton of fraud and sloppiness.

We keep doing this: we keep seeking metrics to cut out human judgment, but it can’t be done. It’s not that metrics aren’t useful, but, again, as soon as everyone knows what the metrics are, people game them. (Note how similar this is to Google’s early metric: how many links a webpage received. Remember how good early Google was, before everyone started doing search engine optimization and before Google decided to maximize monetization.)
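To make the gaming dynamic concrete, here is a minimal sketch (in Python, with hypothetical page names that are not from the post) of a link-count metric in the spirit of early Google, and of how trivially a handful of fake pages can flip the ranking once everyone knows what is being counted:

```python
from collections import Counter

def rank_by_inbound_links(links):
    """Naive metric: score each page by how many other pages link to it.
    `links` is a list of (source, target) pairs."""
    return Counter(target for _source, target in links).most_common()

# An "honest" web: good_page earns its links on merit.
web = [
    ("blog_a", "good_page"),
    ("blog_b", "good_page"),
    ("blog_c", "thin_page"),
]
print(rank_by_inbound_links(web))
# [('good_page', 2), ('thin_page', 1)]

# Once the metric is public, gaming it is trivial: spin up fake pages
# that all point at the page you want promoted (a link farm).
link_farm = [(f"farm_{i}", "thin_page") for i in range(10)]
print(rank_by_inbound_links(web + link_farm))
# [('thin_page', 11), ('good_page', 2)]
```

The same arithmetic applies to citation counts: a citation ring or a paper mill is just a link farm for the academic metric.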

The solution isn’t to find new metrics and get back on the treadmill; it is to go back to judgment, and to review the results over time with groups of outsiders and insiders.

You can’t outsource human decisions on who gets power to algorithms. It never works and it never will, as we’re finding out with “AI”.

Just bite the bullet and take responsibility.

