(This post is co-authored with Ferenc Borondics)
This blog has recently discussed problems of harmful content and its amplification by recommendation engines. Here, we examine a very different class of potentially harmful content in academic literature. Credibility is one of the most important features of scientific publishing. It is achieved by several means: the professional recognition of the authors through their personal work, the reputation of their institutes, and, last but not least, the prestige of the scientific journal and its peer review process.
Unfortunately, peer review is often (extremely) slow, involving multiple rounds of reviews, answers, arguments, counter-arguments, corrections and so on. It often takes many months to get a paper accepted. However, especially when a field is very hot, cutting-edge results come from multiple labs and, understandably and rightfully, the teams would like to receive their deserved recognition for being the first to report an important discovery. The incentive, therefore, is to publish results on a platform that documents the contribution before peer review is finished. This can be done through preprint servers, which exist for many scientific fields. Unfortunately, just like most tools, they can be misused, for example as proxies to disseminate low-grade scientific literature. An article in Nature highlighted this danger almost two years ago, and preprints are growing fast: around that time, in one database, preprints had been growing ten times faster than journal articles.
Recently, we came across this paper, whose authors conclude that they “can advise Vitamin D supplementation to protect against SARS-CoV2 infection”. The paper looks quite legitimate: the authors have university affiliations, and the style and format appear scientific. It’s not immediately obvious from the page whether the paper has even been submitted for peer review. And this preprint is not insignificant: as of this writing, it had over 90,000 views. Let’s take a look at what it says.
The authors looked at country-level correlations between mean vitamin D levels and COVID-19 outcomes and found that countries with lower mean vitamin D levels tend to have worse COVID-19 rates. This is the same method that was used to show that eating more chocolate increases your chance of winning a Nobel Prize [1]. It is a textbook example of the ecological fallacy, in which “inferences about the nature of individuals are deduced from inferences about the group to which those individuals belong”, and it can lead to deducing causality from correlation. There are all sorts of possible alternative explanations for a correlation between a country’s mean vitamin D level and its COVID-19 infection rate: for example, countries with better-developed health-care systems might tend to take better care of both vitamin D levels and COVID-19 infections. This illustrates the classic statistical adage, “correlation is not causation”.
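To make the confounding argument concrete, here is a small simulation sketch. Everything in it is invented for illustration: the `care` variable (a stand-in for health-care quality), the numeric ranges, and the function names are our assumptions, not data or methods from the paper. In this toy world, vitamin D has no effect at all on an individual's infection risk; both quantities simply depend on the country's `care` score.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_countries=40, n_people=500, seed=0):
    """Return (country-level correlation, mean within-country correlation).

    All numbers are hypothetical. Individual infection risk depends only on
    the country's care score, never on that person's vitamin D level.
    """
    rng = random.Random(seed)
    country_vit_d, country_rate, within = [], [], []
    for _ in range(n_countries):
        care = rng.random()            # hypothetical confounder in [0, 1]
        p_infect = 0.30 - 0.25 * care  # better care -> lower infection risk
        # Better care also raises vitamin D levels (arbitrary units).
        vit_d = [rng.gauss(40 + 40 * care, 10) for _ in range(n_people)]
        # Infection is drawn independently of each person's vitamin D.
        infected = [1 if rng.random() < p_infect else 0 for _ in range(n_people)]
        country_vit_d.append(sum(vit_d) / n_people)
        country_rate.append(sum(infected) / n_people)
        within.append(pearson(vit_d, infected))
    between = pearson(country_vit_d, country_rate)
    return between, sum(within) / n_countries


if __name__ == "__main__":
    between, within = simulate()
    print(f"country-level correlation:       {between:.2f}")
    print(f"mean within-country correlation: {within:.2f}")
```

Under these assumptions, the country-level correlation between mean vitamin D and infection rate should come out strongly negative, while the correlation among individuals inside any one country hovers around zero. An analyst who saw only the country averages could be tempted to recommend supplementation, even though in this toy world it would do nothing.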
The paper is interesting and the findings might even warrant further investigation, but the conclusion, “We believe that we can advise Vitamin D supplementation to protect against SARS-CoV2 infection”, is irresponsible. While vitamin D supplementation (with reasonable dosage) is unlikely to cause harm, it’s entirely possible that those following the recommendations of the paper may think that taking extra vitamin D means they can be less careful about hygiene or social distancing. There is significant literature about the harms of ineffective medical treatments. This research has already been picked up by the popular media. Although the article includes an appropriate disclaimer, it’s unlikely that such a warning is adequate to prevent the spread of misinformation.
With this post we wanted to bring attention to a possible interaction between preprint systems, popular and social media, and readers that all contribute to the spread of harmful misinformation wrapped in the cloak of scientific credibility. As we said in the opening paragraphs, tools can be and are misused. In fact, preprint servers are wonderful systems enabling quick and structured dissemination of research results without the lengthy process of peer review. They are the best open access routes for scientific publication in contrast with those journals that simply shift the publication cost from the reader to the author. The maintainers and funders of preprint systems deserve praise for their efforts in furthering the world of free knowledge.
[1] The chocolate and Nobel Prize paper somehow survived peer review, which shows that peer review is not a perfect filter either.