Formalizing the definition of reproducibility and replicability
As highlighted in previous posts (e.g., 1, 2, 3), reproducibility and replicability are key features of scientific studies that have received considerable attention in the popular press and the scientific literature. However, as a recent paper by Patil, Peng & Leek points out, there is no consensus on what these terms mean. The authors therefore set out to formalize them.
The problem: overlapping and conflicting definitions
In their article, Patil et al. (2016) explain that a major initiative in psychology defines
reproducibility as conducting experiments again, including data collection, whereas in cancer biology
reproducibility refers to recalculating results using the same data and code. Similarly, in human genetics
replication is often used to refer to a pair of independent studies producing the same result with similar levels of statistical significance.
Replication has also been used to refer to redoing experiments, as well as recreating results using the same data and code.
A statistical model of reproducible and replicable science
There have been attempts to formalize the meaning and scope of these terms, but as pointed out by Patil et al., these attempts have not provided a statistical model of reproducibility and replication. In their paper and the accompanying supplement, Patil et al. develop a statistical model of the scientific process, which includes statistical definitions for all its key aspects. These aspects are listed on the left side of Figure 1. The authors argue that unless all of these aspects of a study are known, it is not possible to properly interpret the results. Importantly, these aspects should also be part of studies investigating reproducibility and replication.
Interpreting studies on reproducibility and replication
When Patil et al. considered papers that have influenced scientific and public opinion on the reproducibility of science, they found that some publications omitted one or more of these key aspects, whereas other publications reported them incompletely or erroneously. Their findings are summarized in Figure 1:
The bottom line is that, whether you are studying the effectiveness of a new drug, the social effects of disability or the reproducibility of scientific claims, the same criteria should be applied to judge the quality of all published studies. Without knowledge of all key aspects of the scientific process, it is not possible to properly assess the quality of the science.