Causes of poor reproducibility in biomedical research
In a previous post, I highlighted a symposium that was held to improve the reproducibility of biomedical research. The published report includes a description of the causes and factors associated with poor reproducibility; these are summarized below.
Key causes and factors linked to poor reproducibility
False discovery rate and small sample sizes. The false discovery rate is the expected proportion of false positives among a set of statistically significant results, and it depends on the proportion of tested hypotheses that are actually true. For example, if only 10% of the hypotheses tested in a field are actually true and all studies have a statistical power of 80% (at the conventional significance level of 0.05), more than one third of significant results (36%) will in fact be false positives.
The statistical power of a study depends on its sample size. Unfortunately, many research disciplines are plagued by small sample sizes (see this post for a discussion), which greatly increases the false discovery rate. If the statistical power of the studies in our previous example were only 20%, the false discovery rate would soar to 69%! That is, more than two thirds of statistically significant results would in fact be false positives.
Researchers could mitigate this problem by adopting a more stringent significance level (e.g., α = 0.001).
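The figures above are easy to reproduce. Here is a minimal sketch, assuming the conventional significance level of 0.05 where none is stated: among all tests run, a fraction α of the truly null hypotheses reach significance (false positives), while a fraction equal to the power of the truly real ones do (true positives).

```python
def false_discovery_rate(prior_true, power, alpha=0.05):
    """Expected share of significant results that are false positives."""
    false_pos = alpha * (1 - prior_true)  # truly null hypotheses that reach significance
    true_pos = power * prior_true         # true hypotheses that are correctly detected
    return false_pos / (false_pos + true_pos)

# 10% of tested hypotheses true, 80% power -> 36% false discoveries
print(round(false_discovery_rate(0.10, 0.80), 2))         # 0.36
# Power drops to 20% -> 69% false discoveries
print(round(false_discovery_rate(0.10, 0.20), 2))         # 0.69
# Stricter threshold (alpha = 0.001) at 80% power -> ~1% false discoveries
print(round(false_discovery_rate(0.10, 0.80, 0.001), 3))  # 0.011
```

The last line shows why a stricter significance level helps: shrinking α cuts the false-positive stream directly while leaving the true-positive stream untouched.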
Small effect sizes. In many fields, a large proportion of the most easily observed phenomena have already been discovered, so researchers are investigating increasingly subtle effects that are harder to detect. Because initial estimates of an effect tend to be inflated and to shrink with repeated testing, power calculations based on those early estimates (which depend on the assumed effect size and the chosen level of statistical power) will yield erroneously small sample sizes.
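To see how an inflated effect-size estimate translates into an underpowered study, consider the standard normal-approximation formula for a two-group comparison, n per group ≈ 2((z₁₋α/₂ + z₁₋β)/d)², where d is the standardised effect size. This worked example is my own illustration, not taken from the symposium report:

```python
import math
from statistics import NormalDist

def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-group comparison
    (normal approximation), given a standardised effect size d."""
    z = NormalDist().inv_cdf
    needed = 2 * ((z(1 - alpha / 2) + z(power)) / effect_size_d) ** 2
    return math.ceil(needed)

# An inflated initial estimate (d = 0.5) suggests ~63 subjects per group...
print(n_per_group(0.5))  # 63
# ...but if the true effect is smaller (d = 0.3), ~175 per group are needed.
print(n_per_group(0.3))  # 175
```

A study sized for the inflated estimate would thus recruit barely a third of the subjects it actually needs, leaving it badly underpowered against the true effect.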
Exploratory analyses. Researchers are under pressure to publish, and positive results that tell a story are more likely to get published than negative, null or inconclusive results. The motivation to find a significant result can lead researchers to explore their data until positive results are found. This type of flexible data exploration increases the likelihood that significant findings are, in fact, spurious. Exploratory analyses should be presented as such; to present this type of analysis as if it were hypothesis driven is misleading and contributes to the bias towards positive results in the published literature. While Hypothesising After the Results are Known (HARKing) is unscientific, it is almost impossible to identify once the results have been published.
Flexible study design. Many researchers use flexible study designs that increase the likelihood that positive results are found and therefore published. However, the greater the flexibility in a study's design, the less likely its results are to be true. Researchers should be clear about the study aims and analyses before a study is carried out, and these should be accurately reported in publications. Selective analysis and selective reporting of results can give researchers a false sense of confidence that what they have discovered is a true effect.
Conflicts of interest and introduction of bias. Funding sources (e.g., drug companies) are a potential source of bias, and it is important that they be transparently reported in studies. However, non-financial factors, such as commitment to a scientific belief or career progression, can also introduce bias. Researchers should therefore be aware of potential biases when designing and analysing their studies to ensure appropriate controls are in place. This is harder than it seems because biases are often unconscious and thus difficult to identify.
High-profile scientific fields. Scientists are incentivised to generate novel findings and publish in journals with high impact factors. However, there is evidence that results published in these journals tend to overestimate true effect sizes. As it stands, there is little incentive for authors to publish negative results, and they may feel it is not worth devoting time and effort to publishing such findings in low-ranking journals. This mentality is partly to blame for the bias towards publishing positive results, and this incentive structure creates an environment where questionable research practices will continue to thrive.
From the outside, these causes and factors are obvious. However, they are difficult to identify in one's own research activities, especially when the novelty and quantity of one's research output directly impact research funds, salaries and career advancement. Importantly, scientists are currently incentivised not to address these problems; doing so would harm their careers.