Errors in science: I make them, do you? Part 2
Science and computers go hand in hand. But so do humans and errors.
That is why in my previous post I talked about my experience of how science, a human endeavour, does not always apply due diligence to avoid errors. I don’t think this is intentional; rather, it is a mix of ignorance, bias, and a lack of proper training.
Continuing on this exploration of errors in science, I again delve into my personal experience and a few things I have seen along the way.
Things I have seen along the way
As a scientist, you don’t often get to poke your nose into the data and statistical analysis practices of your collaborators, colleagues and research community. Similarly, others do not often get to see the chaos that was/is my data analysis pipeline. I realise there are researchers and research groups that are open and transparent about such things, but I have yet to experience this myself.
Oftentimes, there is an implicit level of trust: trust that your collaborators know what they are doing, and that they are as careful as you are. Given the pressure to publish, this trust also reduces one’s workload. By not inspecting and verifying the work of our collaborators, we have more time to focus on our own.
I often see this type of trust applied by supervisors to their graduate students and post-doctoral researchers. As long as the results that are presented are sensible, it seems many supervisors see no reason to look more deeply. This somewhat hands-off approach to research is worrisome.
I have often gone along with this status quo, not asking my collaborators to show me how they stored and analysed the data. However, there have been a few instances where I asked to see the data…and that is when things got ugly!
Because of the implicit level of trust that is afforded to students and collaborators, it seems that my formal request to see the data was taken to mean that I did not trust my collaborators.
In one case, a junior scientist repeatedly refused to share the data with me. I had to make several demands to the senior scientist to finally be able to see the data, data that I helped collect.
In this and another case that I remember vividly, a quick scan of the spreadsheets revealed dozens of errors. Was I surprised? Not really. Not because I expected these particular people to make errors, but because people in general make errors: to err is human. However, had I not pushed to see the data, these errors would likely have made their way into the published literature, with no one the wiser. Moreover, if the spreadsheet were not made available with the manuscript, it would be almost impossible to find these errors.
Unfortunately, we are not very good at finding our own mistakes, and we grossly underestimate the number of mistakes we make.
This is why scientists need to apply strategies to identify and correct such errors. We need to be humble enough to know errors are possible, and be thankful when someone offers to go over our work. It is much better to catch errors before they make it into a published manuscript!
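One such strategy is to automate the checking itself: encode your expectations about the data as a script that flags suspicious rows, so that verification does not depend on eyeballing a spreadsheet. The sketch below is a hypothetical illustration — the column names (`id`, `age`) and the plausible range are invented for the example, not taken from any real dataset.

```python
# A minimal sketch of automated sanity checks for a CSV data file.
# The column names and the expected range below are hypothetical;
# adapt them to the variables in your own dataset.
import csv


def check_rows(path):
    """Return a list of (row_number, message) for suspicious rows."""
    problems = []
    with open(path, newline="") as f:
        # Start counting at 2 because row 1 is the header.
        for i, row in enumerate(csv.DictReader(f), start=2):
            if row["id"].strip() == "":
                problems.append((i, "missing participant id"))
            try:
                age = float(row["age"])
            except ValueError:
                problems.append((i, f"non-numeric age: {row['age']!r}"))
                continue
            if not 0 <= age <= 120:  # hypothetical plausible range
                problems.append((i, f"age out of range: {age}"))
    return problems
```

Even a handful of checks like these, run every time the data file changes, will catch the typo-level errors that are otherwise invisible until someone else opens the spreadsheet.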
While there is a move towards sound science, most scientists still feel that publishing (lots of) high-impact papers is the key criterion used to assess their performance. With such a system in place, there is little incentive to slow down and check every line of code, or every cell in a spreadsheet.
But, as Bob Dylan so eloquently put it: “the times they are a-changin’”. The innate trust that has been afforded to scientists must be revised and replaced with healthy skepticism.
As open science practices become more widespread, so too will our humility. We will learn to embrace the scrutiny because it will lead to a greater level of confidence in our results. But it is not easy to have errors pointed out.
This is why it is important that we, as a community, make this transition with a light touch.
The final post in this series will look at the distinction between scientists who do code and those who do not code.