Simple rules for making research software more robust

As a new programmer, I still find it takes a while to master a language and adopt good coding practices. Novice and experienced programmers alike recognise how difficult it is to write code that is general enough to be used in future projects and by other users. To address these difficulties, Morgan Taschuk and Greg Wilson offer some simple rules to make research software more robust. Specifically, they suggest ways to write code that (1) is relatively easy to install on more than one computer, (2) does what the programmer says it does, and (3) can be integrated with other software tools. Here is a short summary of some of their points:

1. Use version control

I know we say this a lot at Scientifically Sound, but without version control there really is no reliable way for you (i.e. your future self) or others who use your code to keep track of changes to the code, what the code does, and which versions were released. Think of version control as being similar to Dropbox, with the added benefit that you can see the differences between versions and synchronise files between many programmers and machines. Check out our tutorials on Git for version control and on GitHub, the web-based hosting service for Git repositories.

2. Document your code and usage

There are plenty of examples of how NOT to document code, such as redundant comments that merely restate what the code does. The key idea is to write good documentation in the README file. The documentation should explain what the software does, list dependencies, provide installation instructions, and give a few example commands to get a user started.

3. Make common operations easy to control

Write a program so that the parameters that change the most often can be configured from the command line. To Windows and Mac users, it might seem a little strange to put extra work into writing programs with command line functionality. I started to appreciate the benefits of this when testing small programs written by others and tweaking different parameters at the command line. If nothing else, it helped me think about how to write more generalisable code for future projects that tend to require the same functions.
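As a minimal sketch of this idea in Python, the standard `argparse` module can expose the parameters that change most often as command line options with sensible defaults. The option names (`--infile`, `--window`) and the smoothing scenario are hypothetical, chosen only for illustration; they do not come from the paper.

```python
import argparse


def parse_args(argv=None):
    """Parse the options a user is most likely to change between runs.

    When argv is None, argparse reads the real command line (sys.argv).
    """
    parser = argparse.ArgumentParser(
        description="Smooth a signal with a moving average (hypothetical example)."
    )
    parser.add_argument(
        "--infile", default="data.csv",
        help="path to the input file (default: data.csv)",
    )
    parser.add_argument(
        "--window", type=int, default=5,
        help="moving average window in samples (default: 5)",
    )
    return parser.parse_args(argv)


# Passing a list stands in for typing: python smooth.py --window 11
args = parse_args(["--window", "11"])
print(f"Smoothing {args.infile} with a window of {args.window} samples")
```

In a real script, calling `parse_args()` with no list reads whatever the user typed at the command line, and `--help` prints the option descriptions for free.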

4. Version your releases

For the slightly more advanced programmers, release your software with a version stamp so the same version can be retrieved in future. This way, software published with a paper can be used to reproduce the results, and a particular software version executed with the same parameters should give identical results no matter when it is run.
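One possible way to do this in Python is to keep a single version string in the code, report it with a `--version` option, and save it alongside the parameters of each run. This is a sketch under assumptions: the version number, program name and `--threshold` parameter below are made up for illustration.

```python
import argparse
import json

__version__ = "0.3.1"  # hypothetical version number, updated at each release


def run_metadata(params):
    """Return a record of the software version and parameters used for a run."""
    return {"version": __version__, "parameters": dict(params)}


def build_parser():
    """Build a parser that can report the version of the software."""
    parser = argparse.ArgumentParser(prog="analysis")
    # argparse's built-in "version" action prints the string and exits.
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {__version__}"
    )
    parser.add_argument("--threshold", type=float, default=0.5)
    return parser


# Simulate a run with one parameter changed from its default.
args = build_parser().parse_args(["--threshold", "0.8"])

# Saving this record with the results shows which version produced them.
print(json.dumps(run_metadata(vars(args)), sort_keys=True))
```

Storing the record next to the output files means a result can later be traced back to the exact release and settings that generated it.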

5. Reuse software (within limits)

The whole point of coding is that you don’t have to keep writing the same thing over and over again. Programmers often want to reuse software written by others. However, reusing software (e.g. existing libraries or separate executables) creates dependencies, which may affect how easily your code can be run in future when those dependencies change. The authors suggest these guidelines: (1) make sure you really need the dependencies, (2) if launching an executable, ensure the appropriate software and version are available, and (3) ensure the software being reused is itself robust.
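The second guideline can be sketched in Python using only the standard library. The helper names `find_tool` and `tool_version` are hypothetical, and the sketch assumes the external program prints its version when called with `--version`, which is common but not universal. The Python interpreter itself is used as the example program because it is guaranteed to be present.

```python
import shutil
import subprocess
import sys


def find_tool(name):
    """Return the full path to an executable, or fail with a clear message."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"Required program '{name}' was not found on PATH")
    return path


def tool_version(path, flag="--version"):
    """Run the program with its version flag and return the first output line."""
    result = subprocess.run(
        [path, flag], capture_output=True, text=True, check=True
    )
    # Some programs print their version to stderr rather than stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output.strip() else ""


# Check the running Python interpreter as a stand-in for an external tool.
print(tool_version(sys.executable))
```

Running a check like this before the analysis starts lets the program stop early with a readable error, instead of failing halfway through because a program is missing or is the wrong version.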

Reference

Taschuk M and Wilson G (2017) Ten simple rules for making research software more robust. PLoS Computational Biology 13(4):e1005412.

 
