Manubot: a glimpse at the future of paper writing
Scientific writing is a group effort. One person is usually charged with preparing the first draft of the manuscript, but early drafts are often reviewed by other authors, and more complete drafts are reviewed by all authors. This approach is, at least in part, a consequence of the tools that have traditionally been used to write scientific manuscripts.
Open, collaborative scientific writing brings with it a change in how papers are being written. In this post I highlight a relatively new addition to the scientific writing landscape: Manubot.
Manubot provides a way to write scientific papers (and other documents) openly and transparently. In its current form, the workflow involves writing the paper in Markdown, a lightweight markup language. The neat thing about Manubot is that the paper is hosted on GitHub (although other options are available).
Why use GitHub? It turns out the workflow used to write software lends itself well to writing documents such as scientific papers.
A central, main version of the paper exists, and co-authors create what are called branches of the paper.
Co-authors work on their branches, making changes and additions to the paper. These changes are then proposed to the lead author(s) as part of a pull-request: the changes and additions to be made to the paper accompanied by a message that explains their rationale. At this stage, one or more co-authors typically review the suggested changes and provide feedback. There may be a few rounds of feedback and changes until the pull-request is finally accepted into the main version of the paper.
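The branch-and-merge cycle described above can be tried out locally. The sketch below is only an illustration, not part of Manubot: it drives ordinary git commands from Python inside a temporary folder. The file name, branch name, commit messages and author identity are all made up for the example, and the final merge stands in for accepting a pull-request, which on a real project would happen on GitHub after review.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical author identity, passed per command so no global git config is needed.
IDENTITY = ["-c", "user.name=Example Author", "-c", "user.email=author@example.org"]


def git(repo, *args):
    """Run one git command inside repo and return its trimmed standard output."""
    result = subprocess.run(
        ["git", *IDENTITY, *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def simulate_branch_workflow(repo):
    """Create a main version, edit it on a branch, then merge the branch back."""
    manuscript = Path(repo) / "manuscript.md"

    # The central, main version of the paper.
    git(repo, "init")
    git(repo, "symbolic-ref", "HEAD", "refs/heads/main")  # name the first branch "main"
    manuscript.write_text("# Introduction\n")
    git(repo, "add", "manuscript.md")
    git(repo, "commit", "-m", "Start manuscript")

    # A co-author makes changes on their own branch.
    git(repo, "checkout", "-b", "expand-introduction")
    manuscript.write_text("# Introduction\n\nScientific writing is a group effort.\n")
    git(repo, "commit", "-am", "Expand the introduction")

    # Local stand-in for a pull-request being accepted into the main version.
    git(repo, "checkout", "main")
    git(repo, "merge", "--no-ff", "-m", "Merge expand-introduction", "expand-introduction")

    return manuscript.read_text(), git(repo, "log", "--format=%s")


if shutil.which("git"):
    with tempfile.TemporaryDirectory() as tmp:
        merged_text, history = simulate_branch_workflow(tmp)
    print(history)  # commit subjects, newest first
else:
    merged_text, history = None, ""
    print("git is not installed; skipping the demo")
```

After the merge, the main version contains the co-author's new sentence, and the history records both the change and the point at which it was accepted.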
This process fosters open discussion among co-authors. Discussions on a given topic can also be started by raising a project issue on GitHub. Then, if warranted, an author can start the branch/pull-request process.
What Manubot does is provide tools that make this workflow run more smoothly when writing scientific papers.
Key Manubot Features
As mentioned above, Manubot papers are written in Markdown. The use of a simple, plain-text format such as Markdown has several advantages. Authors can use whatever text editor they want to work on their manuscripts. There is even the possibility of writing the paper directly on GitHub using its built-in editor. Also, plain text makes it easy to use version control software, Git in this case, to track changes.
A key tool that Manubot provides is citation-by-identifier. This means that as we write our paper, we can cite a source by adding one of many types of identifiers, and Manubot will retrieve the relevant reference. When our paper is generated, the reference list will be properly and automatically created. Various identifiers can be used, including DOIs, PMIDs, PMCIDs, ISBNs, URLs and arXiv IDs. And if Manubot is too complicated for you and your team, you can still use its citation-by-identifier functionality as part of a more traditional writing workflow, given that it can be used as a Pandoc filter.
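To give a feel for what citing by identifier looks like in the Markdown source, here is a small Python sketch. It is not Manubot's own code: it simply assumes citations are written as `@prefix:identifier` inside square brackets (for example `[@doi:...]`) and lists the identifiers found in a piece of text. The regular expression is a simplification, and the identifiers in the sample text are invented placeholders, not real references.

```python
import re

# Simplified pattern for keys such as @doi:10.1234/example or @pmid:12345678.
# The identifier runs until whitespace, a closing bracket, a semicolon or a comma.
CITATION_PATTERN = re.compile(r"@([a-zA-Z]+):([^\s\];,]+)")


def extract_citations(markdown_text):
    """Return unique (prefix, identifier) pairs in order of first appearance."""
    found = []
    for prefix, identifier in CITATION_PATTERN.findall(markdown_text):
        key = (prefix.lower(), identifier.rstrip("."))
        if key not in found:
            found.append(key)
    return found


# Placeholder identifiers, made up for this example.
sample = (
    "Collaborative writing is changing [@doi:10.1234/example.5678]. "
    "Earlier work [@pmid:12345678; @arxiv:1234.56789] agrees, "
    "as does the first source [@doi:10.1234/example.5678]."
)

for prefix, identifier in extract_citations(sample):
    print(f"{prefix}: {identifier}")
```

The source cited twice appears only once in the output, which is the same idea behind an automatically generated reference list: each identifier is resolved once, however many times it is cited.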
Manubot uses several technologies that are already tied in to GitHub to automate the generation of our paper. Any time a change is committed to the main branch, it triggers a process that automatically regenerates the paper, which can then be viewed as HTML on a webpage. There are also options to export the paper to other formats such as PDF (or, if you must, .docx).
Because the project is hosted on GitHub, we can track the contributions made by each author to the manuscript. Thus, at least for the writing process, it is relatively easy to see who contributed what to the paper. This can be a direct contribution to the text that appears in the paper, or a contribution to the discussions and comments in the pull-request/discussion workflow.
Also, GitHub issues and pull-requests can be used as part of the formal review of our manuscript by a reviewer or editor, for example. Although not common, there are a few open-source journals that have their entire workflow hosted on GitHub; you can’t get more transparent than that!
Finally, given the time-stamped nature of git commits, it becomes relatively simple to demonstrate when certain ideas were formulated. If novelty and priority of discovery are major concerns, Manubot has a plugin that allows git commits and versions of documents to be timestamped using the Bitcoin blockchain.
The creators of Manubot have authored a nice PLOS Computational Biology paper; I highly recommend it for anyone interested in learning more about Manubot.
As might be expected, the authors used Manubot to write their paper on Manubot. The HTML and PDF versions of the paper that are automatically generated by Manubot can be viewed, and the GitHub repository used to write the manuscript using Manubot can be accessed here.
If you know how GitHub repositories work, you can explore the paper’s repository to see some of the workflow described above in action. For example, we can see the content of an initial pull-request:
We can also see the discussion between authors that takes place within the context of a pull-request:
Ready for the mainstream?
As indicated by the authors of Manubot, it remains a tool best suited for computational scientists: those who are already familiar with GitHub, version control, continuous integration tools, etc.
For non-computational teams, a repository for a paper could be set up by a more experienced user, with other authors more focused on the writing. But, based on its current functionality, Manubot remains a specialised tool.
While I might not be writing one of my upcoming papers using Manubot, I will definitely be keeping an eye on this project. It makes it clear that open, collaborative manuscript writing is possible, in a way that makes the science immediately and openly available to anyone interested in reading it.
Scientific publishing is in need of a serious shake-up. Open-access journals have had an important impact on the field. And tools like Manubot make it clear that other models are also possible.