Markdown for science and academia – options and commands
Markdown is simple to write, and many websites, such as GitHub, can render Markdown files. That is, they present the content of the Markdown file as intended rather than as plain text. For example, the introduction (i.e. README.md) document of one of my coding projects looks like this on GitHub.
But for scientists and academics, the beauty and power of Markdown shine when it is combined with Pandoc. Pandoc lets us control how our document is converted and formatted, either through command-line options when we run the Pandoc program or through options embedded directly in our document using a YAML header.
Pandoc has lots of command-line options. While these may seem intimidating at first, you will grow to love them.
You can read a description of all the command-line options in the Pandoc manual.
Alternatively, you can type pandoc --help on the command line to list all the available options:
pandoc [OPTIONS] [FILES]
  -f FORMAT, -r FORMAT  --from=FORMAT, --read=FORMAT
  -t FORMAT, -w FORMAT  --to=FORMAT, --write=FORMAT
  -o FILE               --output=FILE
                        --data-dir=DIRECTORY
  -M KEY[:VALUE]        --metadata=KEY[:VALUE]
                        --metadata-file=FILE
  -d FILE               --defaults=FILE
                        --file-scope
  -s                    --standalone
                        --template=FILE
  -V KEY[:VALUE]        --variable=KEY[:VALUE]
                        --wrap=auto|none|preserve
                        --ascii
                        --toc, --table-of-contents
                        --toc-depth=NUMBER
  -N                    --number-sections
                        --number-offset=NUMBERS
  ...
Let’s have a look at some simple command-line options and how we might use them. To illustrate the effect of these command-line options, we will be working with a sample Markdown file, which is accessible here.
Let’s start by generating a simple PDF document by running the following command:
pandoc notes.md --output=notes.pdf
The above command uses one command-line option, --output. This option tells Pandoc the name of our output file, and Pandoc infers the output format (here PDF) from the file extension.
Now let’s add a table of contents to our file using the --toc command-line option. We can also use the --bibliography command-line option we saw in a [previous post][study notes] to properly format our references:
pandoc notes.md --toc --bibliography ref_list.bib --output=notes.pdf
That was easy and looks nice, but we might want to limit the depth of our table of contents to the first two heading levels. That is, we don’t want deeper headings such as Interview to appear. This is easily achieved using the --toc-depth command-line option:
pandoc notes.md --toc --toc-depth=2 --bibliography ref_list.bib --output=notes.pdf
Command-line options – passing LaTeX-specific options
When preparing a LaTeX document, we often want to specify options such as the size of the page or whether to use one or two columns. These options can be passed to Pandoc using the --variable command-line option.
For example, let’s add a header on each page, specify we want to use two columns, and set our paper size to A4:
pandoc notes.md \
  --toc \
  --toc-depth=2 \
  --number-sections \
  --bibliography ref_list.bib \
  --variable pagestyle=headings \
  --variable classoption=twocolumn \
  --variable papersize=a4 \
  --output=notes.pdf
NOTE: To make Pandoc commands easier to read, you can put each command-line option on its own line by ending every line except the last with a backslash (\).
To provide a final example, we will apply one of the options described in the document we have been generating. Specifically, we will add --variable fontfamily=arev to our command-line options to generate a document that uses a sans serif font.
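Put together, the full command might look like the sketch below. It reuses the options from the two-column example and only adds fontfamily=arev. It assumes notes.md and ref_list.bib are in the current directory and that the LaTeX arev package is installed; the guard around the call is an addition for illustration, so the script reports a missing prerequisite instead of failing.

```shell
#!/bin/sh
# Sketch: rebuild the notes with a sans serif font (arev).
# Assumes notes.md and ref_list.bib sit in the current directory.
src=notes.md
out=notes.pdf

if command -v pandoc >/dev/null 2>&1 && [ -f "$src" ]; then
  # papersize takes "a4": Pandoc's default LaTeX template appends "paper".
  pandoc "$src" \
    --toc \
    --toc-depth=2 \
    --number-sections \
    --bibliography ref_list.bib \
    --variable pagestyle=headings \
    --variable classoption=twocolumn \
    --variable papersize=a4 \
    --variable fontfamily=arev \
    --output="$out"
else
  echo "Skipping: pandoc or $src not found" >&2
fi
```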
Rather than type our options on the command line, we can include them in the Markdown file itself as a YAML header.
YAML is a file format often used for configuration files. When used in a Markdown file, the YAML block goes at the very top of the file and starts and ends with a line of three dashes (---).
Here is what it would look like to include the various command-line options we used in our last example:
---
classoption:
- twocolumn
pagestyle:
- headings
papersize:
- a4
toc: true
toc-depth: 2
bibliography: ref_list.bib
numbersections: true
fontfamily: arev
---

# Sans serif fonts

Without using another pdf engine (which would require using Pandoc's `--pdf-engine` option), there are a few ways to obtain sans serif fonts.
With all these options specified in the Markdown file itself, we can run Pandoc as follows:
pandoc notes.md --output=notes.pdf
Another way to work with Markdown/Pandoc is to specify our various options in a dedicated YAML file. This helps keep our Markdown files clean and allows us to reuse various option combinations (e.g. draft manuscript, study notes, letters, webpage).
So, let’s move the options from our YAML header to a dedicated YAML file:
---
variables:
  classoption:
  - twocolumn
  pagestyle:
  - headings
  papersize:
  - a4
  fontfamily: arev
number-sections: true
toc: true
toc-depth: 2
bibliography: ref_list.bib
...
A few things to note here: the file starts with --- and ends with ..., and the LaTeX-specific options are listed under the variables key.
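A file like this is passed to Pandoc with the --defaults option that appears in the pandoc --help listing. The sketch below assumes the settings were saved as notes-pdf.yaml, which is a hypothetical filename chosen for illustration; the guard around the call is likewise an addition, so that the script reports a missing file instead of failing.

```shell
#!/bin/sh
# Sketch: build the PDF from a defaults file instead of a long command line.
# notes-pdf.yaml is a hypothetical name for the dedicated options file.
defaults=notes-pdf.yaml

if command -v pandoc >/dev/null 2>&1 && [ -f "$defaults" ] && [ -f notes.md ]; then
  pandoc notes.md --defaults="$defaults" --output=notes.pdf
else
  echo "Skipping: pandoc, notes.md or $defaults not found" >&2
fi
```

Keeping one such file per kind of document (draft manuscript, study notes, letters) means the same Markdown source can be rendered differently just by changing the --defaults argument.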
We just learned how to use various Pandoc (and LaTeX) options to modify how our document is rendered. These options can be passed on the command line, in a YAML header, or in a dedicated YAML file.
And remember, we have been focusing on writing a Markdown file and converting it to PDF (or html) using Pandoc and LaTeX. However, Pandoc can convert between many file formats. So the same document could be rendered to odt if you want to share it for editing with colleagues who are intimidated by Markdown files (not sure why they would be, they are so straightforward!).
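As a rough sketch of that idea, the loop below renders the same notes.md to several formats by changing only the output file extension. The html and odt targets come from this post; docx is an extra format added here for illustration. The guard is also an addition, so the loop reports a missing prerequisite instead of failing.

```shell
#!/bin/sh
# Sketch: render the same Markdown source to several formats.
# Pandoc picks the output format from each file extension.
formats="html odt docx"

for ext in $formats; do
  if command -v pandoc >/dev/null 2>&1 && [ -f notes.md ]; then
    # --standalone produces a complete document rather than a fragment.
    pandoc notes.md --standalone --output="notes.$ext"
  else
    echo "Skipping notes.$ext: pandoc or notes.md not found" >&2
  fi
done
```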