Eric J Ma's Website

Tooling up for Plain Text Academic Writing in Markdown

written by Eric J. Ma on 2016-06-22 | tags: academia writing grad school


I finally got down to doing it! Here's my brief notes on how I set it up; it may change a bit later on, as my needs evolve, but it should be enough instructions for others to get setup.

Motivation

I'm writing a white paper on genomic surveillance, so I thought I'd experiment with something new. By switching over to writing in plain text, I can ignore all distractions by other word processors. I'm looking at you, Word, Google Docs and others.

Because I get to write in plain text, I also get to version control the text. By selectively committing logical blocks, I can also easily selectively revert unwanted commits; it's rare, but once in a while it's saved my work.

Because it's plain text, I can export to anywhere with a bit of configuration.

Because it's written publicly, it's great.

Overview

What I do here is basically use Sublime Text 3 (ST3) as a word processor. BibTex libraries are used for keeping track of papers, but I like Papers for organizing more than BibDesk - automatic downloading of PDFs is the best part. My compromise, then, is to export my Papers library to a master BibTex library prior to compiling everything into a single PDF. Papers can also add in Pandoc-style citations, which Pandoc can then parse to make an automatic bibliography. Everything is kept under version control, for reasons stated above.

It basically took me one morning's worth of time invested in tooling this up, but I can already imagine the amount of time saved by minimizing the time spent on formatting references, figure numbers, versioning, and (most of all) being able to write without being distracted.

Tools

  1. Papers for Mac (paid)
  2. Pandoc (free)
  3. Sublime Text (free/paid) - you can use it for free, but I decided to fess up and pay for it, the text editor is that good, that powerful, that I would encourage you to do the same too.
  4. Git and GitHub (free)

Setup

Central Bibliography

  1. Set up a version controlled GitHub repository for the master bibliography file.
  2. In Papers, export the library to that repository's directory. Take note of the /path/to/master/library.bib.

General Tooling

  1. Install Pandoc, either using homebrew, or download a fresh copy of the installer binary. Note the /path/to/pandoc.
  2. Install the LaTeX distribution, make sure that pdflatex is bundled. Note the /path/to/pdflatex.
  3. Fork the Citation Styles repository, and clone it to disk.
  4. Install the pandoc-fignos plugin, to enable automatic figure numbering. Again, take note of the /path/to/pandoc-fignos.

Sublime Text Tooling

Install:

  1. Package Control
  2. Pandoc
  3. CiteBibTex
  4. AcademicMarkdown
  5. Git
  6. GitGutter
  7. PackageSync
  8. BracketHighlighter
  9. WordCount

Configure:

Pandoc

    "pandoc-path": "/path/to/pandoc",
      "PDF": {
        "scope": {
          "text.html": "html",
          "text.html.markdown": "markdown"
        },
        "pandoc-arguments": [
          "-t", "pdf",
          "--latex-engine=/path/to/pdflatex",
          "-o", "/path/to/output.pdf",
          "--filter", "/path/to/pandoc-fignos",
          "--filter=/path/to/pandoc-citeproc",
          "--bibliography=/path/to/master/library.bib",
        ]
      }

Apart from placing these pandoc-arguments under the PDF section, you may want to do the same for the HTML and Word sections.

CiteBibTex

Find the corresponding configuration fields, and change them to the following (making appropriate changes):

    "bibtex_file": "/path/to/master/library.bib",
    "autodetect_citation_style": true,

User Interface

Today I learned that ST3 has a "Distraction Free Writing Mode", under the View menu. Earlier on, I also learned that it has a pane view mode, also under View -->Layout. Both have associated shortcut keys. My writing interface ended up looking something like what's in Figure {@fig:two-pane}.

Two pane view.{#fig:two-pane}

My outline is on the right, and the main text is on the left, and there's no distracting tabs, sliding preview, or directory structure (as is what I'm used to for coding).

Writing

Get started by adding the YAML headers in the document (Figure {@fig:yaml-header}).

YAML Headers.{#fig:yaml-header}

Specifically, the format of what I have above is:

---
title: "My Title Here"
author:
- "Author 1 (Affiliation)"
- "Author 2 (Affiliation)"
date: 22 June 2016
csl: nature.csl
---

More details on what metadata can be stored in the headers can be found on the Pandoc README.

Citations

Citations are done in Markdown by inserting:

   [@citekey]

where the citekey is automatically generated by Papers, usually in the form of LastName:YYYY[2- or 3-letter hash]. An example of what gets inserted is [@Young:2013px]. I was reading through Papers' documentation on the generation of a "universal" citekey, and I quite like the idea. I think the specification is worth a read, and is an idea worth spreading (sorry, I like TED talks).

I intentionally configured my ST3 Pandoc package settings to use a global master library, rather than a project-specific one. I think it stemmed more from laziness than anything else; one less thing to manage is a better thing.

Generating Outputs

Note that the way I had configured Pandoc (above) for PDF outputs was to use the master .bib library for matching up references. An additional thing I did was to keep a copy of the citation style language (CSL) markup file in the same directory as the Markdown document.

Within ST3, we can use the Command Palette to quickly generate the PDF output desired. Steps are:

Select Pandoc

Command Palette: Select Pandoc.

Select PDF as the output

Command Palette: Select PDF.

Inspect that gorgeous PDF!

Look at this gorgeous output!

Check your references!

Look at those perfectly Nature-formatted references!

Just to show how the figures get numbered correctly (I don't have any in the draft whitepaper I'm writing), you can inspect the source code for this blog post, and the associated pdf file. Note how I've not numbered anything except the associated files. It's pretty cool.

Alrighty - and that's it! Hope it helps others too.


I send out a newsletter with tips and tools for data scientists. Come check it out at Substack.

If you would like to sponsor the coffee that goes into making my posts, please consider GitHub Sponsors!

Finally, I do free 30-minute GenAI strategy calls for organizations who are seeking guidance on how to best leverage this technology. Consider booking a call on Calendly if you're interested!