written by Eric J. Ma on 2017-03-28 | tags: thesis academia grad school
Some short thoughts on how I used "continual publishing" to enable a single-source, multi-output, continually-updated thesis.
I've finally turned in a polished draft of my thesis (HTML or PDF) to my committee! My thesis is about developing an algorithm to identify reassortant influenza viruses from large sequence databases, and applying it to the study of influenza's evolution and ecology.
Well, actually, I finished it last week, but I've been so busy with the job hunt since then that I put off writing this blog post.
Beyond writing up the work itself, I wanted to write for PDF and for the web at the same time, so I started assembling a software toolchain that compiles my raw Markdown files, converts figures from PDF to JPG, and builds the PDF and HTML versions together. It relies on a number of Python packages, including csv2md and the pandoc-xnos series, as well as non-Python tools such as ImageMagick (https://www.imagemagick.org/script/index.php).
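For readers who want a feel for how such a single-source build can hang together, here is a minimal sketch in Python. It is not the actual build script behind the thesis. The directory layout, the output file names, the choice of pandoc as the converter, and the particular pandoc-xnos filters listed are all assumptions made for illustration. The sketch only assembles the command lines and does not run anything.

```python
from __future__ import annotations

from pathlib import Path

# Filters from the pandoc-xnos family. Which of them the thesis
# actually used is an assumption made for this sketch.
XNOS_FILTERS = ["pandoc-fignos", "pandoc-eqnos", "pandoc-tablenos"]


def figure_command(pdf_figure: Path) -> list[str]:
    """Build an ImageMagick call that rasterizes one PDF figure to JPG.

    Uses the classic `convert` entry point; ImageMagick 7 installs
    expose the same functionality as `magick`.
    """
    jpg_figure = pdf_figure.with_suffix(".jpg")
    return ["convert", "-density", "300", str(pdf_figure), str(jpg_figure)]


def pandoc_command(sources: list[Path], output: Path) -> list[str]:
    """Build a pandoc call; the output format follows the file extension."""
    cmd = ["pandoc", *[str(s) for s in sources], "--standalone"]
    for name in XNOS_FILTERS:
        cmd += ["--filter", name]
    return cmd + ["-o", str(output)]


def build_plan(src_dir: Path, out_dir: Path) -> list[list[str]]:
    """List every command for one full build, figures first.

    Assumes (hypothetically) chapters live in `src_dir/*.md` and
    figures in `src_dir/figures/*.pdf`.
    """
    chapters = sorted(src_dir.glob("*.md"))
    figures = sorted((src_dir / "figures").glob("*.pdf"))
    plan = [figure_command(fig) for fig in figures]
    plan.append(pandoc_command(chapters, out_dir / "thesis.pdf"))
    plan.append(pandoc_command(chapters, out_dir / "thesis.html"))
    return plan
```

To actually run the build, each command in `build_plan(Path("src"), Path("build"))` could be passed to `subprocess.run(cmd, check=True)`. Keeping command construction separate from execution makes the plan easy to print and inspect before anything touches the files.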
Yes, I know I could have done most of this with Authorea, but being me, I find building things and reverse engineering them kind of fun! (Especially for learning purposes.)
I hope you enjoy my thesis!
@article{
ericmjl-2017-thesis-thesis,
author = {Eric J. Ma},
title = {Thesis},
year = {2017},
month = {03},
day = {28},
howpublished = {\url{https://ericmjl.github.io}},
journal = {Eric J. Ma's Blog},
url = {https://ericmjl.github.io/blog/2017/3/28/thesis},
}