Writing your PhD thesis with bookdown and LaTeX

In this post I describe how I wrote my PhD thesis using the bookdown R package and the memoir LaTeX class. I provide some recommendations and tips, often resulting from painful frustration and endless internet searches. The full source code of my thesis is available here. The final version is available either in HTML or PDF.

Premises

My dissertation is a book written in RMarkdown using the bookdown package. This project is largely inspired by Tristan Mahr’s own dissertation. Importantly, I have used the memoir LaTeX class in combination with the bookdown package. Moreover, this post is largely inspired by previous great tutorials that have been written on the topic, such as this one by Ed Berry or this series of posts by Rosanna van Hespen.

To follow this brief guide, you’ll have to install R, RStudio, and the following R packages: knitr, rmarkdown, and bookdown (as well as their dependencies). You also need a LaTeX distribution (such as MiKTeX on Windows or MacTeX on macOS) and a reference manager (such as Zotero).

Throughout the post, I assume that you are familiar with RMarkdown (if you’re not, there are plenty of resources online, such as this book).

How does it work?

All the content of my thesis is written in RMarkdown, but the transition from RMarkdown to the final PDF or HTML version can appear as black magic. As explained by Rosanna van Hespen here:

You might have heard some people talk about pandoc, and you might be wondering what it is. There is actually no need to understand what pandoc does, but if you’re curious: knitr depends on pandoc for its .md to .tex conversion. Basically, what happens is that knitr converts your .Rmd file to a .md (markdown) file. That means that nothing changes, apart from the R code chunks, which are rendered and transformed into plain markdown. This includes creating the figures and storing them. Pandoc and pdflatex then come into play to convert the .md file to a .pdf, .docx or .html file…

Basically, you write everything in RMarkdown, then knitr takes care of converting the .Rmd document to a .md (Markdown) document, and pandoc takes care of the conversion to either a .tex, a .docx, or a .html document. For the PDF output, pdflatex then compiles the .tex file into the final .pdf, as illustrated in Figure 1.

Figure 1: The role of knitr, pandoc and pdflatex in converting R Markdown to a pdf file (figure from Rosanna van Hespen’s blog).
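
To make these steps a little less magical, one option is to ask the PDF format to keep the intermediate .tex file. The fragment below is a minimal sketch of an _output.yml entry with assumed values, not necessarily what the thesis repository uses. It relies on keep_tex and latex_engine, two arguments of rmarkdown::pdf_document that bookdown::pdf_book forwards to it.

```yaml
# _output.yml (sketch with assumed values)
bookdown::pdf_book:
  keep_tex: true          # keep the .tex file that pandoc generates
  latex_engine: pdflatex  # engine that compiles the .tex file to a PDF
```

With this in place, the retained .tex file can be opened to trace a LaTeX error back to the line that caused it.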

General elements

The index.Rmd file is the central file of the thesis. Its YAML header (available here) contains basic metadata such as the author name and the title of the thesis. Importantly, it also defines the documentclass (in my case, the memoir class) and the class options. This header is also where I specify the bibliography files (in Better BibTeX format, exported from Zotero). The file also contains the (English) abstract of the thesis, which is displayed on the welcome page of the HTML version.

title: Understanding rumination as a form of inner speech
author: "" # defined in _output.yml
date: "" # defined in _output.yml
site: bookdown::bookdown_site
documentclass: memoir # using the memoir class
classoption: a4paper,12pt,twoside,onecolumn,openright,final,oldfontcommands
lot: false # deactivating the default list of tables
lof: false # deactivating the default list of figures
link-citations: yes
bibliography: [bib/thesis.bib, bib/packages.bib]
nocite: '@ito_control_2008'

The _bookdown.yml file defines the name of the output document (here “thesis”), the label for the chapters, the output directory (i.e., where the rendered documents are stored), and the .Rmd files that should be included for each output format. Here is what my _bookdown.yml file contains:

book_filename: "thesis"
chapter_name: "Chapter "
delete_merged_file: true
output_dir: "docs"
new_session: yes

rmd_files: # defines the .Rmd files to be included for each output format
  html: [
  "index.Rmd", "01-chap1.Rmd", "02-chap2.Rmd", "03-chap3.Rmd", "04-chap4.Rmd",
  "05-chap5.Rmd", "06-chap6.Rmd", "07-chap7.Rmd", "08-discussion.Rmd",
  "90-appendix_part.Rmd", "91-appendix_brms.Rmd", "92-appendix_eyetracking.Rmd",
  "99-references.Rmd"
  ]
  latex: [
  "index.Rmd", "00-abstract.Rmd", "00-resume.Rmd", "00-overzicht.Rmd",
  "00-acknowledgements.Rmd", "00-preface.Rmd", "00-toc.Rmd", "01-chap1.Rmd",
  "02-chap2.Rmd", "03-chap3.Rmd", "04-chap4.Rmd", "05-chap5.Rmd", "06-chap6.Rmd",
  "07-chap7.Rmd", "08-discussion.Rmd", "90-appendix_part.Rmd", "91-appendix_brms.Rmd",
  "92-appendix_eyetracking.Rmd", "93-appendix_data.Rmd", "99-references.Rmd"
  ]
  word: [
  "index.Rmd", "01-chap1.Rmd", "02-chap2.Rmd", "03-chap3.Rmd", "04-chap4.Rmd",
  "05-chap5.Rmd", "06-chap6.Rmd", "07-chap7.Rmd", "08-discussion.Rmd",
  "90-appendix_part.Rmd", "91-appendix_brms.Rmd", "92-appendix_eyetracking.Rmd",
  "99-references.Rmd"
  ]

Beyond the _bookdown.yml file, the most important elements of the template are listed and discussed below:

  • The preamble.tex file (in the ./latex folder) is like a usual LaTeX preamble. It loads the relevant LaTeX packages, defines some commands to be used later in the thesis (such as \initial), and defines some formatting elements. I have tried to comment the code as much as possible, but nobody’s perfect.

  • The before_body.tex (in the ./latex folder) defines elements for the cover page (specific to Univ. Grenoble Alpes). The last lines of this file may be more generally useful as they define some formatting elements for the rest of the thesis (e.g., I define the main font and set the line stretch to \OnehalfSpacing).

  • The UGA cover page template is managed by the cover_page.sty style file (in the ./cover folder).

  • Another crucial element, the _output.yml file, defines the format-specific arguments passed to the function that creates each output (here, gitbook, pdf, and word). Importantly, for the PDF output, to be able to define a citation style (using a .csl file), the citation_package argument should be set to none so that pandoc-citeproc is used (instead of natbib or biblatex, for instance).

  • A citation style can then be applied by using the pandoc_args argument of the bookdown::pdf_book function.
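
To make the last two points more concrete, here is a minimal sketch of what such an _output.yml could look like. It is an illustration under stated assumptions rather than a copy of the file in the repository: the .csl path is hypothetical, and the gitbook and word options are only placeholders.

```yaml
# _output.yml (illustrative sketch; paths and option values are assumptions)
bookdown::gitbook:
  split_by: chapter                       # one HTML page per chapter
bookdown::pdf_book:
  includes:
    in_header: latex/preamble.tex         # LaTeX packages and custom commands
    before_body: latex/before_body.tex    # cover page and global formatting
  citation_package: none                  # let pandoc format the citations
  pandoc_args: ["--csl", "csl/apa.csl"]   # hypothetical path to the .csl file
bookdown::word_document2:
  toc: true                               # placeholder option for the Word output
```

I believe more recent versions of rmarkdown call this value default rather than none, so it is worth checking the documentation of your installed version.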

Some things I have changed from previous versions of the template:

  • I deactivated the default TOC from bookdown and defined a custom one in 00-toc.Rmd to be able to define the order of the 00-*.Rmd files (e.g., abstract, preface, etc.).

  • I deactivated the LaTeX citation packages (i.e., natbib or biblatex) to be able to provide a .csl file for the references (citation_package in _output.yml needs to be none). See the pandoc_args in the _output.yml file.

  • I have manually created a list of abbreviations in the 00-toc.Rmd file and defined a LaTeX command to fill in this glossary at the bottom of preamble.tex. I tried to use an automatic list of abbreviations, such as the one provided by the glossaries package (https://www.ctan.org/pkg/glossaries), but I did not manage to make it work with bookdown.
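
As a rough illustration of the first change, the fragment below switches off the automatic table of contents through the toc argument of bookdown::pdf_book. This is a sketch of one way to do it and is not necessarily identical to the file in the repository.

```yaml
# _output.yml (sketch): disable the automatic TOC for the PDF output
bookdown::pdf_book:
  toc: false  # the TOC is then inserted by hand, e.g. from 00-toc.Rmd
```

The table of contents can then be placed explicitly with the standard \tableofcontents command, at whatever position among the front-matter files suits the thesis.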

Conclusions

I hope this post can be useful in some way. Please let me know in the comments if you have issues with the template or if you’d like to add something.

References

Allaire, J., Xie, Y., McPherson, J., Luraschi, J., Ushey, K., Atkins, A., Wickham, H., Cheng, J., Chang, W., & Iannone, R. (2019). rmarkdown: Dynamic Documents for R. R package version 1.15, https://github.com/rstudio/rmarkdown.

Berry, E. (2017, September 25). Writing your thesis with bookdown (Web log post). Retrieved from: https://eddjberry.netlify.com/post/writing-your-thesis-with-bookdown/

van Hespen, R. (2016, February 3). Writing your thesis with R Markdown (1) – Getting started (Web log post). Retrieved from: https://rosannavanhespenresearch.wordpress.com/2016/02/03/writing-your-thesis-with-r-markdown-1-getting-started/

Xie, Y. (2019). bookdown: Authoring Books and Technical Documents with R Markdown. R package version 0.13, https://github.com/rstudio/bookdown.

Xie, Y., Allaire, J., & Grolemund, G. (2018). R Markdown: The Definitive Guide. Chapman and Hall/CRC, Boca Raton, Florida. ISBN 9781138359338, https://bookdown.org/yihui/rmarkdown.

Ladislas Nalborczyk
Associate researcher

My research interests include cognitive science, motor cognition, inner speech, computational and statistical modelling.
