Automating my PhD


If the word “automating” made you picture clicking a button and then never having to bother about your PhD until you are called to collect your degree, sorry to disappoint you: that is not happening (yet).

But yes, over the last few months I have spent a significant amount of time automating repetitive tasks, in the hope that it will save me time in the future. The automation has two parts: macOS Automator, which makes things convenient but is not strictly required, and shell scripts, which are the backbone of everything I have worked on so far.

My initiation into the world of shell scripting was via a class I took in the January semester of 2017. The course was called “Computational Modelling of Materials” and was taught by Prof. Abhishek Singh.

My objective in taking the course was learning to use Quantum Espresso and understanding a bit of Density Functional Theory. But as a part of the course, the first few hands-on sessions introduced me to shell scripts and using them to generate input files for Quantum Espresso.

Later, when I started searching for automation tools and learning how to get my laptop to do the repetitive tasks for me, the learning from the course came in handy.

The scripts are listed below in the order in which I wrote them, along with some background on the situations that led me to write them.

The source code of all the scripts and a copy of the Automator workflows can be found here on GitHub.

Moving Files

This is a fairly simple problem. I downloaded a lot of papers and PDFs which were dumped into my Downloads folder. I wrote a simple Automator workflow that takes selected files and moves them to a specified folder I use for my Papers (or Textbooks and References).
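The same move can be done by a small shell function, which Automator's “Run Shell Script” action could call with the selected files passed as arguments. This is only a sketch of the idea, not the actual workflow: the function name `move_to_folder`, the `PAPERS_DIR` variable and its default path are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: move every file given as an argument into one destination folder.

# PAPERS_DIR is a hypothetical default; point it at your own folder.
PAPERS_DIR="${PAPERS_DIR:-$HOME/Documents/Papers}"

move_to_folder() {
    local dest="$1"
    shift
    mkdir -p "$dest"
    local f
    for f in "$@"; do
        # -n: never overwrite a file that already exists in the destination
        mv -n -- "$f" "$dest/"
    done
}

# Only act when files were actually passed in (e.g. by Automator).
if [ "$#" -gt 0 ]; then
    move_to_folder "$PAPERS_DIR" "$@"
fi
```

Keeping the destination in one variable means a second copy of the script for Textbooks and References only needs a different path.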

Generating Code Reports

I print out my code and paste it into my lab notebook. Initially, I was writing code only in MATLAB, which has a “Publish” feature. This saves a PDF copy of the code with coloured highlights and also adds any plots or results to the PDF, giving one easy-to-print code report.

Problems began when I started writing code in C. Also, in some cases, the MATLAB feature just did not work and got stuck, especially if there was a bug. I wanted to print out the code even if there were bugs, so that I could document them and annotate them after debugging for future reference.

For such cases, I wrote a workflow that takes a “.c” or “.m” file and changes the extension to “.md”. Then I enclose the code within “```” marks, so that Markdown treats it as a code block.

The Automator workflow looks something like this:

Then I use pandoc to convert the Markdown to PDF, resulting in a beautifully typeset PDF code report ready for printing. All of this is done using a script.
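A minimal sketch of those two steps, wrapping the code and then calling pandoc, could look like the following. It is not the script from the repository. The function name `wrap_as_markdown` and the `FENCE` variable are made up for illustration, the sketch writes a new “.md” file next to the source instead of renaming it, and the language tags `c` and `matlab` assume that pandoc's highlighter recognises those names. PDF output also assumes a LaTeX engine is installed for pandoc to use.

```shell
#!/usr/bin/env bash
# Sketch: wrap a source file in a Markdown code fence, then convert it with pandoc.

# Three backticks, written as octal escapes so this script can itself be
# pasted inside a fenced block without closing it.
FENCE="$(printf '\140\140\140')"

wrap_as_markdown() {
    local src="$1"
    local md="${src%.*}.md"      # sim.c -> sim.md
    local lang=""
    case "$src" in
        *.c) lang="c" ;;
        *.m) lang="matlab" ;;
    esac
    {
        printf '%s%s\n' "$FENCE" "$lang"   # opening fence plus language tag
        cat "$src"
        printf '\n%s\n' "$FENCE"           # closing fence on its own line
    } > "$md"
    printf '%s\n' "$md"                    # report the file that was written
}

# With no arguments this loop does nothing.
for src in "$@"; do
    md="$(wrap_as_markdown "$src")"
    pandoc "$md" -o "${md%.md}.pdf"
done
```

Because the source file is never executed, the report is produced even when the code has bugs, which was the point of moving away from “Publish”.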

Two Grid and Four Grid Montages

When I started running simulations in MATLAB, one problem was printing out montages of the plots at different time steps. Say I run a problem for 40 time steps. To show the evolution of the microstructure with time, I wanted to take four images (say t=0, t=5, t=20 and t=40) and arrange them into a 2 x 2 grid. To do this, I wrote a small script that uses ImageMagick to put the four selected images into a 2 x 2 layout. A typical grid would look like this:

Of course, what can be done for a 2 x 2 grid can easily be modified for a 2 x 1 or 1 x 2 grid. The scripts can be found on my GitHub page.
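For illustration, a grid like this can be built with ImageMagick's `montage` command, where `-tile` sets the layout as columns x rows, so four images fill a 2x2 tile. The sketch below is not the script from the repository. The function name `make_grid`, the output file names and the `MONTAGE_BIN` override are assumptions, and on ImageMagick 7 the tool may need to be invoked as `magick montage` instead.

```shell
#!/usr/bin/env bash
# Sketch: tile images into a grid with ImageMagick's montage tool.

# MONTAGE_BIN is a hypothetical override, handy for dry runs.
MONTAGE_BIN="${MONTAGE_BIN:-montage}"

make_grid() {
    local tile="$1" out="$2"
    shift 2
    # -tile COLSxROWS sets the layout; -geometry +4+4 adds 4 pixels of
    # spacing around each tile without forcing a thumbnail size.
    "$MONTAGE_BIN" "$@" -tile "$tile" -geometry +4+4 "$out"
}

# Pick the layout from the number of images passed in.
case "$#" in
    4) make_grid 2x2 montage_2x2.png "$@" ;;
    2) make_grid 2x1 montage_2x1.png "$@" ;;
esac
```

Since the layout is just the first argument to `make_grid`, switching between the four-image and two-image versions is a one-word change (`2x2`, `2x1` or `1x2`).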

Convert DJVU to PDF

Another common task is converting DJVU files to PDF. Initially, I would just upload the file to some online service, but I soon found myself repeating the same steps for every new file I downloaded. So I wrote a script to do this and call it from the context menu using macOS Automator.

The script looks for the DJVU files and converts them using ddjvu:

ddjvu -format=pdf "$eachfile" "${eachfile/%.djvu/}.pdf"

There are other ways to do this, but I could not get djvu2pdf to run in the script without throwing errors, so I used ddjvu instead.
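To show how the ddjvu call might sit inside a loop, here is a rough sketch. It assumes the files arrive as arguments (for example from Automator) rather than being searched for, and it is not the script from the repository. The function name `convert_djvu` and the `DDJVU_BIN` override are assumptions for illustration. It writes each PDF next to its source file and uses the simpler `${eachfile%.djvu}` form to strip the extension.

```shell
#!/usr/bin/env bash
# Sketch: convert each .djvu file passed as an argument to a PDF next to it.

# DDJVU_BIN is a hypothetical override, useful for testing without DjVuLibre.
DDJVU_BIN="${DDJVU_BIN:-ddjvu}"

convert_djvu() {
    local eachfile
    for eachfile in "$@"; do
        case "$eachfile" in
            *.djvu)
                # book.djvu -> book.pdf in the same folder
                "$DDJVU_BIN" -format=pdf "$eachfile" "${eachfile%.djvu}.pdf"
                ;;
            *)
                printf 'skipping %s: not a .djvu file\n' "$eachfile" >&2
                ;;
        esac
    done
}

convert_djvu "$@"
```

Quoting `"$eachfile"` everywhere keeps file names with spaces intact, which matters for downloaded books.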