How to Automate Blogging (Instead of Writing)

John Salamon

2023-07-23

The older I get the more apparent the value of writing things down becomes. So, this is my big announcement that I will now be writing various things down, instead of just letting them fade from memory. I intend to keep this fairly technical, and in keeping with that, let me tell you about how this blog works. This is useful for me, as it will remind future me of what I spent my Saturday doing. Perhaps you’ll enjoy reading it too?

Converting markdown to html with Pandoc

Pandoc describes itself as a “universal document converter”, and it’s not wrong. You can use Pandoc to convert to and from basically any kind of document. Here, I’m using it to convert markdown to html:

pandoc -s --template ../template.html5 --css=../style.css \
    --include-after-body <(echo "$FOOTER") "$SOURCE" -o "$TARGET"

The -s flag tells Pandoc to produce a standalone document, meaning it will add some extra html tags and metadata to make a complete web page, rather than just a fragment of html. Pandoc markdown also supports a title block at the top of the file, which sets the title, author, and date metadata:

% How to Automate Blogging (Instead of Writing)
% John Salamon
% 2023-07-23

I also set a template, which is minimally modified from default.html5 found in the Pandoc templates repository. Pandoc allows you to pass variables to the template, including for example a footer or header, as well as to specify custom css.

You can actually see this in the markdown file used to generate this page here. Markdown is pretty nice to write in: it’s fairly readable, and easily converted to other formats with Pandoc. It’s also just plain text, meaning it plays nicely with version control systems.

Making a script

Pandoc was the easy part. Next I wanted an index page, as well as a footer on each page that links to the previous/next blog post. I implemented this in bash. I’ve never really learnt to write bash properly, so I doubt it’s great code. Nonetheless, you can check the script out here.

There are two loops. The first loops over each subdirectory, and checks if it contains a markdown file of the same name. If it does, it extracts the date metadata from the file, and uses that as a key for an associative array (the value being the directory path). After the loop completes, I sort the array by date.
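As a rough illustration of that first loop (a minimal sketch, not the actual script), something like the following would work. The function names post_date, collect_posts and sorted_dates are made up for this example, and it assumes each post lives at <dir>/<dir>.md with the three-line title block shown earlier, so the date is on line 3.

```bash
#!/usr/bin/env bash
# Sketch: map each post's date to its directory, then sort the dates.
# Assumption: posts live at <dir>/<dir>.md and start with a three-line
# "% title / % author / % date" block. All function names are hypothetical.

# Print the date (third line of the title block) of a markdown file.
post_date() {
    sed -n '3s/^% *//p' "$1"
}

# Fill the global associative array POSTS (date -> directory) from the
# subdirectories of $1. Two posts with the same date would collide.
collect_posts() {
    local root=$1 dir name date
    declare -gA POSTS=()
    for dir in "$root"/*/; do
        [[ -d $dir ]] || continue            # the glob matched nothing
        dir=${dir%/}                         # drop the trailing slash
        name=${dir##*/}                      # last path component
        [[ -f $dir/$name.md ]] || continue   # no markdown file of the same name
        date=$(post_date "$dir/$name.md")
        if [[ -n $date ]]; then
            POSTS[$date]=$dir
        fi
    done
}

# Print the dates newest first. ISO dates (YYYY-MM-DD) sort correctly as text.
sorted_dates() {
    printf '%s\n' "${!POSTS[@]}" | sort -r
}
```

Using the date as the key keeps the sort trivial, at the cost of allowing only one post per day.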

The second loop iterates over the associative array, from newest to oldest posts. I simply add a line for each post to the index, and also generate the footer here, before passing it to the Pandoc build command. I also defined an “extras” script here which, if present, will run arbitrary commands (if I want to generate a graph using Python, for example, I would put that in extras).
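The index and footer generation could be sketched along these lines. This is an illustration under assumptions rather than the real code: the helper names index_lines and post_footer are invented, posts are passed in newest first, and each post is assumed to be published as <dir>/<dir>.html, so sibling posts are linked via "../".

```bash
#!/usr/bin/env bash
# Sketch: given post directories ordered newest first, emit index lines and
# a previous/next footer for each post.
# Assumption: each post is published as <dir>/<dir>.html. Names are hypothetical.

# Print one markdown list item per post for the index page.
index_lines() {
    local dir
    for dir in "$@"; do
        printf -- '- [%s](%s/%s.html)\n' "$dir" "$dir" "$dir"
    done
}

# Print the footer HTML for the post at position $1 (0-based) among the
# remaining arguments, which are the post directories, newest first.
post_footer() {
    local i=$1; shift
    local posts=("$@") footer=""
    # Position i+1 holds the next-older post, position i-1 the next-newer one.
    if (( i + 1 < ${#posts[@]} )); then
        footer+="<a href=\"../${posts[i+1]}/${posts[i+1]}.html\">previous</a>"
    fi
    if (( i > 0 )); then
        if [[ -n $footer ]]; then
            footer+=" | "
        fi
        footer+="<a href=\"../${posts[i-1]}/${posts[i-1]}.html\">next</a>"
    fi
    printf '%s\n' "$footer"
}
```

The output of post_footer is the kind of string that would be stored in $FOOTER and handed to the Pandoc command shown earlier.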

Options are handled at the top of the script. Currently there are two: -f, which forces the script to rebuild a page even if the target html already exists, and -n (no extras), which skips the extras in case they take too long.
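One common way to handle flags like these in bash is the getopts builtin. The sketch below is an assumption about how it could be done, not a copy of the script; the variable names FORCE and RUN_EXTRAS and the helper needs_build are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: parse -f (force rebuild) and -n (no extras) with getopts.
# Variable and function names are hypothetical.

FORCE=0
RUN_EXTRAS=1

# Set FORCE and RUN_EXTRAS from the given arguments.
parse_options() {
    local opt OPTIND=1
    FORCE=0
    RUN_EXTRAS=1
    while getopts "fn" opt; do
        case $opt in
            f) FORCE=1 ;;
            n) RUN_EXTRAS=0 ;;
            *) echo "usage: make_all.sh [-f] [-n]" >&2; return 1 ;;
        esac
    done
}

# Succeed if the target should be built: always when forced,
# otherwise only when the target file does not exist yet.
needs_build() {
    [[ $FORCE -eq 1 || ! -e $1 ]]
}
```

A script would call parse_options "$@" once near the top, then check needs_build "$TARGET" before each Pandoc invocation.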

Putting it online

The final step is to actually put this somewhere other people can see it. I recently found out Cloudflare is also a domain name registrar, and they have a “Cloudflare Pages” tool for hosting static sites that is currently free. They also host infrastructure that will actually build your site, for free within certain limits. I don’t know how this is economically viable, but I’m not complaining.

As mentioned, it’s easy to keep markdown in version control. This blog is a single git repository, which I pushed to GitLab. It’s quite easy to connect GitLab to Cloudflare Pages, and I now have it set up to automatically sync and build this blog each time I push there. I was somewhat surprised that the build image Cloudflare provides (which runs on Ubuntu 22.04) doesn’t include Pandoc. But as they let you download basically any dependency as part of your build, I just wrote a script to download the Pandoc tarball before running my make_all.sh script. This seems quite silly to me, but it’s probably still more lightweight than a lot of npm builds would be, and I gather that npm builds are what this service is mostly catering to. Still, it works great for my purposes. I can just edit my markdown, and the site updates automatically after I push.
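For the curious, a download step of this kind might look roughly like the sketch below. It is not the script used here: the version number is an assumed example, the function names are hypothetical, and it relies on the Linux release tarballs on GitHub unpacking to a top-level directory containing bin/pandoc.

```bash
#!/usr/bin/env bash
# Sketch: fetch a Pandoc release tarball and put the binary on PATH.
# PANDOC_VERSION is an assumed example value; pin a release you have tested.
# Function names are hypothetical.

PANDOC_VERSION="${PANDOC_VERSION:-3.1.6}"

# Print the download URL for the Linux amd64 tarball of the given version.
pandoc_url() {
    local version=$1
    printf 'https://github.com/jgm/pandoc/releases/download/%s/pandoc-%s-linux-amd64.tar.gz\n' \
        "$version" "$version"
}

# Download and unpack Pandoc into ./pandoc, then prepend its bin directory
# to PATH so that later build steps can find the pandoc command.
install_pandoc() {
    local version=${1:-$PANDOC_VERSION}
    mkdir -p pandoc
    curl -fsSL "$(pandoc_url "$version")" \
        | tar -xz --strip-components=1 -C pandoc || return 1
    export PATH="$PWD/pandoc/bin:$PATH"
}
```

The build command configured in Cloudflare Pages would then call install_pandoc before running make_all.sh.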