Dockerised Hugo for Local Development

Following on from last night’s post, I needed a way to run Hugo to build the new entry and deploy it. Since I had to rebuild my environment from scratch I wanted to see if I could run Hugo and Go without installing them locally.

I know Go is unlikely to cause any stability issues, as it installs all its dependencies in the user’s home dir rather than touching system files, but I’m determined in this experiment to keep my new install as clean as possible.

Using some insight I’d gathered from docker-tizonia, a Docker version of Tizonia, and using asolera’s minimal Golang Dockerfile as a base, I was able to put together a minimal Dockerfile that does the following:

  1. Creates a Golang-based build image to pull down the latest version of Hugo.
  2. Builds and installs the Hugo binary.
  3. Copies the binary to a clean image.
  4. Sets the image working directory to /site.
  5. Exposes the Hugo server port, 1313.
  6. Makes Hugo the entry point, defaulting to the help text if I forget to add a command.

The Dockerfile looks something like the following:

FROM golang:1.14.3-alpine3.11 AS build

RUN apk add --no-cache git

ARG HUGO_BUILD_TAGS

RUN go get -v github.com/gohugoio/hugo/source
WORKDIR /go/src/github.com/gohugoio/hugo

RUN go install

RUN apk del git

FROM alpine:3.11

COPY --from=build /go/bin/hugo /usr/bin/hugo

RUN mkdir /site
WORKDIR /site

# Expose port for live server
EXPOSE 1313

ENTRYPOINT ["hugo"]
CMD ["--help"]

Also thanks to jojomi for bits cribbed from their Hugo Dockerfile.

Many of the Hugo Dockerfiles I found would copy the website source to the container in preparation for serving the files from Docker. In my case I’m happy for my plain HTML to continue being served where it is, but I didn’t want to lose out on the features you get when you’re using Hugo to develop locally - such as running a test server with live reloading.

With the help of a handy “hugo” wrapper shell script, I was able to fire up Hugo in the container, and serve my local files through a mapped volume with no appreciable difference to how Hugo was running for me before.

The wrapper is as follows:

#!/bin/bash

docker run -it --rm \
    --network host \
    --volume="$(pwd)":/site \
    --name hugo \
    $(docker build -q .) "$@";

This wrapper:

  1. Runs the necessary Docker command to hook the image into the host network so I can check my changes on http://localhost:1313.
  2. Shares the working directory into the expected /site working directory on the image.
  3. Passes in whatever arguments I give it.

I set this hugo wrapper to executable with chmod u+x hugo, and I can now run the automatically updating Hugo server with

./hugo server

Because the command hugo by itself is used to build the site, I just pass in a harmless switch like -v (verbose) to build the site without triggering the default --help text.

Finally I use my previous ./deploy script to rsync the files to my host.
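For anyone curious, a deploy script along those lines might look something like this sketch - the host and remote path here are placeholders rather than my real ones, and the echo keeps it harmless to run as-is:

```shell
#!/bin/bash
# Sketch only: the host and remote path are placeholders, not real values.
set -eu

SRC="public/"                           # Hugo's default output directory
DEST="user@example.com:/var/www/site/"  # placeholder remote target

# -a preserves attributes, -z compresses in transit, --delete mirrors
# local deletions to the server. Remove the echo to actually sync.
echo rsync -az --delete "$SRC" "$DEST"
```

Pointing rsync at Hugo’s public/ output means only the built site goes up, never the source or drafts.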

The two new files are in my personal-chronicle GitHub repo for any good they can be to anyone, and I’m curious to know if there’s any way I can improve the Docker build to simplify it.

Some questions or areas I think I can improve are:

  1. I’m not sure if the line ARG HUGO_BUILD_TAGS is necessary. It just happened to be there when I finally got it working, after removing other lines that were causing it to fail.
  2. I’m getting the hugo source from github.com/gohugoio/hugo/source when the Hugo documentation says the main repo root is what you’d use to install it. I’m not sure if there was a better way to go get the Hugo project.
  3. I think I’d prefer to freeze the version of Hugo at the current version until I choose to upgrade after testing. I’m not sure how to ‘go get’ a specific version of the git repo.
  4. Is the RUN apk del git line necessary if I’m using a throwaway build image?
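On question 3, my best guess (and it is a guess - I haven’t switched my own Dockerfile over yet) is that module-aware go get accepts an @version suffix, which would let the build stage pin a release. The tag below is a placeholder, not a recommendation:

```Dockerfile
FROM golang:1.14.3-alpine3.11 AS build

RUN apk add --no-cache git

# Module-aware `go get` accepts an @version suffix, so no GOPATH
# checkout is needed. v0.59.0 is a placeholder tag.
RUN GO111MODULE=on go get github.com/gohugoio/hugo@v0.59.0
```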

The thing that blows me away about Docker and Golang and a lot of modern developer technology is just how much “standing on the shoulders of giants” I’m able to do. Docker is not just a clever idea, but such a well built stack that even with a rudimentary understanding of what I wanted to achieve, I was able to do it with a few lines of code. And the Go ecosystem meant that go get etc. pulled an entire project’s worth of dependencies and built the entire Hugo app inside a black box. This is such a far cry from past experiences I’ve had trying to build software from source that I can only express gratitude for all the hard work donated by so many.

Hugo Missing Posts

Just to help out anyone else whose brain is turning to mush trying to figure this out:

I went to write a new post tonight and discovered that Hugo flat out refused to render the new post. Nothing I did would make Hugo display or serve up the new page. Digging further, Hugo wouldn’t convert the post using the conversion functions, and running --verbose or --debug showed that as far as Hugo was concerned, the new pages simply didn’t exist.

I was wracking my brain for hours over this - checking and double-checking paths, ensuring file permissions were correct, removing old posts, attempting to disable caches - until I did some frontmatter splitting and discovered that if I left the date field off, the posts showed up.

It turned out that something in my config had changed since I created my last posts, and Hugo now relies on the post’s timezone to determine when it should be published. I recently updated my version of Hugo to v0.54.0, and I also specifically altered the way my dates are generated in archetypes/default.md to match how I’d like them to be stored. One or both of those changes meant that Hugo went from generating new posts in a consistent timezone to generating them in my local timezone but treating them as though they were in UTC. This meant that posts were being ignored, but would have published without a problem 10½ hours (my +1030 offset) after I wrote them.

Now I’ve simply altered my archetype to include -0700 (Go’s layout token for a timezone offset), which tells Hugo to append my local TZ to new dates, and a hugo new posts/whatever.md now generates a file that shows up immediately when I serve the site.

In my archetypes/default.md file I’ve set date to:

date: "{{ dateFormat "2006-01-02 15:04:05 -0700" .Date }}"
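With that format, a freshly generated post carries the offset explicitly. Assuming my +1030 offset, the resulting frontmatter comes out something like this (the title and draft fields are just typical archetype defaults):

```yaml
---
title: "Whatever"
date: "2019-03-10 21:15:00 +1030"
draft: true
---
```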

Posting from Mobile

One thing that moving away from WordPress means is that I can no longer publish on the go.

I mean, I never really did, but at least I had the option. Now to post I must be in front of my PC with the Hugo software installed and a copy of my repo. I could get the repo on any computer and even install Hugo if I needed to be elsewhere, but my home computer has the key to log into my server, so I’m not making it easy on myself.

I can, however, use a portable git client (I’m trying out FastHub for GitHub) and write my posts on the go, then tidy and publish them later.

I’m banking on the idea that reducing the barriers to writing will increase the number of posts that get published. We’ll see.

Staticman Comments Are Go

I’ve re-enabled comments here at The Geekorium, and imported all my old comments, so go nuts!

To import all my old comments, I used a script written by someone else, then parsed the results through a dodgy PHP script I made myself to rename everything into the format my site relies on, so there might be shenanigans with the imported comments. Please let me know if anything seems off.

That leaves me with the next question: how do I ensure I don’t get flooded with spam? I’ve had comments back on for all of two days, and I already get a steady trickle of Pull Requests from the Staticman bot triggered by spam comments. On the WordPress site I had Akismet turned on, which all but eliminated bad-faith comments for me, the way modern email clients almost never let the chaff through.

The simplest answer is the Google reCAPTCHA1 - the latest version doesn’t even ask you to tick the “I’m not a robot” box, let alone click on thirteen boxes of street crossings. It’s a tempting solution, but it’s owned and operated by Google, and everything your users do on your website is captured for analysis. As spelled out in their documentation:

reCAPTCHA works best when it has the most context about interactions with your site, which comes from seeing both legitimate and abusive behavior.

Additionally,

reCAPTCHA learns by seeing real traffic on your site.

In a perfect world, Google would only use this data to improve the service. Maybe that’s all they’re doing, but I take my readers’ privacy seriously - more than my own - and I’m genuinely concerned about what Google is doing with this enormous corpus of user data captcha’d by these little blue boxes all over the web. They’re more pervasive than Facebook logins and social buttons, and unlike the earlier version, it’s no longer training robots to recognise trains or traffic lights, it’s training computers to recognise human behaviour.

There’s also the question of how these work if people choose to disable JavaScript. The theme I’m using relies on more JS than I’d like already, but at least it degrades elegantly. I’m not so sure about reCAPTCHA, and I can’t find an answer on their website.

It’s looking likely I’m going to have to palm user data off to someone to determine if they’re a robot or not. I’m not happy about it, but it appears to be the price unless I’m willing to sift through dozens of spam comments a day. It wouldn’t be so bad, except Git’s policy of keeping history means that the spam I receive is attached to my site’s repo forever, even if the comment never makes it here.

My final recourse is to try something that I’m guessing won’t work for long. Staticman has a feature that checks for valid form data. The check is basic enough that the field can be present in the data as long as it’s blank. If it has a value set it immediately fails validation. I’ve set a dummy field in the form that needs to be left blank. If a ‘bot fills it in, it should get picked up and fail to submit. I’m not sure how long it will slow them down, but I’m going to give it a shot.

I’ve also disabled the form on posts older than a month, so if you want to comment, do it now!

Update: 24 hours without a spam comment. Success!


  1. https://www.youtube.com/watch?v=WqnXp6Saa8Y 

Moving to Hugo

They say “imitation is the sincerest form of flattery”, and I do hope they’re right. I’ve been reading Rubenerd for the longest time, and his lovely minimal(ist) website built on Hugo has had me dying to try out the technology myself.

While there’s nothing wrong with Wordpress, I’ve always found it just a little too clunky for my tastes - and slow. That might be because I’ve always used it on shared hosting with less than optimised databases. The idea of a super fast and efficient text-only site is appealing.

So if you can’t tell the difference, today’s post (and all past posts) are now brought to you by Hugo, powered by Go.

I also used this as the excuse I needed to finally put the effort into dual booting Linux on my machine. I’m trying out Linux Mint, and I’m proud I actually got it working with Secure Boot[^notalent]. Starting out, my “flow” is to create a post in Markdown, then build the site and rsync it to the same location my old site was.

Please let me know if you notice anything funky. As usual I can be reached on Telegram, Discord, and just recently, Twitter[^sellout]. However, I’m aware that there are lots of posts that will not have survived the switch-over without some… problems. I will get to them eventually.

The process of moving was interesting. All my posts in Wordpress were written in Textile which for years was my preferred markup language, but Textile turned out to be Betamax to Markdown’s VHS, or what Mercurial is to Git, or what Bitbucket is to Github, or what this sentence is to any other sentence.

The first step was to learn just enough Go to build the Go Wordpress Importer. This pulls all the posts out of a Wordpress Export XML file, then uses Pandoc to convert the HTML to whatever format you like. I built in the ability to toss in some extra Pandoc magic to convert from Textile to HTML then from HTML to Markdown.
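The Textile leg of that pipeline can also be sketched with pandoc alone - it reads Textile directly, which sidesteps the HTML hop I built in. The sample post below is made up for illustration:

```shell
#!/bin/bash
# Sketch of a Textile -> Markdown conversion; the sample file is made up.
set -eu

IN="post.textile"
OUT="${IN%.textile}.md"

# A tiny made-up Textile post to convert.
printf 'h1. Hello world\n\nA *bold* claim.\n' > "$IN"

# pandoc accepts textile as an input format, so no intermediate HTML
# step is strictly required.
if command -v pandoc >/dev/null 2>&1; then
    pandoc -f textile -t markdown "$IN" -o "$OUT"
else
    echo "pandoc not installed; skipping conversion of $IN"
fi
```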

From there, Hugo does most of the heavy lifting as long as you can find a theme you like that includes all the nice stuff you want included. I quite like Er but I’ve forked it as ooh-er for my own purposes.

The next step is to build comments back in. It’s something that Ruben has forgone - not for technical reasons, I believe - but I really enjoy the one or two I get occasionally. It’s not an easy problem to solve with a static site, but I think I’ll be leaning on Staticman to add comments into the GitHub repo. I found a slightly different script that also uses GitHub, but adds comments as “issues”. While appealing, I also want to ensure I’m not tied completely to GitHub for all time.

Let me know what you think of the changes. I’ll post more when I have comments up and running.

[^notalent]: Through no talent of my own I might add.
[^sellout]: God I’m such a sell-out.