Dockerised Hugo for Local Development

Following on from last night’s post, I needed a way to run Hugo to build the new entry and deploy it. Since I had to rebuild my environment from scratch I wanted to see if I could run Hugo and Go without installing them locally.

I know Go is unlikely to cause any stability issues, as it installs all its dependencies in the user’s home dir, rather than touching system files but I’m determined in my experiment to keep my new install as clean as possible.

Using some insight I’d gathered from using docker-tizonia a Docker version of Tizonia and using asolera’s Golang minimal Dockerfile image as a base, I was able to put together a minimal Dockerfile that does the following:

  1. Creates a golang based build image to pull down the latest version of Hugo.
  2. Build and install the Hugo binary
  3. Copy the binary to a clean image
  4. Set the image work directory to /site
  5. Expose the Hugo server port 1313
  6. Make Hugo the entry point and default to the help text if I forget to add a command.

The Dockerfile looks something like the following:

FROM golang:1.14.3-alpine3.11 AS build

RUN apk add --no-cache git

ARG HUGO_BUILD_TAGS

RUN go get -v github.com/gohugoio/hugo/source
WORKDIR /go/src/github.com/gohugoio/hugo

RUN go install

RUN apk del git

FROM alpine:3.11

COPY --from=build /go/bin/hugo /usr/bin/hugo

RUN mkdir /site
WORKDIR /site

# Expose port for live server
EXPOSE 1313

ENTRYPOINT ["hugo"]
CMD ["--help"]

Also thanks to jojomi of bits cribbed from their Hugo Dockerfile.

Many of the Hugo Dockerfiles I found would copy the website source to the container in preparation of serving the files from Docker. In my case I’m happy with my plain HTML to continue being served where it is, but didn’t want to lose out on the features you get when you’re using Hugo to develop locally - such as running a test server with live reloading.

With the help of a handy “hugo” wrapper shell script, I was able to fire up Hugo in the container, and serve my local files through a mapped volume with no appreciable difference to how Hugo was running for me before.

The wrapper is as follows:

#!/bin/bash

docker run -it --rm \
    --network host \
    --volume=$(pwd):/site \
    --name hugo \
    $(docker build -q .) "$@";

This wrapper

  1. Runs the necessary Docker command to hook the image into the host network so I can check my changes on http://localhost:1313
  2. Shares the working directory into the expected /site working directory on the image.
  3. Passes in whatever argue I pass in.

I set this Hugo file to executable with chmod u+x hugo and I can now run the automatically updating Hugo server with

./hugo server

Now because the command hugo by itself is used the build the site, I now just pass in a harmless switch like -v (verbose) to build the site without triggering the default --help text.

Finally I use my previous ./deploy script to rsync the files to my host.

The two new files are in my personal-chronicle github repo for any good they can be to anyone, and I’m curious to know if there’s any way I can improve the Docker build to simplify it.

Some questions or areas I think I can improve are:

  1. I’m not sure if the line ARG HUGO_BUILD_TAGS is necessary. It just happened to be there when I finally got it working, after removing other lines that were causing it to fail.
  2. I’m getting the hugo source from github.com/gohugoio/hugo/source when the Hugo documentation says the main repo root is what you’d use to install it. I’m not sure if there was a better way to go get the Hugo project.
  3. I think I’d prefer to freeze the version of Hugo at the current version until I choose to upgrade after testing. I’m not sure how to ‘go get’ a specific version of the git repo.
  4. Is the RUN apk del git line necessary if I’m using a throwaway build image?

The thing that blows me away about Docker and Golang and a lot of modern developer technology is just how much “standing on the shoulders of giants” I’m able to do. Docker is not just a clever idea, but such a well built stack that even with a rudimentary understanding of what I wanted to achieve, I was able to do it with a few lines of code. And the Go ecosystem meant that go get etc.. pulled an entire projects worth of dependencies and built the entire Hugo app inside a black box. This is such a far cry from past experiences I’ve had trying to build software from source that I can only express gratitude for all the hard work donated by so many.

Starting the Google Treasure Hunt: How to do the first three questions

Google has been running a “treasurehunt” with unspecified prizes up for grabs for the fastest times to complete the tasks. What it amounts to is four questions spaced over a month, that you can log onto in your own time and attempt to nut out. The questions are randomly generated, but are of a different type each week. The first week was a number/maths problem, the second was a computer/file problem, the third a logic (and slightly computer network) problem. I haven’t seen the final fourth question yet.

I thought I’d jot down how I answered the questions - not my particular answers - as they will be different for other people, but the methods for answering the questions which are the same for each of the four question types.

Please note that I cannot finish the treasure hunt. I got stuck at question four. I wrote up how I did questions one to three before I’d done four and it just seemed like a waste not to post them.

Question 1

Question 1: Robot A robot is located at the top-left corner of a X × Y grid (marked ‘Start’ in the diagram below).

The robot can only move either down or right at any point in time. The robot is trying to reach the bottom-right corner of the grid (marked ‘Finish’ in the diagram below).

Note: The grid below is 7x3, and is used to illustrate the problem. It is not drawn to scale.

Google Treasurehunt Question
        1
*Image not to scale.

How many possible unique paths are there?
(Note: Answer must be an exact, decimal representation of the number.)

I used to have a book of puzzles and math problems that had a section about just this type of number problem. Unfortunately I could never make sense of it - it used factorials and a fancy equation that told us the answer, but the author never explained how he got the numbers that he plugged into the equation. Fortunately I have Google. With a bit of research I found out from Dr. Math that the formula is:

              n!
C(n,r) = -------------
           r!*(n-r)!

where n is the total number of moves along one path it would take, and r is the number of moves along either axis (accross or down in this example).

The grid size I was given was 66 × 46. Starting in the first square, the robot can move either down or right - and will therefore take 65 moves right and 45 moves down to get to the star-square, regardless of what order he takes them in. That’s a total of 110 moves (down or right), and I can plug those numbers into the formula to get:

               110!
C(110,65) = -------------
            65!*(110-65)!

The problem is, most calculators can’t handle those numbers. The = 110 × 109 × 108 × 107 … etc. This can generate very large numbers very quickly. Asking the Google calculator what the answer is gave me a very large number indeed, and did what most calculators do with these sorts of numbers and freaked out, giving me this answer:

110 ! / 65 = 1.60981337 Ã - 10^31^
<a href=“http://www.google.com/search?q=110!/(65)&hl=en&safe=off”>Google’s unhelpful answer

What I needed was a whole number - a large whole number to put in the answer box. So I went looking for an online calculator that can handle large numbers. I found one built using Javascript, and one that runs in a Java applet. I found the javascript calculator more intuitive and easy to use, but the final step of dividing one huge number by another kept timing out on me, so I switched to the Java one. I used a lot of copy pasting to keep the numbers clear in my head - pasting them into a text document and switching between the two calculators to check and recheck my answer. Eventually, I had a number to paste into the answer box that matched what Google almost told me earlier:

110 ! / 65 ! × 45 ! = 16098133690909086126971454271170

I submitted it and a few minutes later, was told I was correct. A week later I tried question 2.

Question 2

Question 2: Zip Here is a random zip archive for you to download:
GoogleTreasureHunt08_longstringofrandomnumbers.zip

Unzip the archive, then process the resulting files to obtain a numeric result. You’ll be taking the sum of lines from files matching a certain description, and multiplying those sums together to obtain a final result. Note that files have many different extensions, like ‘.pdf’ and ‘.js’, but all are plain text files containing a small number of lines of text.

Sum of line X for all files with path or name containing 999 and ending in .abc
Sum of line Y for all files with path or name containing 000 and ending in .xyz
Hint: If the requested line does not exist, do not increment the sum.

Multiply all the above sums together and enter the product below.
(Note: Answer must be an exact, decimal representation of the number.)

Where the bold numbers/letters above change randomly from question to question.

I downloaded the zip file provided and extracted it (with paths) to my hard drive. I initially tried to find a way to automatically find the files specified using Windows command line tools, but they are very limited (or I just don’t know where to look). What I was looking for was any file with a ‘.xml’ extension that resided in any folder (even in subfolders) with ‘zzz’ in the name, or that had the filename ‘zzz.xml’. Same again for any file with a ‘.pdf’ extension that had ‘EFG’ in the path. then I had to extract the numbers on the fourth line of all the files I found. Fortunately I have access to unix tools - Cygwin for Windows at work, and linux and mac machines at home. With a bit of “research”, and the unix ‘man’ pages, I strung together a command that would print me a list of numbers.

find . -path *zzz**xml -print0 | xargs -0 -I xxx sed -ne '4{p;q}' xxx

The command works this way:

  • find is the command to look through a file list.

  • I started in the directory I had extracted the files to, which is why I used . as the search path.

  • Then the ‘-path’ switch finds any path that matches the regular expression following it. The expression I used was abc*xyz, where abc is the path code we were supposed to find and xyz is the extension.

  • -print0 just outputs the files it finds.

  • The files then get piped |

  • into the xargs command, which is used to run the sed command on each file in turn.

  • The sed command is then given:

    • the -n switch to make sure sed doesn’t edit the file.

    • -e to run the sub-script 4{p;q}, where 4 is the line number to extract, p prints the contents on the line, and q tells the sub-script to quit there, rather than keep going unnecessarily. I’m not sure the -e switch is necessary, but I kept it in from the helpful fellow who suggested it on a forum discussion from 2004 that I can’t find again.

    • xxx to say “use the file you found in the xargs part of the command.

    I’m not sure that my analysis of the sed command is accurate. Please correct me if I’m missing something.

It’s a relatively simple command, but it produces a list of every 4th line in every file that matches the criteria.

There’s probably a unix-hackery way to then get the sum of those numbers, but I just copied and pasted them into an Excel spreadsheet and used the =SUM (A1:A10) to add them all up. Then repeated the process with the next set of criteria and multiplied the two number I got at the end to give me the answer. I did it a second time to produce another answer in a faster time, and waited for Question 3.

Question 3

Question 3: Network Below is a diagram of a computer network. The nodes are hosts on the network, and the lines between them are links. A packet is sent out from host X with a destination of 111.222.333.444. Which nodes does the packet pass through on its way to the destination? (include start and final node in your answer)

And here’s an image of the second question I attempted (click to zoom):
Google Treasurehunt Question
        3

After freaking out a little at how complex it looks I started brushing up on my routing table knowledge. It quickly dawned on me however this puzzle is nothing like a real-world networking problem, and is entirely about logic and following a path.

Again, there’s probably a unix-y way of doing this, maybe a perl script or something that could look through this grid of numbers and trace a path around the diagram. I chose to do it the easy hard my way.

What I’m doing is following the logic of the routing table. At each node you can get only one step closer to the node you want, and each node has instructions for where to send the information next. In this example if I start at the top in row 1 (for node A) I would read this as:

  • if I want to get to 203.21.93.108, then go to 203.21.93.108;

  • if I want to go to 48.251.60.80 go to 100.5.76.120;

  • if I want 8.32.24.0/24 go to 161.212.157.20;

  • and if I want to go anywhere else, go to 17.29.203.130.

Of course, you don’t start at node A, you start where you need to start

  • in my case node G. I write “G” in the answer box and start.

Finding the starting point: G (in this example), I looked across the columns until I found reference to the end point I was trying to reach: 203.21.93.108. On the table, the second routing table entry in row ‘G’ has “203.21.93.108 => 8.32.24.60”. This means that for the information to get from G to 203.21.93.108, it must first go through the node with the IP address of “8.32.24.60”.

“8.32.24.60” is node “F”, so I write down F in the answer box and keep going.

Node F directs me to “234.129.94.133” which is node “I”, and so on I go. It’s like a numerical choose your own adventure. Without the choice. Or the fun. On occasion, there are nodes that look like the next logical step would be to finish the puzzle, because it is right next to the finishing node, but if the rules don’t tell you to finish, you have to keep going.

Following this to completion gives a string of letters like this: GFIHDCLONAM corresponding to the path taken through the nodes. This gets me through this round and on to question 4.

Question 4

Question 4: Primes
Find the smallest number that can be expressed as
the sum of A consecutive prime numbers,
the sum of B consecutive prime numbers,
the sum of C consecutive prime numbers,
the sum of D consecutive prime numbers,
and is itself a prime number.

For example, 41 is the smallest prime number that can be expressed as
the sum of 3 consecutive primes (11 + 13 + 17 = 41) and
the sum of 6 consecutive primes (2 + 3 + 5 + 7 + 11 + 13 = 41).

Where A, B, C, and D are uniquely generated for each question.

Question four is here to separate the boys from the men. Unfortunately in terms of maths and programming/scripting - my voice hasn’t dropped. I cannot continue giving any help, and I’ve had to bow out of the hunt. I know I could do the question with the help of the internet, but frankly

  • I was only doing it to feel smart…

Anyway - if you’d like a solution to the fourth question you can get it here: Solving Google Treasure Hunt Puzzle 4: Prime Numbers. His solution is particularly good - in that he uses a lot of command line nous to get the job done. A few commenters say they’ve used python/perl/java programs to do it all from scratch, but I love that he uses a bit of unix giggery-pokery to get a useful file that he then just extracts the answers from. He also hasn’t tried the first three questions yet - so I’m not linking to someone who’s gonna show me up with their first three answers!

Have fun.