How to improve Docker build from minutes to seconds for Golang apps
Docker is everywhere
Today, Docker is the de facto standard for application packaging. You can use a Docker image to run locally, or easily orchestrate and scale in Kubernetes or Elastic Container Service (ECS) from Amazon Web Services (AWS).
It's hard to imagine modern DevOps without Docker.
But Docker is both platform- and architecture-dependent: a container is not a real virtual machine (VM), just an isolated sandbox on top of the host's kernel.
For example, if you run Docker on an Apple M1 Mac, it builds arm64 images, and if you run it on an Intel CPU, it builds amd64 images.
This isn’t a problem if you’re developing on the same architecture as your target, and it wasn’t a problem back when all developers had amd64 machines and all servers ran Linux on Intel hardware. No problem at all until Apple introduced the M-series CPUs. These became widely used for development because the Apple M chips are incredible (I also moved to an Apple M CPU and am very happy with it).
Now when I build a Docker image, though, it builds an arm64 image. I can no longer run it on cloud machines unless I rent ARM machines in the cloud, which is not a popular option at the moment. That said, I recommend reviewing AWS’s Graviton2 instances, as they could save you money and improve performance.
Today, if you’re going to build a public Docker image and publish it on Docker Hub, you need to support at least two architectures: arm64 and amd64.
Multi-architecture Docker build
Now we need multi-architecture Docker builds, which is no problem, as we have docker buildx and QEMU for that.
Let's see what a simple Dockerfile could look like for the buildx approach.
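As a sketch (the image tag and the app name are illustrative, not from the original post), a naive single-stage Dockerfile might be:

```dockerfile
# Single-stage build: compile and run in the same image.
# Under buildx + QEMU, this entire file runs once per target architecture.
FROM golang:1.21-alpine
WORKDIR /app

# Cache dependencies separately from source changes
COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN go build -o /usr/local/bin/app .
CMD ["app"]
```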
And then run:
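The registry and tag here are placeholders, but the command could look like this:

```shell
# One-time setup: create a builder instance that supports multiple platforms
docker buildx create --name multiarch --use

# Build for both architectures and push the multi-arch manifest
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t yourname/yourapp:latest \
  --push .
```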
This takes around 20 to 30 minutes because it has to run twice: once for amd64 and once for arm64.
It runs fast for the native architecture and slow for the non-native one, because the latter runs under emulation. Docker uses QEMU for this, an amazing piece of software, but one that’s incredibly slooooooooow.
Why is running non-native images not that slow, while building them is? You can run non-native images on your machine, and if you’re on a Mac, all images are non-native, because they run on Linux. Just remember we have two variables here: platform (Linux, macOS, Windows) and architecture (i386, amd64, arm64, armv7, …). If either variable doesn't match your host, Docker runs the image inside a VM. Even an amd64 image on an Intel Mac still needs a VM, because the vast majority of base images use the Linux platform.
Although it is possible to use the macOS platform for Docker images, nobody does: maybe there are legal restrictions, or maybe there just aren't many macOS-powered clouds.
OK, so the issue is not the virtual machine itself, since on a Mac all Docker images run in a VM. Running an image in that VM is fast; building an image in a VM is slow.
Why is that? Because Docker uses LinuxKit, which uses the HyperKit framework, which in turn uses macOS’s native (and very fast) Hypervisor framework to run Docker images.
That is why you will not feel any difference when running non-native images. In the Docker dashboard, you can see the orange AMD64 badge, which indicates that the image doesn’t support your native architecture.
So Docker uses one VM to run images and another to build them, possibly for simplicity. A slow VM is better than nothing. The Docker team made an amazing effort to bring Docker to the Apple M platform right at its release, migrating from VirtualBox to HyperKit (yes, until recently Docker ran on a headless VirtualBox). Don't expect them to do everything at once.
Also, it’s not a problem for interpreted or byte-code languages. With NodeJS, you don't care what platform and OS it is: you just feed your TS code to the platform's node binary, and the bundler can run on any platform.
The problem comes for real programmers working with Go, Rust, C++ and C (ok ok, no more jokes about JS developers).
Multi-stage Docker image
Don't want your Docker image to be elephant-sized? Neither do we.
Luckily, Golang has fantastic cross-compile capabilities out of the box.
First, let's split our Dockerfile into multiple stages. With the first, single-stage Dockerfile, all our source code ends up inside the final image, which can cause problems later. That is not a good idea unless you are a JS developer (could not keep it inside, sorry 😁).
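A sketch of the split (image tags, paths, and the app name are placeholders):

```dockerfile
# Stage 1: "builder" has the complete Go toolchain
FROM golang:1.21 AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a static binary that runs on Alpine (musl libc)
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: tiny runtime image, no sources, no toolchain
FROM alpine:3.19
COPY --from=builder /out/app /usr/local/bin/app
CMD ["app"]
```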
Now you have two stages. The first, called builder, is based on the Golang image, which includes the complete Go toolchain. But the second stage is a naked Alpine image: extremely lightweight, only around 3 MB!
The second benefit is that the final Docker image will not contain anything from the first stage. We are not packing any build artifacts; all we copy is the Go binary:
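In Dockerfile terms, that single copy could be (paths are illustrative):

```dockerfile
COPY --from=builder /out/app /usr/local/bin/app
```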
And the Golang binary is also small, just a few MB.
As a result, our final Docker image will be 10 to 20 MB! A typical NodeJS container is more than 1000 MB if the developer is not experienced with Docker; an experienced one can reduce it to 100 to 200 MB.
Why does size matter? Because smaller images let your orchestration service pull and start containers much faster while using fewer resources. You won't notice the difference with one or two instances running, but with hundreds or thousands…
Go cross-compile: Making multi-architecture builds simple
Can we avoid using QEMU? We need a binary compiled for the target platform, which we then copy into the target image.
We can do that because Go can cross-compile, but we need to consider the target architecture.
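Outside Docker, cross-compiling Go is just a matter of environment variables (the output names here are examples):

```shell
# Build Linux binaries for both architectures from any host
GOOS=linux GOARCH=amd64 go build -o app-amd64 .
GOOS=linux GOARCH=arm64 go build -o app-arm64 .
```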
Only two lines need to change:
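In the builder stage, the two changes could look like this (a sketch; the app name and paths are placeholders):

```dockerfile
# Change 1: pin the builder stage to the build host's own platform,
# so the Go toolchain always runs natively, without QEMU
FROM --platform=$BUILDPLATFORM golang:1.21 AS builder

# Change 2: cross-compile for the requested target instead
ARG TARGETOS TARGETARCH
RUN GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /out/app .
```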
As you can see, we take TARGETOS and TARGETARCH, the build arguments that buildx provides automatically, and feed them to the Go compiler as the GOOS and GOARCH environment variables. Go can compile for a different operating system and architecture than the one it runs on.
It works because Docker runs the builder stage on the native architecture of the build host.
When Docker hits this line:
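presumably the ARG declaration that introduces the per-target values:

```dockerfile
ARG TARGETOS TARGETARCH
```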
it will split the execution between all the requested OSs and architectures. In our case, "go mod download" will run once, but "go build" will run twice: once for amd64 and once for arm64.
The second stage is built for each target as long as there is a base image for that target: Alpine supports seven major architectures but only one platform, Linux. The main reason is that Alpine is itself a Linux distribution.
Personally, I have not seen many Docker images targeting non-Linux platforms such as Windows, but Windows-specific images do exist (SQL Server, for example), so this approach might work for other operating systems too.
One more note: when you build Go natively, CGO is enabled, but when you cross-compile Go, CGO is disabled by default. And there is a reason for that: CGO is not Go.
If you need CGO, just add the following line:
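Something like this in the builder stage (note: cross-compiling with CGO also requires a C cross-compiler for each target, which this sketch does not set up):

```dockerfile
ENV CGO_ENABLED=1
```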
Here’s what I built with QEMU in a GitHub Action:
As you can see, the native platform builds in 3 minutes, while the non-native one (arm64, on the Actions runner) takes 39 minutes.
And this is what it looks like for the cross-compile build:
Only four minutes, without a cache. The next run will be even faster, thanks to the cache.
Final tips for smooth sailing
Docker is the superstar of DevOps, but it's not without its quirks. To make life easier:
1. Slice and dice your Docker images with multi-stage magic.
2. Let Go do the heavy lifting with cross-compilation - it's like baking different cakes for everyone!
3. Don't forget CGO, if you have to use it.
4. Say hello to multi-platform Docker images, they're the cool kids in town, catering to all major architectures.
With these tricks up your sleeve, Docker deployments will be a breeze.
Hugs to everyone and happy coding, even for JS developers.