When you visit 99designs.com, you're interacting with a collection of web applications running as Docker containers on AWS EC2 machines. We're continually improving those applications, deploying updates many times per day. Our Buildkite continuous integration (CI) setup automatically builds and tests each update as a new Docker image, pushing it to a private Docker registry and deploying it to our production servers. Development of 99designs also happens in Docker containers, using images pulled from a private Docker registry.
Having tried a few Docker registry providers, we've landed on Docker Hub, run by Docker, Inc. But when AWS announced general availability of their own EC2 Container Registry (ECR), we were interested in the idea of having our Docker registry running on the same provider as the rest of our infrastructure, so that:
- IAM can be used for access control to our images,
- fewer third-party organizations have access to our intellectual property,
- the chance of hacks/leaks impacting us is reduced.
ECR differs from Docker Hub in a number of ways, and one of those differences prompted us to build a new tool that I'll introduce below.
So let's step through some of those differences.
Docker image names range from short and simple to long and complex, for example:
- ubuntu - an official image (no username), and implicitly the latest tag.
- ubuntu:16.04 - the 16.04 tag of the same image.
- jess/chrome - an image called chrome published by user jess.
- jess/chrome:beta - the beta tag of the same image.
None of the above images specify a registry hostname, so they're assumed to be on Docker Hub… one of the benefits of being Docker, Inc.
The above images are much like Docker Hub ones, but on Quay, a third-party private registry.
The above are image names (untagged and tagged) on Amazon ECR. They're fine for automated processes, but unwieldy for humans in development environments.
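To make the naming rules concrete, here is a minimal Python sketch of how an image name splits into registry, repository and tag. It is an illustration rather than Docker's actual parser: the function name parse_image_name is made up for this post, digest references (name@sha256:...) are ignored, and the Quay and ECR names in the usage example are hypothetical placeholders.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_name(name: str) -> ImageRef:
    """Split a Docker image name into registry, repository and tag.

    Illustrative only: digest references (name@sha256:...) are not handled.
    """
    # The first path component is treated as a registry hostname only if it
    # looks like one: it contains a dot or a port, or is exactly "localhost".
    first, slash, rest = name.partition("/")
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        # No hostname given, so the image is assumed to be on Docker Hub.
        registry, remainder = "docker.io", name

    # A tag follows the last colon; without one, "latest" is implied.
    repository, colon, tag = remainder.rpartition(":")
    if not colon or "/" in tag:
        repository, tag = remainder, "latest"

    # Official Docker Hub images live under the implicit "library" namespace.
    if registry == "docker.io" and "/" not in repository:
        repository = "library/" + repository
    return ImageRef(registry, repository, tag)


if __name__ == "__main__":
    for example in (
        "ubuntu",
        "jess/chrome:beta",
        "quay.io/example/app",  # hypothetical Quay image
        "123456789012.dkr.ecr.us-east-1.amazonaws.com/testrepo:build-321d",  # hypothetical ECR image
    ):
        print(example, "->", parse_image_name(example))
```

The hostname is the only part that distinguishes registries, which is why ECR names, carrying an account ID and region in the hostname, end up so much longer than Docker Hub ones.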
Authentication to private Docker registries is normally done with
docker login, which writes the credentials to
~/.docker/config.json, where they're
used for subsequent push/pull operations for that registry.
As a bridge between this mechanism and AWS IAM, the AWS Command Line
Interface has an
aws ecr get-login command which, assuming the
requesting AWS user/role has the correct access, returns a ready-to-run
docker login ... command with generated credentials built in.
The generated credentials expire in twelve hours, after which new credentials must be requested. As with the complex image names, this is fine for automated processes but unwieldy for development environments.
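As a rough sketch of what happens behind that generated command, the Python below turns an ECR authorization entry into docker login arguments and checks whether it is close to expiring. It assumes the entry has the shape returned by the ECR GetAuthorizationToken API (authorizationToken, expiresAt, proxyEndpoint), for instance via boto3; the function names, the 30-minute margin and the sample values are invented for illustration.

```python
import base64
from datetime import datetime, timedelta, timezone


def docker_login_args(auth: dict) -> list:
    """Turn one ECR authorizationData entry into docker login arguments."""
    # The token is base64("username:password"); for ECR the username is "AWS".
    decoded = base64.b64decode(auth["authorizationToken"]).decode("utf-8")
    username, _, password = decoded.partition(":")
    # Passing -p exposes the password in the process list; piping it to
    # `docker login --password-stdin` is safer in real use.
    return ["docker", "login", "-u", username, "-p", password, auth["proxyEndpoint"]]


def needs_refresh(auth: dict, now: datetime,
                  margin: timedelta = timedelta(minutes=30)) -> bool:
    """Report whether the credentials expire within `margin` of `now`."""
    return auth["expiresAt"] - now <= margin


if __name__ == "__main__":
    # Fabricated sample entry; a real one would come from the ECR API.
    issued = datetime(2017, 3, 20, 0, 0, tzinfo=timezone.utc)
    sample = {
        "authorizationToken": base64.b64encode(b"AWS:not-a-real-password").decode(),
        "expiresAt": issued + timedelta(hours=12),
        "proxyEndpoint": "https://123456789012.dkr.ecr.us-east-1.amazonaws.com",
    }
    print(" ".join(docker_login_args(sample)))
    print("refresh needed:", needs_refresh(sample, issued + timedelta(hours=11, minutes=45)))
```

A CI job can simply request fresh credentials on every run; it is the long-lived developer machine that has to remember to do a check like needs_refresh.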
Image storage limits
It's common to continually push new images with new tags to a Docker
registry, for example build-20170303-153100, etc. Even
continually pushing to a single
latest tag may lead to unbounded storage of
older, now-untagged images.
Docker Hub seems to brush this under the carpet, presumably wearing the cost for now. AWS ECR, however, defaults to a limit of 1,000 images per repository. It's possible to request a limit increase, but this highlights the reality that image storage needs to be accounted for eventually.
Our solution to staying under the ECR image limit while keeping a healthy
number of previous image tags is aws-ecr-gc. It assumes that
related tags in a repository will have a common prefix. For example, a CI
repository may contain tags such as
build-321d as well as tags starting with release-production.
Given a list of tag prefixes, e.g. build and release-production, each with a
count N, it deletes all but the newest
N images matching each prefix. Images with tags not
matching the listed prefixes are not deleted. Optionally, untagged images are
deleted as well.
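The selection rule described above can be sketched in a few lines of Python. This is not the aws-ecr-gc source, just an illustration under assumptions: the Image type and the images_to_delete function are hypothetical, images are ordered by push time, and an image that matches several prefixes is kept if any one of them would keep it.

```python
from datetime import datetime
from typing import Dict, List, NamedTuple


class Image(NamedTuple):
    digest: str
    pushed_at: datetime
    tags: List[str]


def images_to_delete(images: List[Image], keep: Dict[str, int],
                     delete_untagged: bool = False) -> List[Image]:
    """Select images to delete, keeping the newest N per tag prefix."""
    kept, expired = set(), set()
    for prefix, count in keep.items():
        # Newest first among images carrying at least one tag with this prefix.
        matching = sorted(
            (img for img in images if any(t.startswith(prefix) for t in img.tags)),
            key=lambda img: img.pushed_at,
            reverse=True,
        )
        kept.update(img.digest for img in matching[:count])
        expired.update(img.digest for img in matching[count:])

    # Assumption: being kept under any prefix protects an image from deletion.
    doomed = expired - kept
    if delete_untagged:
        doomed.update(img.digest for img in images if not img.tags)

    # Images whose tags match no listed prefix never enter `doomed`.
    return [img for img in images if img.digest in doomed]


if __name__ == "__main__":
    repo = [
        Image("sha256:aaa", datetime(2017, 3, 17, 16, 58), ["build-1"]),
        Image("sha256:bbb", datetime(2017, 3, 17, 17, 12), ["build-2"]),
        Image("sha256:ccc", datetime(2017, 3, 20, 3, 51), ["build-3"]),
        Image("sha256:ddd", datetime(2017, 3, 20, 4, 0), []),
    ]
    for img in images_to_delete(repo, {"build": 2}, delete_untagged=True):
        print(img.digest, img.tags)
```

Sorting by push time rather than by tag name means the tags themselves don't need to sort chronologically, which suits tags built from commit hashes.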
Example: delete all untagged images, delete all but the latest 4 images with
tags starting with
release-production, and delete all but the latest 8 images
with tags starting with build:
    $ export AWS_DEFAULT_REGION=us-east-1
    $ aws-ecr-gc --repo testrepo --delete-untagged=true --keep release-production=4 --keep build=8
    Total images in testrepo (us-east-1): 47
    Images to delete (3)
    2017-03-20 03:51:41: sha256:2a1fce5b2... [build-64cd372]
    2017-03-17 17:12:07: sha256:4fe1451fc... [build-1d293f7]
    2017-03-17 16:58:15: sha256:e0a2a1b4f... [build-6d12484]
    Deleted (3)
    sha256:2a1fce5b2... (build-64cd372)
    sha256:4fe1451fc... (build-1d293f7)
    sha256:e0a2a1b4f... (build-6d12484)
    Failures (0)
AWS ECR is great for automated build and deploy processes, but less convenient for people working with the Docker images. So we've moved our CI and deployment processes from Docker Hub to ECR, but left our developer-facing Docker images on Docker Hub for simpler authentication and image naming.
Today we're releasing
aws-ecr-gc under the MIT open source
license. Adding it as a CI build step cleans up old images while keeping some
recent releases in case a rollback or debugging is required.
aws-ecr-gc on GitHub.