When you visit 99designs.com, you're interacting with a collection of web applications running as Docker containers on AWS EC2 machines. We're continually improving those applications, deploying updates many times per day. Our Buildkite continuous integration (CI) setup automatically builds and tests each update as a new Docker image, pushing it to a private Docker registry and deploying it to our production servers. Development of 99designs also happens in Docker containers, using images pulled from a private Docker registry.

Having tried a few Docker registry providers, we'd settled on Docker Hub, run by Docker, Inc. But when AWS announced general availability of their own EC2 Container Registry (ECR), we were interested in the idea of having our Docker registry running on the same provider as the rest of our infrastructure, so that:

  • IAM can be used for access control to our images,
  • fewer third-party organizations have access to our intellectual property,
  • the chance of hacks/leaks impacting us is reduced.

ECR differs from Docker Hub in a number of ways, and one of those differences prompted us to build a new tool that I'll introduce below.

So let's step through some of those differences.

Image naming

Docker image names range from short and simple to long and complex, for example:

  • ubuntu - an official image (no username), and implicitly the latest tag.
  • ubuntu:16.04 - the 16.04 tag of the same image.
  • jess/chrome - an image called chrome published by user jess.
  • jess/chrome:beta - the beta tag of the same image.

None of the above images specify a registry hostname, so they're assumed to be on Docker Hub… one of the benefits of being Docker, Inc.

  • quay.io/username/repo
  • quay.io/username/repo:stable

The above images are much like Docker Hub ones, but on Quay, a third-party private registry.

  • 727283191883.dkr.ecr.us-east-1.amazonaws.com/example
  • 727283191883.dkr.ecr.us-east-1.amazonaws.com/example:release-production-abd9295-f9b8272

The above are image names (untagged and tagged) on Amazon ECR. They're fine for automated processes, but unwieldy for humans in development environments.
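
To make the naming rules above concrete, here is a minimal sketch of how a name can be split into registry, repository and tag. It assumes the usual Docker convention that the first path component is a registry hostname only if it contains a dot or a colon (or is localhost), and that a missing tag means latest. The function name parse_image_name and the ImageRef type are made up for this illustration, and digest references (name@sha256:...) are not handled.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_name(name: str) -> ImageRef:
    """Split a Docker image name into (registry, repository, tag).

    Illustrative helper only; it ignores digest references.
    """
    first, slash, rest = name.partition("/")
    # A leading component counts as a registry host only if it looks like one.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        # No hostname given, so the image is assumed to be on Docker Hub.
        registry, remainder = "docker.io", name

    repository, colon, tag = remainder.rpartition(":")
    if not colon or "/" in tag:
        # No tag present: fall back to the implicit "latest" tag.
        repository, tag = remainder, "latest"
    return ImageRef(registry, repository, tag)
```

With this sketch, jess/chrome:beta resolves to Docker Hub with repository jess/chrome and tag beta, while the long ECR name resolves to the 727283191883.dkr.ecr.us-east-1.amazonaws.com registry with repository example.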

Authentication

Authentication to private Docker registries is normally done with docker login which writes the credentials to ~/.docker/config.json, where they're used for subsequent push/pull operations for that registry.

As a bridge between this mechanism and AWS IAM, the AWS Command Line Interface has an aws ecr get-login command which, assuming the requesting AWS user/role has the correct access, returns a ready-to-run docker login ... command with generated credentials built in.

The generated credentials expire in twelve hours, after which new credentials must be requested. As with the complex image names, this is fine for automated processes but unwieldy for development environments.
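
For the curious, the credentials behind that generated docker login command come from ECR's authorization token, which is a base64-encoded user:password pair accompanied by an expiry time. The sketch below shows how such a token could be unpacked, and how a development tool might decide when to request a fresh one. The helper names and the ten-minute safety margin are assumptions made for illustration, not part of any AWS tooling.

```python
import base64
from datetime import datetime, timedelta
from typing import Tuple


def decode_registry_token(authorization_token: str) -> Tuple[str, str]:
    """Decode a base64 'user:password' token into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    # Split on the first colon only, in case the password contains colons.
    username, _, password = decoded.partition(":")
    return username, password


def needs_refresh(expires_at: datetime, now: datetime,
                  margin: timedelta = timedelta(minutes=10)) -> bool:
    """Return True once the token is within `margin` of expiring (or has expired)."""
    return now >= expires_at - margin
```

A wrapper script could call needs_refresh before each pull and only re-run the login step when it returns True, which takes some of the sting out of the twelve-hour expiry.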

Image storage limits

It's common to continually push new images with new tags to a Docker repository, e.g. build-20170303-152000, build-20170303-153100, etc. Even continually pushing to a single latest tag may lead to unbounded storage of untagged images.

Docker Hub seems to brush this under the carpet, presumably wearing the cost for now. AWS ECR, however, defaults to a limit of 1,000 images per repository. It's possible to request a limit increase, but this highlights the reality that image storage needs to be accounted for eventually.

Introducing aws-ecr-gc

Our solution to staying under the ECR image limit while keeping a healthy number of previous image tags is aws-ecr-gc. It assumes that related tags in a repository will have a common prefix. For example, a CI repository may contain build-a92d, build-71ba and build-321d, as well as release-latest, release-previous, release-a92d, release-71ba, etc.

Given a list of tag prefixes, e.g. build and release, aws-ecr-gc deletes all but the newest N images matching each of those prefixes. Images with tags not matching the listed prefixes are not deleted. Optionally, untagged images are also deleted.

Example: delete all untagged images, delete all but the latest 4 images with tags starting with release-production, and delete all but the latest 8 images with tags starting with build:

$ export AWS_DEFAULT_REGION=us-east-1
$ aws-ecr-gc --repo testrepo --delete-untagged=true --keep release-production=4 --keep build=8
Total images in testrepo (us-east-1): 47
Images to delete (3)
  2017-03-20 03:51:41: sha256:2a1fce5b2... [build-64cd372]
  2017-03-17 17:12:07: sha256:4fe1451fc... [build-1d293f7]
  2017-03-17 16:58:15: sha256:e0a2a1b4f... [build-6d12484]
Deleted (3)
  sha256:2a1fce5b2... (build-64cd372)
  sha256:4fe1451fc... (build-1d293f7)
  sha256:e0a2a1b4f... (build-6d12484)
Failures (0)

Conclusion

AWS ECR is great for automated build and deploy processes, but less convenient for people working with the Docker images. So we've moved our CI and deployment processes from Docker Hub to ECR, but left our developer-facing Docker images on Docker Hub for simpler authentication and image naming.

Today we're releasing aws-ecr-gc under the MIT open source license. Adding it as a CI build step cleans up old images while keeping some recent releases in case rollback or debugging is required.

Check out aws-ecr-gc on GitHub.