The now–finalized HTTP/2 specification has rightfully garnered a lot of interest from the web performance community. The new protocol is aimed at addressing common network performance issues with the aging HTTP/1.x protocol, whilst preserving the existing semantics.

We began a small-scale rollout for static assets earlier this year. After building confidence in our new infrastructure, we began transitioning our static assets to HTTP/2. Surprisingly, some sections of our platform felt noticeably slower. This post will cover our investigation into the performance regressions we experienced by adopting HTTP/2.

Our story isn’t the tale of effortless performance gains typically associated with HTTP/2. We hope sharing our sobering experience will help to balance the discussion.

Why HTTP/2?

For better or worse, the story of HTTP/2 has become tied to notions of free performance and how it will make everything we know about web performance wrong.

In reality the performance story of HTTP/2 is one of nuances.

Unlike HTTP/1.x, which handles one request at a time per connection and so pushes browsers to open several connections per hostname, HTTP/2 creates at most a single connection per hostname. That connection carries multiple multiplexed streams over a binary framing protocol. The binary framing is responsible for matching multiple concurrent requests to responses.

Figure: diagram of HTTP/2’s binary framing protocol. Slide from Ilya Grigorik’s presentation “HTTP/2 is here, let’s optimize!”
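To make the framing idea concrete, here is a minimal Python sketch that interleaves DATA frames from two streams on one byte buffer and then regroups them by stream ID. The 9-byte header layout (24-bit length, 8-bit type, 8-bit flags, 1 reserved bit plus a 31-bit stream identifier) comes from the HTTP/2 specification; the helper names `pack_frame` and `split_by_stream` and the payloads are hypothetical, and the sketch ignores everything else a real implementation needs (header compression, flow control, padding, error handling).

```python
import struct
from collections import defaultdict

FRAME_HEADER_LEN = 9  # fixed header size: 3 (length) + 1 (type) + 1 (flags) + 4 (stream id)
DATA = 0x0            # frame type code for DATA frames


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Serialize one frame (hypothetical helper, illustration only)."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        length >> 16, length & 0xFFFF,  # 24-bit payload length split as 8 + 16 bits
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,         # top bit is reserved and left as zero
    )
    return header + payload


def split_by_stream(data: bytes) -> dict:
    """Walk back-to-back frames and concatenate payloads per stream id."""
    streams = defaultdict(bytes)
    offset = 0
    while offset + FRAME_HEADER_LEN <= len(data):
        hi, lo, _type, _flags, raw_id = struct.unpack_from(">BHBBI", data, offset)
        length = (hi << 16) | lo
        stream_id = raw_id & 0x7FFFFFFF
        start = offset + FRAME_HEADER_LEN
        streams[stream_id] += data[start:start + length]
        offset = start + length
    return dict(streams)


# Two responses interleaved on one connection (payloads are made up).
wire = (
    pack_frame(DATA, 0, 1, b"css-part-1 ")
    + pack_frame(DATA, 0, 3, b"img-part-1 ")
    + pack_frame(DATA, 0, 1, b"css-part-2")
    + pack_frame(DATA, 0, 3, b"img-part-2")
)
reassembled = split_by_stream(wire)
```

The stream identifier in every frame header is what lets the receiver match interleaved chunks back to the right request.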

Because a connection is no longer limited to one transaction at a time, head-of-line blocking is largely eliminated. Creating fewer connections also means a reduced sensitivity to latency and TCP congestion controls. In combination, these properties can result in big performance wins because they reduce the volume and duration of round trips between server and client.

Figure: graph comparing decreases in page load times relative to decreases in bandwidth vs. latency. Source: https://www.igvita.com/2012/07/19/latency-the-new-web-performance-bottleneck

Monitoring the performance of HTTP/2

We use Calibre for synthetic monitoring of end-user performance, collecting a diverse set of metrics. We push a small subset of this data to highly–visible Geckoboards throughout our offices.

Figure: performance dashboards 1 and 2. Etsy–inspired performance dashboards in the Melbourne office.

We used the following metrics as proxies for user-perceived page load performance and the success of HTTP/2. We chose these specific metrics because they’re affected by different aspects of the page load life cycle.

  • The DOMContentLoaded event is delayed by synchronous scripts.
  • The time to first paint is delayed by render–blocking resources like CSS and fonts.
  • The time to visually complete is delayed by non–render–blocking resources like images and potentially asynchronous scripts.
  • Speed Index is affected by the rate of visual completion over time.
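Of these metrics, Speed Index is the least self-explanatory. It is commonly described as the area above the visual completeness curve, so a page that paints most of its content early scores lower (better) than one that paints everything at the end, even if both finish at the same time. The sketch below assumes filmstrip-style samples of `(time_ms, percent_complete)` and treats completeness as constant between samples; the sample data is invented for illustration.

```python
def speed_index(frames):
    """Approximate Speed Index from (time_ms, percent_complete) samples.

    Assumes frames are sorted by time, start at t=0 and end at 100%.
    Completeness is treated as a step function between samples.
    """
    area = 0
    for (t0, c0), (t1, _c1) in zip(frames, frames[1:]):
        area += (100 - c0) * (t1 - t0)  # "unfinished" percentage times interval length
    return area / 100


# Two hypothetical page loads that both reach visual completion at 3000 ms.
paints_early = [(0, 0), (1000, 80), (3000, 100)]
paints_late = [(0, 0), (1000, 10), (3000, 100)]

early_score = speed_index(paints_early)  # 1400.0
late_score = speed_index(paints_late)    # 2800.0
```

This is why Speed Index and time to visual completion can move independently: the first rewards the rate of progress, the second only the end point.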

Testing and verifying the success of HTTP/2

We started by migrating our image thumbnail CDN to CloudFlare, which provides HTTP/2 out of the box. Initial benchmarks showed CloudFlare’s latency and response times to be comparable to those of our existing CDN.

As a design marketplace most of our pages are image–centric, commonly requiring upwards of 50 images.

With HTTP/1.x, pages with many small resources are adversely affected by even minor changes in connection latency. For such latency-bound pages we expected visual completion to be reached faster. How much faster would depend on the connection latency and number of images. We expected this trend to continue on high latency, low bandwidth 3G connections.

For bandwidth–bound pages we expected to see no appreciable change.
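The reasoning behind these expectations can be sketched with a deliberately crude back-of-envelope model. It assumes HTTP/1.x fetches images in rounds of six parallel requests, each round costing one round trip, while HTTP/2 pays roughly one round trip for all requests; transfer time is simply total bytes over bandwidth. It ignores connection setup, TLS, TCP slow start, and prioritization, and the page sizes, latency, and bandwidth below are hypothetical numbers, not our measurements.

```python
import math


def http1_estimate(n, avg_bytes, rtt_s, bandwidth_bps, connections=6):
    """Rough load time: one round trip per batch of parallel requests, plus transfer."""
    rounds = math.ceil(n / connections)
    return rounds * rtt_s + n * avg_bytes * 8 / bandwidth_bps


def http2_estimate(n, avg_bytes, rtt_s, bandwidth_bps):
    """Rough load time: all requests multiplexed in about one round trip, plus transfer."""
    return rtt_s + n * avg_bytes * 8 / bandwidth_bps


def saving(page):
    """Fraction of the HTTP/1.x estimate that the HTTP/2 estimate saves."""
    return 1 - http2_estimate(**page) / http1_estimate(**page)


# Hypothetical pages on a 10 Mbit/s link with 100 ms round trips.
latency_bound = dict(n=50, avg_bytes=20_000, rtt_s=0.1, bandwidth_bps=10_000_000)
bandwidth_bound = dict(n=80, avg_bytes=125_000, rtt_s=0.1, bandwidth_bps=10_000_000)

for name, page in [("latency-bound", latency_bound), ("bandwidth-bound", bandwidth_bound)]:
    print(f"{name}: estimated saving {saving(page):.0%}")
```

In this model the round-trip term dominates the page with many small images, while the transfer term swamps it on the heavy page, which is the intuition behind expecting little change for bandwidth-bound pages.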

Enabling HTTP/2 for images alone doesn’t affect head–of–line blocking, so we didn’t expect any changes in time to first paint or DOMContentLoaded.

Reality

The results weren’t as clear cut. Below I’ll dig into some of the nuances, surprises and future considerations for our HTTP/2 rollout.

Testing

We enabled the HTTP/2 CDN behind a feature flag and over the next week we recorded approximately 100 Calibre snapshots with and without the new CDN. The Calibre Chrome agents are US-based with low–latency, high–bandwidth connections.

Case study: Designer portfolio

Designer portfolios are representative of a typical latency–bound 99designs page. Here we observed a 5% improvement in Speed Index and time to visual completion.

The time to first paint was comparable, but interestingly the initial render was more complete with HTTP/2.

Figure: filmstrip comparing HTTP/1.x and HTTP/2 page load performance. More complete initial render with images served over HTTP/2.

Case study: Discover design gallery

Our Discover design galleries are representative of the extremes of our platform. With 80 images, weighing in at around 10 MB per page, these pages are bandwidth–bound, so the effects from the reduction in latency should be marginal. As such, we expected no noticeable change in performance.

What we actually observed was a 5–10% regression on average in time to visual completion and Speed Index. Overall page load time had decreased, however, suggesting that we were still benefiting from reduced connection latency.

Figure: filmstrip comparing HTTP/1.x and HTTP/2 page load performance. Delayed first paint and visual completion times for HTTP/2.

High latency testing

For gathering data on high latency connections we used WebPagetest with a 3G connection profile.

Initial paints continued to be more complete, but happened noticeably later. The overall page load continued to occur earlier.

Counter–intuitively, visual completion was delayed by an average of 15% for Designer profiles and 25% for Discover.

TL;DR:

For a typical image–rich, latency–bound page using a high–speed, low–latency connection, visual completion was achieved 5% faster on average.

For an extremely image–heavy, bandwidth–bound page using the same connection, visual completion was achieved 5–10% slower on average.

On a high–latency, low–speed connection we saw significant delays for pages to reach visual completion.

In all tests we saw overall page load times improved, and more complete initial paints.

The postmortem

The data collected left us with one big question:

When using HTTP/2, our bandwidth-bound pages take significantly longer to reach visual completion despite loading faster. Why is this?

Hypothesis 1: network saturation

HTTP/1.x traffic is bursty in nature due to opening many short-lived connections. This behavior is responsible for the network waterfall seen in dev tools.

Figure: HTTP/1.x staggered network waterfall.

Initially we thought a single, long-lived TCP connection loading megabytes of image data could be starving layout-blocking resources like CSS, JS or fonts of bandwidth.

However, the network waterfalls didn’t reveal any changes to the loading behavior of layout-blocking resources.

Hypothesis 2: altered loading priority

When using HTTP/1.x, browsers have a limit of approximately six simultaneous open connections to an origin. As resources are discovered, they’re added to a FIFO resource download queue. Limiting the number of open connections to an origin creates an implicit loading priority of resources.

Each queued resource represents a request–response round trip to an origin that must be completed before the resource can leave the queue. This behavior is what we know as the network waterfall.
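The staggered waterfall falls out of this queueing behavior almost mechanically. The sketch below is a simplified model, not how any particular browser is implemented: it assumes six connections, a fixed duration per resource (round trip plus transfer), and strict first-in, first-out dispatch. The function name and the durations are hypothetical.

```python
import heapq


def fifo_waterfall(durations_ms, max_connections=6):
    """Return (start_ms, finish_ms) for each resource, in discovery order."""
    free_at = [0] * max_connections  # time at which each connection becomes idle
    heapq.heapify(free_at)
    schedule = []
    for duration in durations_ms:
        start = heapq.heappop(free_at)   # earliest idle connection takes the next resource
        finish = start + duration
        heapq.heappush(free_at, finish)
        schedule.append((start, finish))
    return schedule


# Eight hypothetical images that each take 100 ms to fetch.
schedule = fifo_waterfall([100] * 8)
for index, (start, finish) in enumerate(schedule, 1):
    print(f"image {index}: {start:>4} ms -> {finish:>4} ms")
```

The first six requests start immediately and the seventh and eighth wait for a connection to free up, which is the stair-step pattern seen in dev tools.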

HTTP/2’s framing protocol lets the browser stitch together multiple requests and responses, so we lose the document order priority queue.

Figure: comparison between HTTP/1.x and HTTP/2 network waterfalls. Discover page network timeline for HTTP/1.x and HTTP/2.

We considered that the HTTP/1.x best practice of putting <script> at the end of the document could now be doing us harm.

Was everything we know about performance actually wrong?

However, comparable DOMContentLoaded times ruled out that theory. Network waterfalls confirmed that layout-blocking resources were being prioritized over images.

In practice, the resources in the browser’s download queue are prioritized. This is why starting 80 image requests before finding the <script> at the bottom of the page doesn’t delay the loading of the script.

The exact loading behavior of resources is undocumented, unspecified, and constantly changing. However, in most, if not all, browsers, images have a lower priority than CSS, JavaScript and fonts.
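A prioritized download queue of this kind can be sketched in a few lines of Python. The priority table is purely an assumption for illustration: it encodes only the one rule stated above (images rank below CSS, JavaScript and fonts) and breaks ties by discovery order. Real browser heuristics are more elaborate and, as noted, undocumented.

```python
import heapq
from itertools import count

# Hypothetical priority classes; lower numbers download first.
PRIORITY = {"css": 0, "script": 0, "font": 0, "image": 1}


def download_order(discovered):
    """Order (url, kind) pairs by priority class, then by discovery order."""
    sequence = count()
    heap = [(PRIORITY[kind], next(sequence), url) for url, kind in discovered]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


# Document order: stylesheet in the head, images in the body, script at the end.
discovered = [
    ("site.css", "css"),
    ("thumb-1.jpg", "image"),
    ("thumb-2.jpg", "image"),
    ("app.js", "script"),
]
order = download_order(discovered)
# order == ["site.css", "app.js", "thumb-1.jpg", "thumb-2.jpg"]
```

Even though the script is discovered last, it jumps ahead of the queued images, which matches what the network waterfalls showed.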

Hypothesis 3: the stream

Without the simultaneous connection limit of HTTP/1.x, the browser was free to request all 80 images at once. The server would then respond to all those image requests simultaneously, and the browser would draw them as they finished downloading. We could confirm this behavior from the network timeline.

Figure: HTTP/2 network waterfall for Discover images. Discover page HTTP/2 network timeline of the first 20 image requests.

The images were still being requested in the document order. Smaller images, however, would finish downloading faster and were therefore rendered sooner. If a larger image happened to be in the initial viewport it would take longer to load, delaying the visual completion.

Figure: GIFs of the Discover page load with HTTP/1.x and with HTTP/2. Comparing Discover HTTP/1.x vs HTTP/2 page load on 3G.

This also explained why time to visual completion took longer as bandwidth got more constrained and had such large variances.
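A toy simulation shows the effect. It assumes the server splits bandwidth equally between all in-flight responses, which is a simplification of whatever a given HTTP/2 server actually does, and compares that against delivering the same images one at a time in document order. The image sizes and bandwidth are invented, with the largest image placed first as if it sat in the initial viewport.

```python
def shared_completion_times(sizes_kb, bandwidth_kb_per_s):
    """Finish time per image when all start together and share bandwidth equally."""
    by_size = sorted(range(len(sizes_kb)), key=lambda i: sizes_kb[i])
    finish = [0.0] * len(sizes_kb)
    now = 0.0
    delivered = 0.0          # KB received so far by every still-active image
    active = len(sizes_kb)
    for i in by_size:
        # All active images advance together until image i is complete.
        now += (sizes_kb[i] - delivered) * active / bandwidth_kb_per_s
        delivered = sizes_kb[i]
        finish[i] = now
        active -= 1
    return finish


def sequential_completion_times(sizes_kb, bandwidth_kb_per_s):
    """Finish time per image when delivered one at a time in document order."""
    finish, total = [], 0
    for size in sizes_kb:
        total += size
        finish.append(total / bandwidth_kb_per_s)
    return finish


sizes_kb = [300, 50, 50, 100]  # hypothetical; large hero image is first in the document
shared = shared_completion_times(sizes_kb, 100)          # [5.0, 2.0, 2.0, 3.0]
sequential = sequential_completion_times(sizes_kb, 100)  # [3.0, 3.5, 4.0, 5.0]
```

Both approaches finish the whole page at the same moment, but under equal sharing the large image at the top of the page is the last to complete instead of the first. The lower the bandwidth, the longer that gap becomes.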

The HTTP/2 fine print

The problem we’re experiencing with the stream is actually a big feature of HTTP/2 that isn’t talked about much.

Ilya Grigorik said it best:

“With HTTP/2 the browser relies on the server to deliver the response data in an optimal way.

It’s not just the number of bytes, or requests per second, but the order in which bytes are delivered. Test your HTTP/2 server very carefully”

Traditionally resources were requested in document order with some heuristics added by browsers to improve performance. This approach has some big problems:

  • the heuristics are undocumented
  • the heuristics differ between browsers
  • the heuristics differ between browser versions
  • the heuristics are general for all sites

Changes to these heuristics would cause page performance to change suddenly, without warning.

HTTP/2 changed the landscape for resource prioritization — the responsibility is now shared between the browser and the server. The browser gives the server hints about priority but it’s the server that’s in charge of what order the bytes are delivered.

This power shift is a double-edged sword.

Resource prioritization heuristics existing in both the server and the client can make the situation even more opaque and fragile. However, making the server authoritative opens avenues for putting developers in charge.
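One of the hints a browser can send is a per-stream weight. In the HTTP/2 specification (RFC 7540) weights range from 1 to 256, and streams that share the same parent are meant to receive resources in proportion to their weights. The sketch below only computes those ideal proportions; the stream IDs and weights are hypothetical, and a server remains free to deliver bytes in a different order.

```python
def bandwidth_shares(weights):
    """Ideal share of bandwidth for sibling streams, proportional to weight.

    weights: {stream_id: weight}, each weight an integer from 1 to 256.
    """
    for stream_id, weight in weights.items():
        if not 1 <= weight <= 256:
            raise ValueError(f"stream {stream_id}: weight {weight} outside 1..256")
    total = sum(weights.values())
    return {stream_id: weight / total for stream_id, weight in weights.items()}


# Hypothetical hints: a stylesheet, a script and an image on one connection.
shares = bandwidth_shares({1: 192, 3: 48, 5: 16})
# shares == {1: 0.75, 3: 0.1875, 5: 0.0625}
```

Whether the bytes on the wire actually follow these proportions depends entirely on the server, which is why testing the server's delivery order matters.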

Takeaways

Our investigation found that there is no such thing as free performance — something browser vendors have known for a long time.

The pursuit of web performance is one of tradeoffs and nuance.

In image–heavy pages like those we studied, the tipping point for preferring a multiplexed HTTP/2 connection over multiple HTTP/1.x connections is when latency approaches the average download time for the images. At the right mix of high latency and low bandwidth, we could see big wins with HTTP/2 for smaller images.

HTTP/2 implementations are young, and the surface area of the protocol is big:

  • resource weighting prioritization
  • resource dependency prioritization
  • multiplexing heuristics
  • stream and connection flow control

We can expect to see tweaks, optimizations, and inevitably bugs at all these layers over the next couple of years. It’s important we understand the motivations and tradeoffs of new technology so we can accurately separate hype from value.

A note on image optimization

In the case of Discover, improved image optimization would undoubtedly decrease the time to visual completion for both HTTP/1 and HTTP/2 users. Whether HTTP/2 users would see a greater benefit is not clear cut. We also need to consider the effort and overhead of adopting an image optimization solution for our platform.

As a design marketplace, the quality of our images is a critical concern. We take extreme care because image artifacts from poor or overzealous optimization negatively affect the user experience.

We frequently evaluate the trade–off between stack complexity, cost of processing high volumes of previews, and opportunity cost of other experience improvements for users. Currently edge caching and low–overhead delivery, such as HTTP/2, strike the right balance as we investigate suitable responsive solutions.

Thanks

This post was made possible by the technical editing of Andrew Krespanis and Ben Schwarz. Thank you.