Too Chunky: Performance and HTTP Chunked Encoding

Posted: May 15, 2012 at 4:25 pm

While debugging a customer issue this weekend, I uncovered a problem with chunked encoding in general, and ASP.NET in particular, that can reduce your website’s performance.

Let’s start with some background.

Digicure is a web security and performance services company in Denmark. They are also a Zoompf customer. At the end of last week, they contacted Zoompf support to tell us that some of our Zoompf WPO pages were timing out. Zoompf WPO is our web performance scanner delivered as a SaaS. Users log in to the web interface and can conduct performance scans, review scan results, and generate reports. Zoompf WPO’s web interface is written in ASP.NET. This is largely because our performance scanner is written in C#.

By default, ASP.NET does not use chunked encoding. When an HTML page is being dynamically generated, ASP.NET buffers all of the output, and sends all of the content at once. This response includes a Content-Length header because the entire response is created before being delivered to the client, so the web server knows how long it is. This is called the Store-and-Forward approach.

Store-and-Forward Vs. Chunked

Store-and-Forward is not necessarily bad. In fact, it’s how HTTP/1.0 transmits dynamic responses when Connection: Keep-Alive is used. But Store-and-Forward does not create the ideal user experience. This is because no content is sent to the user until the application tier has finished generating the markup. This means the user sees no content while the page is being generated. More importantly, the web browser doesn’t have any HTML yet, so it cannot start downloading other resources like CSS or JavaScript files while the HTML loads.

A better approach is chunked encoding. Chunked encoding was added in HTTP/1.1 and allows the web server to stream content to the client without having to know how large the content was ahead of time or having to close the connection when it’s done. We can see how the chunked encoding approach compares to Store-and-Forward in the figure below:

Chunked encoding is great, because the user starts getting content almost immediately. The application is faster because the browser can start to download other resources while the HTML is still being generated and streamed to the client.

Since chunked encoding can improve performance, I looked for appropriate places in Zoompf WPO’s web interface to use it. Now, there are a few areas of WPO where generating the HTML can take a long time. One is when WPO is generating the list of affected URLs for a specific performance issue. This list can be thousands of items long. Streaming this HTML to the client using chunked encoding provides a better user experience than completely generating the HTML and then delivering the page to the client. So I disabled output buffering for areas like that and ASP.NET uses chunked encoding to send pieces of HTML to a visitor as they are generated. I thought everything was fine.

Too Chunky

That is, until I heard from Digicure. They experienced pages which were loading slowly and would sometimes time out. Specifically, lists of the affected URLs were appearing very slowly. I did not see this behavior when I tested the application, and no other customers had this issue. A quick check showed the web server and database were not under excessive load. Network checks showed there was plenty of available bandwidth to transmit the data quickly.

I decided to see the traffic the web server was actually sending. However, I did not use an HTTP proxy because I wanted to make sure that trying to measure what was happening did not affect what was happening. Instead, I used Wireshark to capture the HTTP traffic between the browser and web server while fetching the slow pages. Here is what I saw:

2f <a name="affected"><h2>Affected URLs</h2></a>
12 <div class="nb">
17 <ul class="url_list">
4 <li>
9 <a href="
34 ShowResponse.aspx?scan=7519&amp;got=96&amp;check=300
2 ">
1a http://XXX.XXXXX.XXX/de/de
6 </a>
7 </li>
4 <li>
9 <a href="
36 ShowResponse.aspx?scan=7519&amp;got=1659&amp;check=300
2 ">
...

That shows some of the response body, encoded into chunks. The number at the start of each line is the chunk size in hexadecimal. The problem is each chunk is really small. As in, just a few bytes small. And there are so many chunks. Way too many. And then I noticed something with a sinking feeling: the way content was divided into chunks looked familiar. Oh crap, I know what the problem is. I went and looked at the source code generating this list:

if (!alreadyShown)
{
    fout.WriteLine("<div class=\"nb\">");
    fout.WriteLine("<ul class=\"url_list\">");
    foreach (IBasicItemInfo info in infos)
    {
        fout.Write("<li>");
        HtmlUtils.RenderInternalLink(fout, "ShowResponse.aspx?scan=" + scanID + "&got=" + info.ID + "&check=" + issueID, StringUtils.Truncate(info.Name, 256));
        fout.WriteLine("</li>");
    }
    fout.WriteLine("</ul>");
    fout.WriteLine("</div>");
}

See the problem? With buffered output disabled, anytime the application writes bytes to the response, those bytes are immediately sent to the client. Even if you are just writing a simple <li> tag! This HTML response is around 300 kilobytes, and ASP.NET is streaming that just a few bytes at a time. Really big pages sending all those chunks over a high latency connection, like the one to Digicure, are going to be slow.

There is another performance problem with overly “chunky” responses. Chunked encoding adds overhead. For each chunk, there are a few bytes to represent the length of the chunk, and then 4 bytes to represent two CRLF sequences. For small chunks, like sending an <li> tag, the overhead of the chunk is larger than the data in the chunk! For some pages Zoompf WPO was sending 75 kilobytes of chunked encoding overhead to transmit 300 kilobytes of data!
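To put numbers on that overhead, here is a minimal Python sketch. It assumes each chunk is framed only by its hexadecimal length, a CRLF, and a trailing CRLF (no chunk extensions), and it uses the chunk sizes visible in the capture above. The function name is made up for illustration.

```python
def chunk_overhead(payload_len: int) -> int:
    """Framing bytes for one chunk: hex length digits, CRLF, then CRLF after the data."""
    return len(format(payload_len, "x")) + 4


# Chunk sizes (hex) as they appear in the capture above.
capture = [0x2F, 0x12, 0x17, 0x4, 0x9, 0x34, 0x2, 0x1A, 0x6, 0x7]

payload = sum(capture)                              # 194 bytes of HTML
overhead = sum(chunk_overhead(n) for n in capture)  # 55 bytes of framing

print(f"{overhead} framing bytes for {payload} payload bytes "
      f"({100 * overhead / payload:.0f}% overhead)")

# For comparison, a single 8 KB chunk costs only 8 framing bytes.
print(chunk_overhead(8192))
```

The cost per chunk is nearly constant, so the fewer and larger the chunks, the smaller the share of the response that is spent on framing.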

Control over this is very limited in ASP.NET. In fact, it seems to be Boolean. Either output buffering is enabled, and no content is sent to the client until everything is generated. Or output buffering is disabled, and every function call to write results in a chunk. There was no middle ground. I’m not an expert in IIS or ASP.NET and it’s possible I’m missing something, but Google queries returned no useful information about this.

In the end, I disabled buffered output and implemented my own output buffering. Now, using fout.Write() or fout.WriteLine() calls my class instead, which buffers data into 8 kilobyte chunks before sending the data to the client. Ideally, I should adjust the size of the buffer dynamically, based on how large I think the response is going to be. This is reasonable in WPO, since I know how many links will be in the affected URLs list, but isn’t always possible.
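The idea behind that buffering class can be sketched in a few lines of Python. This is not the actual WPO code (which is C# and is not shown in this post); the class names, the 8192-byte default, and the stand-in sink are all assumptions made for illustration.

```python
class CoalescingWriter:
    """Collects small writes and forwards them downstream in larger blocks."""

    def __init__(self, sink, block_size=8192):
        self._sink = sink              # anything with a write(bytes) method
        self._block_size = block_size
        self._pending = bytearray()

    def write(self, text):
        self._pending += text.encode("utf-8")
        # Forward full blocks only; leftovers wait for more data.
        while len(self._pending) >= self._block_size:
            self._sink.write(bytes(self._pending[:self._block_size]))
            del self._pending[:self._block_size]

    def close(self):
        # Send whatever is left as one final, shorter block.
        if self._pending:
            self._sink.write(bytes(self._pending))
            self._pending.clear()


class CountingSink:
    """Stand-in for the network: records the size of each block it receives."""

    def __init__(self):
        self.blocks = []

    def write(self, data):
        self.blocks.append(len(data))


sink = CountingSink()
out = CoalescingWriter(sink)
for i in range(3000):
    out.write("<li>")
    out.write('<a href="ShowResponse.aspx?item=%d">item</a>' % i)
    out.write("</li>\r\n")
out.close()
print(len(sink.blocks), "blocks sent for", 3000 * 3, "write calls")
```

With unbuffered output, each of those write calls would have become its own chunk. Here the client still starts receiving HTML after the first 8 kilobytes, but the number of chunks drops by orders of magnitude.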

Testing Your Website for “Chunky-ness”

This isn’t exclusively an ASP.NET problem. At the end of the day, the problem Digicure experienced which resulted in slow pages was that the response was sliced into too many chunks, each of which was very small. ASP.NET just exacerbated the problem by chunking every call to Write or WriteLine. This problem can exist in other frameworks too, whenever your application tier flushes the output excessively, resulting in a large number of small chunks.

How do you detect if a website has this problem? That’s a little trickier. As I said, most web traffic tools like proxies or plug-ins normalize the traffic before showing it to you. They uncompress it, or they de-chunk it, or they normalize HTTP headers. You can’t count the chunks if you can’t see them. I suggest using Wireshark. This allows you to capture all of the traffic as it is actually transmitted and received by a network interface on your computer. (Note that on Windows, Wireshark cannot capture traffic for localhost).

Here is how you can see if a webpage is “too chunky”:

  1. Load the page you want to test in your web browser.
  2. Select an interface in Wireshark and start a capture.
  3. Reload the page in the browser window.
  4. Stop the capture in Wireshark and enter “http” into the filter to exclude everything except web traffic.
  5. Find the HTTP request, right click, and select “Follow TCP Stream”, as shown below:

If you see a large number of chunks, or very small chunks, examine your application code to see if you are flushing the output stream excessively, such as inside of a loop.
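If eyeballing the stream is not enough, the chunks can be counted. The following Python sketch assumes you have the raw bytes of a chunked response body (headers already stripped, capture not truncated) and walks the chunk framing to collect each chunk's size. The function name and the sample body are made up for illustration.

```python
def chunk_sizes(raw: bytes):
    """Return the payload size of every chunk in a chunked-encoded body."""
    sizes = []
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # The size line is hexadecimal, optionally followed by ";extension" data.
        size = int(raw[pos:line_end].split(b";")[0], 16)
        if size == 0:                   # the zero-length chunk ends the body
            return sizes
        sizes.append(size)
        pos = line_end + 2 + size + 2   # skip size line, data, and trailing CRLF


# A tiny hand-made body with two chunks, followed by the terminating chunk.
body = b'4\r\n<li>\r\n9\r\n<a href="\r\n0\r\n\r\n'
sizes = chunk_sizes(body)
print(len(sizes), "chunks, average size", sum(sizes) / len(sizes), "bytes")
```

Thousands of chunks with an average size of a few bytes is the signature of the problem described in this post.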

Conclusions

Chunked encoding is awesome. It allows us to transmit variable length content and use persistent connections. However, each chunk has overhead, and you need to ensure your application isn’t too chunky.

I wish I could tell you that testing for “chunky-ness” is one of the 400 checks Zoompf can test your website for. Unfortunately that’s not currently the case. The HTTP library we wrote doesn’t expose chunk information, so we cannot detect this. Hopefully this is something we can change in the near future. Until then, use Wireshark to test your website, and Zoompf’s free or paid offerings to find other types of performance problems.

HTTP Compression use by Alexa Top 1000

Posted: May 2, 2012 at 6:01 pm

Yesterday, frontend madman and performance nut Paul Irish reached out to me asking if I had any stats on the use of HTTP compression.

I’ve written a bunch about the benefits of HTTP compression, as well as the challenges in implementing it. Surprisingly, I realized that, no, I did not have any figures about HTTP compression usage by major sites. The most recent stats I had were from the talk I gave during Velocity 2011’s Ignite sessions. The more I thought about it, I saw there were lots of interesting stats to gather beyond a raw “X number of sites use compression”. I decided to survey the top 1,000 Alexa websites and get a deeper understanding of how they are using HTTP compression. The results reveal how large websites manage their content and apply the most basic web performance optimization.

Methodology

How could I get data about how HTTP compression is used by the top websites? I could certainly pull from Zoompf’s database of free scans or our paid customers, but this is not a great sample. It contains lots of the same sites scanned over and over again, and all the scans are made by people who know about performance and are actively trying to improve it. It didn’t really matter anyway. Sadly, not all of the Alexa Top 1,000 websites are Zoompf customers (yet!), so I wouldn’t have the data.

My next thought was to use the awesome HTTP Archive. After all, they’ve done the hard work of fetching all the content and provide HAR files for each site. Sadly the HTTP Archive can’t help me here for a few reasons. The first reason is that HAR, while a helpful format, has shortcomings. The biggest is that the response bytes are not included in the HAR file. This is a critical requirement. I wanted to determine what types of content are or are not getting compressed. The MIME type in the Content-Type header is not reliable at all, so I would need the response bytes to determine content type.

Another reason for needing the full responses, including headers and body bytes, is the type of analysis I needed to do. Understanding how top websites use HTTP compression is much more involved than simply checking if, say, an HTML file is lacking a Content-Encoding header. For example, HTTP compression can make certain responses larger. To determine if a website is using compression properly, I need to determine if responses that were not served using HTTP compression would truly be smaller if they were compressed. This means that I needed the response bytes to compress and see the result.

In the end, there was no getting around the fact that I needed to download the actual content from Alexa’s top websites. So how do I do it?

I started by downloading Alexa’s Top 1,000,000 sites list. This list is downloadable as a CSV file, so I needed to strip out just the hostnames, and I only wanted the first 1,000. This was accomplished with the following Linux/Unix/Cygwin commands piped together:

$ cat top-1m.csv | cut -d"," -f2 | head -n 1000 > top-1000-sites.txt

This gave me a list of just the host names of the top 1,000 sites on Alexa’s list. Next, I used an internal tool we have at Zoompf for running our performance scanner against a list of hosts. For each host, the scanner visits the home page, and downloads all dependent resources like CSS, JavaScript, images, fonts, Flash and more. This simulates exactly what a visitor’s web browser would do when accessing the main page for the site. I ran the bulk scanner on the list of the Alexa top 1,000 sites to download all this content. This took about an hour and a half from my dev box. Then I analyzed the data. Basically, this involved opening the scan data for each scan, and examining each response. In total, I examined 90,517 responses served from 4,597 distinct hostnames.

First I needed to determine which responses could be compressible. This means I was looking for a response whose format is not natively compressed, and so it could be compressed with HTTP compression. As I discussed in the Lose the Wait: HTTP Compression post, this includes more than just text responses like HTML or CSS. Luckily we have built a pretty large database of common web file formats internally at Zoompf, which includes an attribute about whether the format is natively compressed. I simply examined the bytes, determined the file format, and did a quick check to see if it was natively compressed. I also took various measurements including how much content could be compressed, what its type was, and where it came from.
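Zoompf's format database is internal, so the following is only a rough Python sketch of the general idea: look at the leading bytes of a response and check them against well-known signatures of formats that are already compressed. The table below is a tiny illustrative subset and the function name is hypothetical.

```python
# Leading-byte signatures of a few formats that are already compressed.
NATIVELY_COMPRESSED = {
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"\x1f\x8b": "GZIP",
    b"PK\x03\x04": "ZIP",
    b"wOFF": "WOFF",
}


def natively_compressed_format(body: bytes):
    """Return the format name if the body starts with a known signature, else None."""
    for signature, name in NATIVELY_COMPRESSED.items():
        if body.startswith(signature):
            return name
    return None


print(natively_compressed_format(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16))   # PNG
print(natively_compressed_format(b"<svg xmlns='http://www.w3.org/2000/svg'/>"))  # None
```

Anything that does not match a natively compressed format is a candidate for HTTP compression, regardless of what the Content-Type header claims.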

Big Questions To Answer

I wanted to use all this data to answer some big questions. Specifically, I wanted to know:

  • How much of the content was compressible, and how much was actually getting compressed?
  • What types of files get compressed more or less often than others? For example, are most people compressing HTML but forgetting CSS?
  • How are websites doing as a whole? Are some websites applying compression perfectly and some completely failing? Or is it that all websites are overlooking one or two little things?

Ultimately, I wanted to use these answers to try and determine broader answers about how performance optimization is implemented at big companies and why.

The Findings

Pie Chart of compressible content

Of the 90,517 responses I examined, only 14,316 responses (15.81%) were compressible. This is an interesting stat, because it goes to show how much of the web is dominated by binary content like images. This is why I’m a big proponent of image optimization, and it’s nice to see the topic of image optimization on the radar of more mainstream tech bloggers like Daring Fireball’s John Gruber and Jeremy Keith. As I said in my Take it all off: Lossy Image Optimizations talk at Velocity 2011, a 20% reduction in image size has more of an effect on total content size than an 80% reduction of text content.

Let’s dig into those compressible responses. There are 3 categories that a compressible response can fit in:

  1. Properly Compressed Responses – A response that could be HTTP compressed, and was in fact served to Zoompf using HTTP compression. For example, a CSS file which is served with HTTP compression. This is a good category, since the content owner is optimally serving the content.
  2. Properly Uncompressed Responses – A response which is not natively compressed, but compressing it makes the response larger. For example, a small HTML file which is actually larger when served compressed. This is a good category, since the content owner is optimally serving the content.
  3. Responses Missing Compression – A response which is not natively compressed, which should be compressed, and which will be smaller if HTTP compression is used, but which is not compressed. For example, an SVG image served without compression, since SVG images are not natively compressed. This is a bad category, because the content owner is inefficiently delivering content to the client.
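The three categories above can be expressed as a short Python sketch. It assumes the response has already been identified as a format that is not natively compressed, that `body` holds the decoded bytes, and that gzip at its default level is a fair stand-in for what a server would do. The function name and category strings are illustrative, not the scanner's actual code.

```python
import gzip


def classify(body: bytes, content_encoding: str = "") -> str:
    """Place a compressible response into one of the three categories."""
    if content_encoding.strip().lower() in ("gzip", "deflate"):
        return "properly compressed"
    # Not served compressed: would compression actually have helped?
    if len(gzip.compress(body)) >= len(body):
        return "properly uncompressed"
    return "missing compression"


print(classify(b"<li>item</li>\r\n" * 500, "gzip"))   # served with compression
print(classify(b"ok"))                                 # too small to benefit
print(classify(b"<li>item</li>\r\n" * 500))            # should have been compressed
```

The second case is why a missing Content-Encoding header alone proves nothing: a tiny body grows once the gzip header and trailer are added.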

You can see the breakdown of responses in the graph and table below:

how compressible content is served
Type                     # of Responses
Properly Compressed      8825
Properly Uncompressed    2144
Missing Compression      3347

So, across all Alexa Top 1,000 websites, 23.37% of compressible content is not getting compressed. The median savings if compression was used would be 4.4 kilobytes, which is a median savings of 62.3%. That’s quite a bit of savings, considering HTTP compression has been a well known, supported, and recommended optimization for over 15 years.

How are sites themselves doing? Of the Alexa top 1,000 websites, 642 of them are serving at least 1 item that is missing HTTP compression. In other words, 64% of the Alexa Top 1,000 are not properly applying HTTP compression.

The median number of responses missing HTTP compression per website was 2, so it’s not like the occasional, single response is getting through without compression. This is a larger issue.

To try and understand why content is slipping through, we first need to know which types of content aren’t being compressed. The table below shows this.

File Type       Total Responses   Missing Compression   % Missing Compression
JavaScript      5469              1161                  21.23%
HTML            4090              857                   20.95%
CSS             2849              495                   17.37%
ICO             541               377                   69.69%
Generic Text    451               38                    8.43%
RSS             427               156                   36.53%
EOT             180               87                    48.33%
SVG             130               100                   76.92%
Atom            60                28                    46.67%
TTF             42                25                    59.52%
BMP             17                12                    70.59%
OTF             10                10                    100.00%
Generic Bin     7                 1                     14.29%

(Generic Binary and Generic Text are responses whose format Zoompf could not determine. This typically indicates an incorrect MIME type, and we could not conclusively determine a type by examining the first 500 characters or so. Manually spot checking revealed that most of these responses were JavaScript files served with an incorrect MIME type and which had a mix of HTML tags in them which confused our scanner. For purposes of analysis, I will ignore these and not count them as any other file type.)

This table tells us lots of interesting things about how content is compressed by the Alexa Top 1,000:

  1. Approximately 20% of all HTML, JavaScript and CSS files are served without compression. Major websites are having real problems compressing even the most basic and common types of compressible content.
  2. JavaScript is the single largest source of compressible content, yet it is served compressed less often than CSS or HTML. I believe this is due to the widespread use of 3rd party libraries and widgets which are served from a website you don’t control. While people can configure their own sites to compress content, 3rd parties serving their JavaScript files appear to be using compression less often than other sites. Based on their URLs, the majority of JavaScript resources missing compression appear to be for analytics scripts.
  3. Lesser known compressible file types are forgotten far more often than HTML, JavaScript, or CSS. While there are fewer instances of these formats on the web, they are more than twice as likely to be uncompressed.
  4. Atom feeds are not very popular.
  5. Someone is using BMP images on an Alexa Top Website? Wow. At least some of them are being served with compression.
  6. SVG images are present largely as fallback for web fonts.

Another interesting area is 404s. 404s are often overlooked because, while the request might be for a file like logo.png, the response is HTML. If your server is not configured properly, it will see the file extension .png and not apply compression, even though the response is compressible text. Of the 1,513 responses which had a 404 status code, 490 of them were not compressed. In other words, 32.4% of all 404 handlers in the Alexa Top 1,000 are not using HTTP compression properly.
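The 404 case comes down to which piece of information drives the decision. Here is a hedged Python sketch of the safer approach: decide from the Content-Type of the response actually being sent, and ignore the extension in the requested URL. The set of media types is illustrative rather than complete, and the function is not taken from any particular web server.

```python
# Illustrative, not exhaustive: media types for formats that are not natively compressed.
COMPRESSIBLE_TYPES = {
    "text/html", "text/css", "text/plain", "text/xml",
    "application/javascript", "application/json", "application/xml",
    "application/rss+xml", "application/atom+xml",
    "image/svg+xml", "image/x-icon", "image/bmp",
}


def should_compress(request_path: str, content_type: str) -> bool:
    """Decide from the response's Content-Type; request_path is ignored on purpose."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in COMPRESSIBLE_TYPES


# A 404 for /logo.png is an HTML page, so it should be compressed.
print(should_compress("/logo.png", "text/html; charset=utf-8"))
# A real PNG served from the same path should not be.
print(should_compress("/logo.png", "image/png"))
```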

As I wrote about before, the use of Content-Encoding: deflate is incredibly problematic and broken. Luckily I did not see it in wide use in the Alexa Top 1,000. Of the 8,825 responses I saw which were using HTTP compression, only 23 were using DEFLATE. What is troubling is that, based on the HTTP response headers, virtually all of these Content-Encoding: deflate responses came from a Juniper Networks DX series load balancer or web accelerator. A company like Juniper should know better.

See For Yourself

I’m a big believer in transparency, so here are my raw data files so you can review them yourself. In fact, if you work at an Alexa Top 1,000 site, there is even a text file full of exactly which URLs are missing compression!

What Does This All Mean?

This data supports much of what I discussed in the Lose The Wait: HTTP Compression post. Specifically:

  • HTTP compression, though easy in theory, is not properly implemented in practice. The majority of Alexa Top 1,000 websites are not completely implementing HTTP compression.
  • The most commonly compressed content (HTML, CSS, and JavaScript) is not properly compressed 20% of the time. This is most likely due to web servers being incorrectly configured to apply HTTP compression based on incorrect or missing file extensions, and incorrect or missing MIME types.
  • Less common text formats, like RSS and XML, are more than twice as likely to be served uncompressed. People are forgetting about these files, and common configuration examples on the web exclude them.
  • Non-natively compressed file formats, such as ICO, SVG, and various font files, are more than twice as likely to be served without HTTP compression. People are forgetting about these files, and common configuration examples on the web exclude them.
  • Nearly 1/3 of all 404 handlers do not use HTTP compression. This figure is nearly 50% higher than the 20% of regular, non-404 HTML files which are served without HTTP compression. This is most likely caused by web servers configured to use the requested URL’s file extension to decide if HTTP compression should be used for the response.

This data also reinforces a great quote from Mike Belshe, one of the creators of SPDY, about optional features:

“Experience shows that if we make features optional, we lose them altogether due to implementations that don’t implement them, bugs in implementations, and bugs in the design.” – Mike Belshe

Compression was an optional afterthought for HTTP, and so 20 years later we still have problems using HTTP compression appropriately.

The Million Dollar Question

What was startling when I really dug into the numbers was who seemed to be having the most trouble, and how widespread it was. There are a number of very large websites with obviously capable staffs that were not compressing the majority of their content. For example, major news sites like The Washington Post, ABC News, The New York Post, CNBC, Sky in the UK, and NPR all served 80%-90% of compressible resources without compression. Even the Pirate Bay has 52 RSS feeds referenced on its main page using <link> tags, and none of them are compressed. The Japanese social networking site Ameblo has an RSS feed of user activity that’s over 4 megabytes.

So why is this happening? The guys who run The Pirate Bay are technical geniuses. The IT departments and budgets for ABC or The Washington Post are huge. How is it that the biggest and most popular websites in the world, who have the most to gain and the most to lose from web performance, and who are the best equipped, staffed, and funded to solve these problems, can’t seem to solve these problems?

This is the million dollar question. And it is literally the million dollar question, because it’s costing these websites millions of dollars.

The short answer is “They don’t know they have a problem.” I know it’s true because whenever I tell one of these websites about the problem, their immediate answer is “we aren’t? Crap we should. Let’s fix that”. And then they do.

Frankly, this reaction opens up an entirely different and very scary box. Because “They don’t know they have a problem” is just another way of saying “testing for front-end performance issues is very immature even at the largest organizations.” That is an incredibly huge problem for our industry with many different facets. It’s too big of a topic to stick at the tail end of this post, so I’ll put all my thoughts on this in a future blog post.

More Testing

More testing is clearly needed. The results for font files are new and interesting to me. While I know fonts like WOFF are natively compressed, it is interesting that OTF and others can be compressed using HTTP compression. This indicates either no native compression, or a compression scheme so poor that a second pass of DEFLATE makes the file smaller. Also, the use of HTTP compression should be something that is charted over time, to see if we are improving. My last figure, from 2010, showed 78% of Alexa Top 1000 sites having at least 1 resource served without HTTP compression. Perhaps this should be added to the HTTP Archive.

If you want to find out whether your site is properly applying HTTP compression, Zoompf offers a free performance scan of your website. HTTP compression, and the various implementation issues surrounding it, are just a few of the nearly 400 performance issues Zoompf detects when testing your web applications. You can also take a look at our Zoompf WPO product.

Lose the Wait: HTTP Compression

Posted: February 10, 2012 at 4:25 pm

One of the ways you can improve website performance is to reduce the amount of data that needs to get delivered to the client. An easy way to reduce the amount of data sent to a client is to compress the content and then transfer it to the client. This can be done with HTTP compression. Despite being a surprisingly simple feature of HTTP, there are numerous challenges which must be addressed to properly use HTTP compression. These challenges are:

  1. Ensuring you are only compressing compressible content.
  2. Ensuring you are not wasting resources trying to compress uncompressible content.
  3. Selecting the correct compression scheme for your visitors.
  4. Configuring the web server properly so compressed content is sent to capable clients.

In this post, part of our Lose the Wait performance series, I will discuss each of these issues and demonstrate how to configure your web server to implement HTTP compression properly.

Compressing Compressible Things

Let’s start out easy. What should HTTP compression get applied to? The answer is simple: Any content which is not already natively compressed.

Notice I didn’t say "text resources." Text resources, like HTML, CSS, and JavaScript certainly should be compressed because they are not natively compressed file formats. Unfortunately, most people seem to focus on these 3 types of files. In fact, a quick web search shows that most of the top results for ".htaccess compress" include instructions only on compressing HTML, CSS, and JavaScript files. This just reinforces what I’ve said before; you have to be careful where your advice comes from.

Here is a list of common text resource types on the web which should be served with HTTP compression:

  • XML. XML is structured text used in standalone files (like Flash’s crossdomain.xml or Google’s sitemap.xml) or as a data format wrapper for API calls.
  • JSON. JSON is a subset of JavaScript used as a data format wrapper for API calls.
  • News feeds. Both RSS and Atom feeds are XML documents.
  • HTML Components (HTC). HTC files are a proprietary Internet Explorer feature which package markup, style, and code information used for CSS behaviors. HTC files are often used by polyfills such as Pie or iepngfix.htc to fix various problems with IE or to back port modern functionality.
  • Plain Text. Plain text files can come in many forms, from README and LICENSE files, to Markdown files. All should be compressed.
  • Robots.txt. Robots.txt is a specific text file used to tell search engines what parts of the website to crawl. Robots.txt is often forgotten since it is not usually accessed by humans and does not appear in JavaScript-based web analytics logs. Since robots.txt is repeatedly accessed by search engine crawlers and can be quite large, it can consume large amounts of bandwidth without your knowledge.

ICO

As I said, HTTP compression isn’t just for text resources and should be applied to all non-natively compressed file formats. What do I mean by this?

As an example, let’s look at ICO files. ICO files are an image format originally used for icon images on Windows. The format, as it is in use today, was created over 20 years ago for Windows 3.0. Today, ICO files are used on the web as Favicons for a website, usually displayed in the address bar or browser tab. While modern browsers allow other file formats besides ICO, support is not universal. Many sites continue to use ICO files as Favicons for compatibility reasons.

Despite being an image, ICO files are not natively compressed. ICO images are actually a primitive version of a BMP image. Neither ICO nor BMP image formats are natively compressed. While you can (and should) avoid using BMP images on your website, you can’t do this with ICO files. Be sure to configure your web server to serve ICO images with HTTP compression.

SVG

SVG images are another example of an image format which is not natively compressed. SVG images are just XML documents, but they have a different MIME type and file extension. This means that, while someone might remember to compress XML documents, they forget to compress SVG documents.

You might be using SVG images on your website and not even know it. This is because of a feature of SVG images, SVG fonts, which allows SVG files to contain font glyphs used to render text. These SVG image-that-is-really-a-font files can be referenced in CSS using the @font-face syntax much like an OTF or WOFF font file. Divya Manian has written a comprehensive post about the pros and cons of SVG fonts. For the purposes of this discussion the main take-away from her post is that, until iOS 5, SVG fonts were the only type of custom font supported by iPhone, iPad, and iPod Touch.

Font support is, to put it nicely, a giant mess. Font libraries abstract this away from the web developer and serve the correct format, including SVG fonts, to the correct browser. This means your website can be using SVG without you even knowing it. Remember to serve your SVG files using HTTP compression.

Compressing already compressed content

Another mistake developers make with HTTP compression is using it on content that is already natively compressed. Applying compression to something that is already compressed doesn’t help improve performance. In fact, it can hurt performance in two ways.

First, HTTP compression has a cost. The web server has to take the content, compress it, and then send it to the client. If the content cannot be compressed further, you are just wasting CPU doing a meaningless task.

Secondly, applying HTTP compression to something that’s already compressed doesn’t make it smaller. In fact, the overhead of adding headers, compression dictionaries, and checksums to the response body actually makes it bigger, as shown in the figure below:

Do websites actually do this? Yes, and it’s more common than you would think. I used Zoompf WPO to examine Fox News. Fox News is the 40th most visited website in the United States. As you can see, Fox News is mistakenly applying HTTP compression to PNG images.

This not only wastes CPU, but also increases the size of the PNG images delivered to Fox News visitors by a few dozen bytes:

Zoompf actually has two different checks for this issue. The first check, "Compressed Content served with HTTP compression", alerts you that you are wasting CPU time compressing something that is already compressed. The second check, "Bigger with HTTP Compression", identifies content that is actually larger when served using HTTP compression.

Both of these problems usually are the result of a configuration problem with the web server or an inline network device. Something in your environment is applying HTTP compression to all outbound content instead of only content that should be compressed.
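The size penalty described above is easy to reproduce. The Python sketch below uses pseudo-random bytes as a stand-in for natively compressed content such as PNG or JPEG data (that substitution is my assumption; it is not a real image), then applies gzip on top and reports the difference.

```python
import gzip
import random

# Stand-in for natively compressed content such as PNG or JPEG data:
# pseudo-random bytes have no redundancy left for DEFLATE to exploit.
rng = random.Random(2012)  # fixed seed so the run is repeatable
already_compressed = bytes(rng.getrandbits(8) for _ in range(20_000))

# Apply HTTP-style gzip compression on top anyway.
recompressed = gzip.compress(already_compressed)
overhead = len(recompressed) - len(already_compressed)
print(
    f"before: {len(already_compressed)} bytes, "
    f"after gzip: {len(recompressed)} bytes ({overhead:+d} bytes)"
)
```

The CPU time spent producing the larger response is the other half of the cost, and it does not show up in the byte counts.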

GZIP Vs. DEFLATE

So far, we have talked about HTTP compression as if it is an opaque or atomic feature. But that is not the case. HTTP simply defines a mechanism for a web client and web server to agree on a compression scheme that can be used to transmit content. This is accomplished using the Accept-Encoding and Content-Encoding headers. There are two commonly used HTTP compression schemes on the web today: DEFLATE and GZIP.

DEFLATE is a patent-free compression algorithm for lossless data compression. There are numerous open source implementations of the algorithm. The standard implementation library most people use is zlib. The zlib library provides functions for compressing and decompressing data using DEFLATE/INFLATE. The zlib library also provides a data format, confusingly named zlib, which wraps DEFLATE compressed data with a header and a checksum.

GZIP is another compression library which compresses data using DEFLATE. In fact, most implementations of GZIP actually use the zlib library internally to conduct DEFLATE/INFLATE compression operations. GZIP produces its own data format, confusingly named GZIP, which wraps DEFLATE compressed data with a header and a checksum.

Unfortunately, the HTTP/1.1 RFC does a poor job when describing the allowable compression schemes for the Accept-Encoding and Content-Encoding headers. It defines Content-Encoding: gzip to mean that the response body is composed of the GZIP data format (GZIP headers, deflated data, and a checksum). It also defines Content-Encoding: deflate but, despite its name, this does not mean the response body is a raw block of DEFLATE compressed data. According to RFC-2616, Content-Encoding: deflate means the response body is:

[the] "zlib" format defined in RFC 1950 [31] in combination with the "deflate" compression mechanism described in RFC 1951 [29].

So, DEFLATE, and Content-Encoding: deflate, actually means the response body is composed of the zlib format (zlib header, deflated data, and a checksum).
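The three formats in play can be produced with Python's zlib module, which selects the wrapper through its wbits parameter. This is only a sketch to make the distinction concrete; the sample payload and the deflate helper name are mine.

```python
import zlib

data = b"<html><body>" + b"<p>Hello, HTTP compression!</p>" * 50 + b"</body></html>"

def deflate(payload, wbits):
    """Compress payload with DEFLATE; wbits selects the container format."""
    compressor = zlib.compressobj(level=9, wbits=wbits)
    return compressor.compress(payload) + compressor.flush()

raw_stream = deflate(data, -zlib.MAX_WBITS)       # RFC 1951: bare DEFLATE, no wrapper
zlib_stream = deflate(data, zlib.MAX_WBITS)       # RFC 1950: what "Content-Encoding: deflate" means
gzip_stream = deflate(data, zlib.MAX_WBITS | 16)  # RFC 1952: what "Content-Encoding: gzip" means

print("zlib header bytes:", zlib_stream[:2].hex())
print("gzip magic bytes: ", gzip_stream[:2].hex())
print("sizes (raw, zlib, gzip):", len(raw_stream), len(zlib_stream), len(gzip_stream))
```

All three carry DEFLATE compressed data; only the header and trailer around it differ, which is exactly why a receiver that guesses the wrong wrapper fails to decode the body.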

This "deflate the identifier doesn’t mean raw DEFLATE compressed data" idea was rather confusing. Early versions of Microsoft’s IIS web server were programmed to return raw DEFLATE compressed data for Accept-Encoding: deflate requests instead of a zlib formatted response. And naturally versions of Internet Explorer at the time expected responses with a Content-Encoding: deflate header to have raw DEFLATE response bodies.

As Mark Adler, one of the authors of zlib, explains in this Stack Overflow thread:

However early Microsoft servers would incorrectly deliver raw deflate for "Deflate" (i.e. just RFC 1951 data without the zlib RFC 1950 wrapper). This caused problems, browsers had to try it both ways, and in the end it was simply more reliable to only use GZIP.

As Mark says, browsers receiving Content-Encoding: deflate had to handle two possible situations: the response body is raw DEFLATE data, or the response body is zlib wrapped DEFLATE. So, how well do modern browsers handle raw DEFLATE or zlib wrapped DEFLATE responses? Verve Studios put together a test suite and tested a huge number of browsers. The results are not good.

All those fractional results in the table mean the browser handled raw-DEFLATE or zlib-wrapped-DEFLATE inconsistently, which is really another way of saying "It’s broken and doesn’t work reliably." This seems to be a tricky bug that browser creators keep re-introducing into their products. Safari 5.0.2? No problem. Safari 5.0.3? Complete failure. Safari 5.0.4? No problem. Safari 5.0.5? Inconsistent and broken.

Sending raw DEFLATE data is just not a good idea. As Mark says "[it's] simply more reliable to only use GZIP."
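To make the "try it both ways" burden concrete, here is a rough Python sketch of what a tolerant client has to do with a Content-Encoding: deflate body. The function name is hypothetical, and real browsers implement this in their own network stacks, not like this.

```python
import zlib

def decode_deflate_body(body):
    """Decode a 'Content-Encoding: deflate' body that may or may not be zlib-wrapped."""
    try:
        # What the RFC specifies: DEFLATE data inside the zlib (RFC 1950) wrapper.
        return zlib.decompress(body)
    except zlib.error:
        # What early IIS versions sent: raw DEFLATE (RFC 1951) with no wrapper.
        return zlib.decompress(body, wbits=-zlib.MAX_WBITS)

page = b"<p>same page, two encodings</p>" * 20

wrapped = zlib.compress(page)  # zlib-wrapped DEFLATE
raw_compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
raw = raw_compressor.compress(page) + raw_compressor.flush()  # raw DEFLATE

print(decode_deflate_body(wrapped) == page, decode_deflate_body(raw) == page)
```

Guess-and-retry logic like this is fragile, which fits with the inconsistent browser results described above. With gzip there is nothing to guess.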

It should also be noted that all browsers that support DEFLATE also support GZIP, but not all browsers that support GZIP also support DEFLATE. Some browsers, such as Android, don’t include deflate in their Accept-Encoding request header. Since you are going to have to configure your web server to use GZIP anyway, you might as well avoid the whole mess with Content-Encoding: deflate.

Luckily, avoiding DEFLATE isn’t all that difficult.

The Apache module which handles all HTTP compression is mod_deflate. Despite its name, mod_deflate does not support deflate at all. It’s impossible to get a stock version of Apache 2 to send either raw DEFLATE or zlib wrapped DEFLATE. Nginx, like Apache, does not support deflate at all. It will only send GZIP compressed responses. Sending an Accept-Encoding: deflate request header will result in an uncompressed response.
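If you are writing your own application-level compression, the same GZIP-only policy is simple to express. The following Python sketch is not how Apache or Nginx implement it; the function name is hypothetical, and the parsing is deliberately simplified (it ignores the * wildcard, for example).

```python
def choose_content_encoding(accept_encoding):
    """Return "gzip" if the client accepts it, otherwise None (send uncompressed).

    A GZIP-only policy: "deflate" is never selected, even when it is offered.
    """
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        quality = 1.0  # default when no q-value is given
        name, _, value = params.strip().partition("=")
        if name.strip().lower() == "q":
            try:
                quality = float(value)
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not acceptable"
        return "gzip" if quality > 0 else None
    return None

for header in ("gzip, deflate", "deflate", "gzip;q=0, deflate", ""):
    print(repr(header), "->", choose_content_encoding(header))
```

A client that only offers deflate simply gets an uncompressed response, matching the behavior described above.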

Microsoft’s IIS web server can send both gzip and deflate responses and you can enable or disable each scheme individually. For IIS6, you can edit the metabase to disable DEFLATE support. For IIS7, you can disable DEFLATE support by editing the DEFLATE compression scheme section in the <schemes> element of the <httpCompression> element of the various IIS7 .config files.

Both Zoompf’s free and commercial products have a check built-in, “Obsolete Compression Format”, which will detect if your web server is sending content compressed with DEFLATE.

Netscape 4 and Internet Explorer 6 Are Screwing You. Again.

So by now you should have your web server configured to:

  1. Properly compress what needs to be compressed.
  2. Avoid compressing already compressed content.
  3. Only use GZIP.

Now you need to ensure that your configuration is not actually excluding perfectly capable browsers.

While HTTP compression is a mature feature today, there were some problems early on. Netscape 4 only supported HTTP compression for HTML documents even though it sent an Accept-Encoding: deflate, gzip for all requests. Serving it HTTP compressed CSS or JS documents would make it crash. For reasons that aren’t quite clear, the developers of Apache decided to address this client-side bug with a server-side fix. They added the following seemingly harmless line into the Apache configuration file:

BrowserMatch ^Mozilla/4 gzip-only-text/html

Any browser calling itself Mozilla/4 would only receive HTTP compressed HTML files. Since Apache was and is the most popular web server on the Internet, this caused enormous problems which still affect us today.

First of all, this was the middle of the browser wars and Internet Explorer 4, Internet Explorer 5 and even Internet Explorer 6 all identified themselves as Mozilla/4 in their User-Agent strings. But these browsers could accept HTTP compression for non-HTML responses. Trying to patch around one buggy browser caused another to be slow! Since IE6 would ultimately achieve over 95% market share, it was a problem that IE6 would download webpages more slowly from Apache than from other web servers. To resolve this, the Apache developers were forced to add another configuration directive:

BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html

This line means: if the User-Agent has MSIE in it, then turn off the no-gzip and gzip-only-text/html options, thereby instructing Apache to use HTTP compression for all responses if IE asked for it. And all was good, until it wasn’t.

You see, IE6 on Windows XP also had multiple problems with HTTP compression. Most of these issues dealt with compressed CSS or JavaScript files being cached as compressed items, which were then read from the cache as if they were not HTTP compressed. So again another Mozilla/4 browser had problems with compression, and so again the Apache developers had to "fix" the issue with another configuration directive:

BrowserMatch \bMSIE\s6 gzip-only-text/html

This directive instructed the web server to only send compressed content for HTML responses if the browser was IE6. While this helped deal with the majority of the issues, some of these bugs caused so many extreme edge-case problems that, for reliability reasons, larger sites would disable HTTP compression for IE6 entirely:

BrowserMatch \bMSIE\s6 no-gzip

Eventually Microsoft fixed these issues with hot fixes and, comprehensively, with Windows XP Service Pack 2. But this created a fragmentation problem, where some IE6 browsers could handle HTTP compression for all content, and some could not. Another rule was added in an attempt to serve compressed content to IE6 browsers that had SP2 installed. This was done by looking for the poorly named SV1 identifier in IE6’s User-Agent string:

BrowserMatch "^Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1; SV1" !no-gzip !gzip-only-text/html

This chain of "deny this, but not this, unless it’s this, but not if it is also this" directives made configuring a web server to properly serve compressed documents to the appropriate browsers difficult and prone to error. Since these bug/solution cycles happened numerous times over several years, these configuration directives mutated. Blog posts from 2004 would tell you to do one thing and blog posts from 2006 would say another. Much like a child’s game of telephone, shortcomings, errors, missing edge cases, and missing corner cases were magnified as people reused old configuration files and shared the "correct" advice. Even today, many of the top Google search results for configuring HTTP compression for Apache using mod_deflate contain different and incorrect directives.
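To see how these rules interact, it helps to model them. The Python sketch below applies a list of rules top to bottom, with later rules overriding earlier ones, which mirrors how a series of BrowserMatch directives sets and unsets environment variables. It is a toy model, not Apache's code: the function names and sample User-Agent strings are mine, and the SV1 pattern is a simplified stand-in rather than the directive quoted above.

```python
import re

# Each rule: (User-Agent regex, variables to set, variables to unset).
# Like BrowserMatch directives, rules run top to bottom and later rules win.
RULES = [
    (r"^Mozilla/4", {"gzip-only-text/html"}, set()),
    (r"\bMSI[E]", set(), {"no-gzip", "gzip-only-text/html"}),
    (r"\bMSIE\s6", {"gzip-only-text/html"}, set()),
    # Simplified stand-in for the SV1 rule (an assumption, not the original regex).
    (r"\bMSIE\s6.*\bSV1\b", set(), {"no-gzip", "gzip-only-text/html"}),
]

def compression_flags(user_agent):
    """Return the set of compression-related variables left after all rules ran."""
    flags = set()
    for pattern, to_set, to_unset in RULES:
        if re.search(pattern, user_agent):
            flags |= to_set
            flags -= to_unset
    return flags

def may_compress(user_agent, mime_type):
    """Decide whether a response of mime_type would be compressed for this browser."""
    flags = compression_flags(user_agent)
    if "no-gzip" in flags:
        return False
    if "gzip-only-text/html" in flags:
        return mime_type == "text/html"
    return True

NETSCAPE_4 = "Mozilla/4.08 [en] (WinNT; U)"
IE6 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
IE6_SP2 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
IE8 = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0)"

for name, ua in [("Netscape 4", NETSCAPE_4), ("IE6", IE6), ("IE6 SP2", IE6_SP2), ("IE8", IE8)]:
    print(f"{name}: compress CSS? {may_compress(ua, 'text/css')}")
```

Delete or reorder any one rule in the list and a different set of browsers silently loses compression, which is exactly the failure mode the copied-and-pasted configurations suffer from.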

As I wrote in Advice on Trusting Advice it all comes down to where you get your advice from. Follow the advice on this top search result and IE9+ gets no compression at all. Follow the advice on this top search result and IE6 gets no compression at all. Follow the advice from this search result and no version of IE will get anything using HTTP compression, except for IE7. Follow advice from IBM, no version of IE will ever get a non-HTML file using HTTP compression.

Depending on which directives were used, and how the match criteria were configured, you ended up with several possible scenarios:

  • HTTP compression is completely disabled for all Mozilla/4 browsers.
  • HTTP compression is completely disabled for IE6
  • HTTP compression is completely disabled for IE6 except SV1
  • HTTP compression is completely disabled for all versions of IE
  • HTTP compression is completely disabled for all versions of IE except IE6 (so no compression for IE > 6)
  • HTTP compression for non-HTML files is disabled for all Mozilla/4 browsers.
  • HTTP compression for non-HTML files is disabled for IE6
  • HTTP compression for non-HTML files is disabled for IE6 except SV1
  • HTTP compression for non-HTML files is disabled for all versions of IE
  • HTTP compression for non-HTML files is disabled for all versions of IE except IE6 (so no compression for IE > 6)

Apache makes it quite easy to mess this up. Nginx is much easier. It completely ignores the old Netscape 4 browsers and does not attempt to work around them. It also has a very simple mechanism to avoid sending compressed content to bad versions of IE6. You don’t need to manually define "this is good" and "this is bad" regexes, which helps you avoid making a mistake.

In practice, you should not even try to work around these problematic browsers. The problem browsers have all been updated or patched. Even the most recent of the affected browsers, IE6, was fixed nearly a decade ago. Even on platforms that are no longer supported, this issue has been fixed. You should review your configuration file and remove any browser filtering code used for HTTP compression.

Hopefully this section has also taught you that fixing a client-side bug with a server-side fix is rarely a good or sustainable idea. As I discussed in The Big Performance Improvement in IE9 No One is Talking About, this approach of using the User-Agent as a factor in content generation forced the widespread use of the Vary: User-Agent header. The Vary header used in this manner effectively nullifies shared caching, which reduces the overall performance of the web.

Extension Vs. MIME Type

It is important to review how your web server is configured to compress content. Most web servers allow you to specify either a list of file extensions to compress, or a list of MIME types to compress, or both. Be careful to review this list.

Let’s say you have configured your application to serve text/javascript responses using compression. Are you sure that’s the only MIME type your application uses when serving JavaScript files? What about text/x-javascript or application/x-javascript or application/javascript? What MIME type does your API serve for JSON responses? text/json? application/json? Something else? How about HTML? Are all of your HTML files using text/html? Do you have some sections from the XHTML days which use other MIME types like application/xhtml+xml or text/xhtml or application/xhtml? Is all of the markup generated by your application served using a single and consistent MIME type? And let’s not forget about the code you didn’t write. What MIME type does that opaque charting library use to send data to the client? Or that auto-completing textbox widget you got from GitHub?

If you are configuring the web server to use compression using file extensions, did you get all of them? .htm or .html or is it something else? What about your 404 handler? A request happens for the non-existent file /foo/bar.jpg. Since the file extension is not explicitly defined as something that should be compressed (or, being an image, is explicitly defined not to be compressed), the 404 response isn’t sent with compression.

Care must be taken when configuring your web server to ensure that uncompressed content is not slipping through due to a missing file extension or MIME type declaration.
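One low-tech way to catch these gaps is to compare the MIME types your application actually emits against the list your server is configured to compress. The Python sketch below does that with a deliberately incomplete configuration; both lists are invented for illustration, so substitute the values from your own server config and access logs.

```python
# A deliberately incomplete list, as might be found in a hand-written server config.
COMPRESSIBLE_TYPES = {"text/html", "text/css", "text/javascript"}

def is_configured_for_compression(content_type):
    """Check a Content-Type header value against the configured MIME types."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    return mime_type in COMPRESSIBLE_TYPES

# Content types an application (or its third-party libraries) might actually emit.
observed = [
    "text/html; charset=utf-8",
    "text/javascript",
    "application/javascript",
    "application/x-javascript",
    "application/json",
    "image/svg+xml",
]

slipping_through = [t for t in observed if not is_configured_for_compression(t)]
print("served without compression:", slipping_through)
```

Anything in the resulting list is compressible text that the configuration would quietly send uncompressed.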

Properly Configuring HTTP Compression

So, given all these challenges, how should you go about configuring HTTP compression properly?

To see where you might have made a mistake configuring your server, you need something to compare it to. I am a big fan of the .htaccess file from the HTML5 Boilerplate Project. This is an Apache configuration file specifically crafted for web performance optimizations. It provides a great starting point for implementing HTTP compression properly. It also serves as a nice guide to compare to an existing web server configuration to verify you are following best practices. At the very least, the HTML5 Boilerplate .htaccess file provides a comprehensive list of common web content which should or should not get served using HTTP compression.

Getting a good starting point is only half the battle. The configuration for HTTP compression on a web server only works when it matches the application running on that server. Even the HTML5 Boilerplate configuration file can fail you if there is a discrepancy between the file extensions and MIME types in the configuration file and those used by your application. It’s easy to forget or overlook a MIME type or a file extension that your application uses. To ensure your application matches your configuration, the best thing to do is carefully review:

  1. How is your web server configured to map MIME types to content or file extensions?
  2. How is your web server configured to compress content relative to those MIME types or extensions?
  3. How are your application’s filenames and extensions structured?
  4. How does your application change or override a response’s MIME type?
  5. What third party libraries use MIME types?

Once you think you have properly configured the web server, you need to validate it. Web Sniffer is a great, free, web-based tool that lets you make individual HTTP requests and see the responses. Web Sniffer gives you some control over the User-Agent and Accept-Encoding headers to ensure that compressed content is delivered properly. Hurl is another web-based HTTP tool you can use. It allows for more control than Web Sniffer, but requires you to manually enter more information to get the same results:

Hurl and Web Sniffer only test a single page at a time. You can use Zoompf’s free scan or Zoompf WPO to scan multiple pages and verify no uncompressed content is slipping through.
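If you prefer to script a quick spot check yourself, the same idea fits in a few lines of Python using only the standard library's urllib. This is a rough sketch, not a replacement for the tools above: the function names and the default User-Agent value are hypothetical, and check_url needs network access to a URL you supply.

```python
import urllib.request

def build_probe(url, user_agent="Mozilla/5.0 (compression-check)"):
    """Build a GET request that advertises gzip support only."""
    return urllib.request.Request(
        url, headers={"Accept-Encoding": "gzip", "User-Agent": user_agent}
    )

def describe_encoding(response_headers):
    """Summarize how a response was encoded, given a mapping of its headers."""
    headers = {name.lower(): value for name, value in response_headers.items()}
    encoding = headers.get("content-encoding", "").strip().lower()
    if encoding == "gzip":
        return "compressed with gzip"
    if encoding == "deflate":
        return "compressed with deflate (consider switching to gzip)"
    if encoding in ("", "identity"):
        return "NOT compressed"
    return f"unexpected Content-Encoding: {encoding}"

def check_url(url):
    """Fetch url over the network and report whether the body was compressed."""
    with urllib.request.urlopen(build_probe(url), timeout=10) as response:
        return describe_encoding(response.headers)

# Example (requires network access; the URL is a placeholder):
# print(check_url("http://example.com/"))
```

Run it against your HTML, CSS, JavaScript, and a deliberately missing URL to see whether your 404 responses are compressed too.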

Conclusions

As this post shows, there are many challenges which must be overcome to properly configure HTTP compression. Make sure all non-natively compressed content is served using HTTP compression. Don’t waste load time, CPU cycles, and bandwidth compressing content that is already compressed. Only use GZIP compression to ensure compatibility. Don’t try to work around old browsers since it is easy to make a mistake and end up not delivering compressed content to a capable browser. Review your application code and server configuration to make sure the application’s content and structure match your HTTP compression settings. Don’t forget about compressing 404s. Finally, don’t just assume your configuration works. Use a tool to validate that it works.

Want to see what performance problems your website has? Content Served Without Compression, Compressed Content Served with Compression, Bigger With Compression, and Obsolete Compression Format are just 4 of the nearly 400 performance issues Zoompf detects when testing your web applications. You can get a free performance scan of your website now and take a look at our Zoompf WPO product at Zoompf.com today!

REDbot: Awesome HTTP Testing

Posted: December 22, 2011 at 8:43 pm

I am always on the lookout for new and cool web performance and quality tools. One of my favorite tools is REDbot. Every web performance advocate should be using REDbot regularly. Want to know why?

To start, REDbot was created by the awesome Australian Mark Nottingham. Mark writes some excellent technical-yet-easy-to-understand essays on the inner workings of HTTP, such as the definitive Caching tutorial for Web authors and web masters. With REDbot, Mark has taken his vast knowledge of all things HTTP and distilled that into a wonderful tool written in Python. So, Mark’s brain is Reason #1.

Reason #2 is what the tool does. The best way to describe REDbot might be "HTTP Lint", which, funny enough, was the name of the first C# project of what became the Zoompf scanner. REDbot examines a server’s HTTP response headers and body for performance issues, quality issues, compatibility issues, adherence to the HTTP RFCs, and provides various ancillary info messages. Frankly the depth of issues it looks for is really quite amazing; currently over 150 different items can be detected and reported by REDbot.

screen of some of the issues that REDbot finds

As an example, here are just some of the things that REDbot checks for with the Last-Modified header.

  • Is the date format valid? Invalid dates can’t get conditionally cached.
  • Is the date in the future? Resources in the future cannot be cached.
  • Does the web server correctly return a 304 if the resource has not been modified?
  • Are there duplicate Last-Modified headers? Do they have different values?
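To give a flavor of what such checks involve, here is a rough Python sketch of three of them (valid date, date in the future, duplicate headers). It is not REDbot's code. The function name is hypothetical, date parsing is delegated to the standard library (which is more lenient than a strict HTTP-date parser), and the 304 check is left out because it requires a second request to the server.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def check_last_modified(values, now=None):
    """Return a list of problems found in a response's Last-Modified header values."""
    now = now or datetime.now(timezone.utc)
    problems = []
    if len(values) > 1:
        detail = "with different values" if len(set(values)) > 1 else "with identical values"
        problems.append(f"duplicate Last-Modified headers {detail}")
    for value in values:
        try:
            parsed = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            problems.append(f"invalid date: {value!r}")
            continue
        if parsed.tzinfo is None:
            # No usable timezone in the header: assume UTC for the comparison.
            parsed = parsed.replace(tzinfo=timezone.utc)
        if parsed > now:
            problems.append(f"date is in the future: {value!r}")
    return problems

fixed_now = datetime(2020, 1, 1, tzinfo=timezone.utc)
print(check_last_modified(["Wed, 21 Oct 2015 07:28:00 GMT"], now=fixed_now))
print(check_last_modified(["Fri, 01 Jan 2038 00:00:00 GMT"], now=fixed_now))
print(check_last_modified(["garbage"], now=fixed_now))
```

Passing now explicitly keeps the check repeatable; leave it out to compare against the current time.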

That is just scratching the surface of what REDbot can find out about your site. Is the Content-Length header right? Is chunked encoding working properly? How many inline caching proxies did the response go through? REDbot has helped me find and fix several issues with Zoompf’s own web infrastructure. In fact, many of the issues REDbot looks for were so helpful, we added them to the list of issues that Zoompf tests for. While Zoompf does not include all of REDbot’s tests, I don’t know of any other performance tools which look for these kinds of HTTP issues. For the reason of completeness alone, REDbot needs to be part of your toolset.

Of course, detecting some HTTP problems can get pretty involved. For example, REDbot and Zoompf will verify that a server properly responds to If-Modified-Since, If-None-Match, and Range requests. Additionally, REDbot and Zoompf can confirm that the Vary response header and the Accept-Encoding request header are operating correctly. All of this involves sending multiple requests to the web server for each issue. Testing a single static resource can involve 5 to 6 requests and processing that many responses! While this isn’t so bad when testing a single URL, doing it for multiple resources is time consuming. The public web instance of REDbot often times out when using the "check assets" feature to test multiple URLs at once. Zoompf website scans can take 2 to 4 times longer to complete if you conduct these extended HTTP tests. We are playing with a few ways to be intelligent about when we send these extra test requests but it is still a work in progress. In the meantime, this extended HTTP testing capability is disabled by default for our customers and completely unavailable for free scans.

Reason #3: REDbot is open source and hosted on GitHub, which makes it super easy to start using. It can run from the command line, but Mark “the Awesome Aussie” Nottingham (it’s his pro wrestling name, look it up) has set up a public instance of REDbot with a web interface where anyone can quickly test a resource. If you test an HTML page, you can click the "check assets" link underneath the response headers to recursively test all the referenced resources. This is a handy feature to rapidly test multiple URLs but, as we said, you will occasionally get timeouts.

Reason #4: Its web UI is gorgeous. As someone who makes incredibly crappy web interfaces (which I am convinced would somehow be better if I wrote them on a new MacBook Air), I drool over what Mark has done. Fades, transparency, context dialogs, this thing is sexy looking.

REDbot is an awesome tool which provides much needed HTTP insight and validation available nowhere else. I highly recommend it to anyone interested in the working of the web and I thank Mark Nottingham for his excellent contribution to our community.

The Big Performance Improvement in IE9 No One is Talking About

Posted: March 30, 2010 at 6:34 pm

Microsoft released a preview of Internet Explorer 9 last week. Much attention has been paid to its increased standards support as well as performance improvements such as GPU accelerated page rendering and a faster JavaScript engine. However a small blog post that has received virtually no commentary discusses a change with IE9 that may well be the biggest web performance improvement we will see with the new browser.


In a post on the IE blog last week, Marc Silbey shares the structure of IE9’s new User-Agent string. In it he writes:

An important change for site developers to know is that IE9 will send the short UA string by default. This change improves overall performance, interoperability and compatibility. IE9 will no longer send additions to the UA string made by other software installed on the machine such as .NET and many others.

Specifically, IE9’s User-Agent string will look like this: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0). Contrast that with my IE8 User-Agent, which is: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.21022; .NET CLR 3.5.30729; MDDC; InfoPath.2; .NET CLR 1.1.4322; .NET CLR 3.0.30729). (I have no idea why there is a “Media Center PC 5.0” identifier in my User-Agent string. I have a Dell laptop using Vista Home). The obvious benefit of IE9’s User-Agent string is that it is shorter. In fact, it’s over 70% smaller (63 bytes as opposed to 216 bytes). Why the difference?

The difference is all the superfluous junk on the end of the User-Agent string. From IE4 until IE8, other programs installed on your machine could append an identifier string about themselves to the User-Agent string IE would send. Junk such as different .NET runtimes and version numbers, toolbars and browser add-ons, media center strings, tablet strings, Office plug-ins, and more would all appear in the User-Agent. Yes, moving to a shorter User-Agent string will reduce the number of bytes IE9 must send with each request, which will improve performance. We’ve known about the benefits of reducing an HTTP request to try and make it fit in a single packet. In fact Yahoo has a performance rule around excessively sized cookies.

Of course, there is no way we would write an entire blog post about how the User-Agent string for IE9 is only 63 bytes and try to claim with a straight face that the reduced request size is the most impactful web performance improvement that IE9 brings to the table. You will see that the real benefit of IE9’s shorter User-Agent string, and perhaps the biggest web performance optimization in all of IE9, has to do with HTTP compression and caching.

Caching Compressed Responses

HTTP compression is one of the most impactful performance improvements you can deploy for a web application, resulting in bandwidth savings of 50-80% for text resources. However compression makes things more complicated for shared caches like caching proxies. Consider what happens if a browser that supports compression requests a URL and the caching proxy receives a compressed version of the response. Then the caching proxy receives a request for the same URL from a different user. Can it send the compressed version? To solve this, HTTP/1.1 added the Vary header. The Vary header allows web servers to essentially say “To generate this response, we used the URL and the value of the following HTTP request headers.” Caching proxies could then store a copy of a response and know that “This copy is for this URL when requested with this HTTP request header value.” You can think of a caching proxy as having an internal table of what resources are cached and under what conditions to serve the resource. If the incoming request URL matches the URL for a cached resource, and if the value of each incoming HTTP header that was specified by the Vary response header matches the value of the same header in the original request for the cached resource, then the cache can service the new request locally by returning the cached resource.

This method allows the caching proxy to properly handle both compressed and uncompressed versions of a resource. This means that caching proxies could store two different copies of a distinct URL (one compressed, one uncompressed) based on the Accept-Encoding header. Unfortunately there are some bugs in certain older web browsers where these browsers would send an Accept-Encoding header telling the web server they supported HTTP compression when they really didn’t. The biggest culprit was IE6 before SP1. This browser had a bug where it would internally cache the compressed version of a CSS or JavaScript response and would try (and fail) to directly process the compressed bytes. There are also certain versions of Netscape 4.x that did not properly handle HTTP compression despite using Accept-Encoding to tell the server that they did.

What was the solution? Web servers would first look for the presence of an Accept-Encoding header to determine if the browser making the request thinks it can accept compressed resources. If so, the web server would next examine the User-Agent header to determine if the requesting browser is known to have HTTP compression bugs. Only if the browser says it can accept compressed responses and it is not one of the browsers that has compression bugs will the web server serve a compressed version of the resource. (In retrospect, modifying a perfectly working web server to work around a buggy web browser seems silly. However in the age before automatic updating software this was a reasonable approach).

Unfortunately, this seemingly small change has enormous ramifications.

Different Strokes For Different Folks

Since the web server is varying the response based on both the Accept-Encoding header and the User-Agent header, it needs to indicate this to any downstream caching proxies by using a Vary: Accept-Encoding, User-Agent header. Since the web server could potentially serve a different response based on the User-Agent string, the caching proxy cannot serve the same cached response to web browsers with different User-Agent strings. So instead of matching the URL and the value of the Accept-Encoding header on an incoming request, now the caching proxy must match the URL, the value of the Accept-Encoding header, and the value of the User-Agent header to be able to return a cached response. To understand the impact of this change, consider the following scenario.

Tom and Nick work at the same company and their web traffic passes through a shared caching proxy. Tom is using Apple’s Safari web browser and Nick is using Mozilla’s Firefox web browser. Both of these browsers support HTTP compression and neither has any known compression bugs. Tom visits http://example.com/index.html. The web server returns a compressed response with Cache-Control and Expires headers to enable caching of the resource, and a Vary: Accept-Encoding, User-Agent header to let downstream caches know what values were used to determine the response. The caching proxy stores a copy of the response using the URL http://example.com/index.html, the Accept-Encoding request header value of gzip, and Safari’s User-Agent string as the key.

Nick then visits http://example.com/index.html. The caching proxy sees the request and attempts to service it. The caching proxy does have a cached copy for the URL Nick is requesting and Nick’s request also has the same Accept-Encoding header value as the cached copy. However Nick’s request was made with Firefox’s User-Agent string and the cached copy was stored for a request with Safari’s User-Agent string (due to the Vary: User-Agent value from the originating web server). So even though both web browsers support compression, neither has any caching or compression bugs, and the caching proxy already has a cached response that is perfectly usable by any modern web browser, the caching proxy cannot serve Nick the cached response. Instead Nick’s request is passed on to the web server and that response is also cached. The shared cache now contains two separately cached yet identical responses.

The problem gets worse. When Tim, who works at the same company as Tom and Nick, uses Google’s Chrome web browser to request http://example.com/index.html, the caching proxy again is not able to service the request locally. This is because Chrome has a different User-Agent string from Firefox or Safari. Thus the shared cache ends up with 3 separately cached yet identical responses. Caching proxies went from having to store 2 copies of each distinct URL (one compressed, one uncompressed) based on the Accept-Encoding header, to storing 2 * X copies of a distinct URL, where X is the number of distinct User-Agents.
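The Tom, Nick, and Tim scenario can be replayed with a toy model of a shared cache. The Python sketch below is a simplification under stated assumptions: the class and function names are mine, the User-Agent values are shortened placeholders, and the Vary header is passed in up front instead of being learned from the origin server's first response as a real proxy would do.

```python
class SharedCache:
    """Toy model of a caching proxy that honors the Vary response header."""

    def __init__(self):
        self.entries = {}
        self.origin_fetches = 0

    def _key(self, url, request_headers, vary):
        # The key is the URL plus the value of every request header named in Vary.
        varied = tuple((name, request_headers.get(name, "")) for name in vary)
        return (url, varied)

    def get(self, url, request_headers, vary):
        key = self._key(url, request_headers, vary)
        if key not in self.entries:
            self.origin_fetches += 1  # cache miss: forward to the web server
            self.entries[key] = f"response for {url}"
        return self.entries[key]

URL = "http://example.com/index.html"
# Shortened placeholder User-Agent values for Tom, Nick, and Tim.
USER_AGENTS = ["Safari (Tom)", "Firefox (Nick)", "Chrome (Tim)"]

def cached_copies(vary):
    """Replay the three requests and count how many copies the cache stores."""
    cache = SharedCache()
    for user_agent in USER_AGENTS:
        headers = {"Accept-Encoding": "gzip", "User-Agent": user_agent}
        cache.get(URL, headers, vary)
    return len(cache.entries)

print("Vary: Accept-Encoding             ->", cached_copies(("Accept-Encoding",)))
print("Vary: Accept-Encoding, User-Agent ->", cached_copies(("Accept-Encoding", "User-Agent")))
```

With Vary: Accept-Encoding alone, one stored copy serves all three users. Adding User-Agent to Vary forces one copy, and one trip to the web server, per distinct User-Agent string.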

500 Different Ways To Say The Same Thing

It’s clear from this example that the addition of Vary: User-Agent to a response can significantly reduce the performance of shared caching. How much depends on how many distinct User-Agent strings there are. Obviously different web browsers will have different User-Agent strings and there is nothing we can do about that. But what about different User-Agent strings for the same version of the same browser? Sadly, there can be multiple different User-Agent strings for the same version of the same browser. Often this occurs when the operating system is included in the User-Agent string. This means the same version of the same browser can often have half a dozen different User-Agent strings. While this exacerbates the Vary: User-Agent shared caching problem, it is still manageable.

Unfortunately, Internet Explorer destroys any chance of managing the problem. This is because IE does not have a half dozen or so different User-Agent strings. It has hundreds! Here is a list of over 480 different IE8 User-Agent strings. All of those browsers are Internet Explorer 8, but each has a different User-Agent string because of all the external programs, toolbars, and browser add-ons that append junk to it.

In short, due to the hundreds of different User-Agent variations for the same fundamental versions of Internet Explorer, and the (shrinking) majority market share of IE, currently the use of a Vary: User-Agent HTTP header effectively nullifies shared caching.

Starting To Fix The Problem

IE9’s adoption of a shorter User-Agent string without any 3rd party identification banners will significantly help matters. This change brings IE9’s User-Agent string in line with other web browser User-Agent strings, which only have a few variable components. These variables are:

  • Architecture of the computer
  • Version number of the web browser
  • Operating System of the computer
  • Language of Operating System

Computer architecture is largely stabilizing on a few distinct values for each browser vendor, such as i686 or x64. A changing version number, including minor build numbers or patch numbers, is only included by a few web browsers (most notably Google Chrome). This can make for a large number of different User-Agent strings for the same major browser version. However, automatic updates have largely marginalized this problem: the majority of users receive and use an updated version of a browser within 3 weeks of its release. The operating system of the computer also has only a few distinct values, except in the Linux world, where different distributions tend to insert their name into the User-Agent string (Ubuntu and Debian being the worst offenders). Finally, there is the OS language. Ideally this should not be included in the User-Agent string at all, but in the Accept-Language header. The impact of including the language in the User-Agent string is largely a non-issue, as users behind a shared caching proxy (either inside an ISP or a corporation) tend to speak the same language.

The end result is IE9, and other major browsers, tend to only have 5 or 10 distinct User-Agent strings for each major version.

Why Even Use Vary: User-Agent?

This is perhaps the best question. There are no modern browsers that have HTTP compression issues. The biggest problem browser, IE6, had its HTTP compression issues solved nearly 8 years ago with the release of IE6 SP1 in spring of 2002. All major web browsers that send an Accept-Encoding header do legitimately support compression. There is simply no reason to ever include a Vary: User-Agent header when dealing with a cachable resource. However, it’s easy to see why we have this problem. There are 10 years’ worth of web servers out there that are configured to inspect User-Agent strings. Just look at Apache’s documentation and you can see why. Below is the sample configuration for Apache’s compression module. It uses regular expressions on the User-Agent string to detect bad browsers and then appends a Vary: User-Agent header to the response. This nullifies any shared caching of compressible and cachable resources.

<Location / >
  # Insert filter
  SetOutputFilter DEFLATE
  # Netscape 4.x has some problems...
  BrowserMatch ^Mozilla/4 gzip-only-text/html
  # Netscape 4.06-4.08 have some more problems
  BrowserMatch ^Mozilla/4\.0[678] no-gzip
  # MSIE masquerades as Netscape, but it is fine
  # BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
  # Make sure proxies don't deliver the wrong content
  Header append Vary User-Agent env=!dont-vary
</Location>

There is one situation where it is appropriate to have a Vary: User-Agent response header. Consider a dynamic HTML page where the web server might return different content to a mobile browser than it would for a desktop browser when serving the same URL. For those pages you would want to include a Vary: User-Agent header. However these dynamic pages are, by definition, not cachable and thus don’t harm shared caches like Caching Proxies. The rule is any resource that is cachable should not have a Vary: User-Agent header.

The Biggest Performance Improvement for IE9?

With widespread adoption of IE9, we move from a world where the major browsers (IE, Firefox, Chrome, Safari, Opera) are represented by several thousand distinct User-Agent strings to a world of roughly 50-100 distinct User-Agent strings. That is a 10- to 30-fold reduction, or over an order of magnitude. We expect that shared caches like caching proxies should get roughly a 10-20 times improvement in hit rate for resources from web servers returning a Vary: User-Agent header.

This is why we say that Microsoft’s decision to remove 3rd party program identifiers from IE9’s User-Agent string could be the most impactful web performance optimization present in IE9. Other performance improvements in IE9, like the improved JavaScript engine or GPU rendering, certainly are impressive. However, even 100%, 200%, or even 1000% improvements to those systems will result in microseconds of improvement. A clean User-Agent string, which drastically increases the hit ratio of shared caches, will yield performance improvements measured in milliseconds or even seconds, while decreasing bandwidth costs and server load for the origin web server.

Want to see what performance problems your website has? Cachable Resource with Vary:User-Agent is just one of the 300+ web performance issues Zoompf can detect when scanning your web applications. Get your instant free web performance assessment at Zoompf.com today or try our free performance scanning bookmarklet.

Performance Tip for HTTP Downloads

Posted: March 24, 2010 at 6:14 pm

HTTPWatch has an interesting article today on their blog entitled “Four Tips for Setting up HTTP File Downloads.” They offer some great advice to make sure your downloadable files work across all browsers and are saved using the appropriate name. However they didn’t include a very important feature that all websites offering large file downloads should have: support for resumable downloads! As we will see this is an essential performance feature that improves user experience while reducing bandwidth costs.

Partial Downloads and the Range Header

HTTP/1.1 added many exciting features over HTTP/1.0. While people are familiar with the more popular HTTP/1.1 performance enhancements, such as default persistent connections or chunked encoding, most are unaware of another performance enhancement: partial responses. HTTP/1.1 allows a client to request a certain piece of a resource. The client can use the Range header to tell the web server to serve only a subset of bytes from the total resource.

This seemingly odd and esoteric feature is actually quite powerful because it allows for browsers to resume HTTP downloads! Consider this scenario:

Diagram showing how a browser can use the Range header to resume an interrupted download

Here we see the browser is trying to download a large PDF file named “report.pdf” which is approximately 9 megabytes. The client issues an HTTP GET request and starts downloading the response. The web server used the Accept-Ranges header to indicate to the client that it allows GET requests with the Range header to download pieces of this resource. After the client has downloaded 2 megabytes, the browser experiences a problem. This could have been caused by a number of issues. For example, the computer might have momentarily lost its network connection (common on wireless networks) or another program on the computer might have locked up and caused the browser’s connection to time out. Whatever the cause, the client had to close the HTTP connection it had with the web server. However, instead of having to redownload the entire PDF file, the client can use a range request to skip what it has already downloaded and only fetch the remaining data. The client does so by using a Range header to tell the web server to serve the contents of the PDF starting at the offset 2048000.
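As a rough sketch of the client side of this exchange, the function below builds the extra request header a client would send to resume. The function name is made up for illustration. Byte offsets are zero-based, so a client that already holds N bytes asks for everything from offset N onward.

```python
from typing import Dict


def resume_headers(bytes_already_downloaded: int) -> Dict[str, str]:
    """Return the extra request headers needed to resume a download.

    An open-ended range ("bytes=N-") asks for everything from offset N
    to the end of the resource.
    """
    if bytes_already_downloaded < 0:
        raise ValueError("byte count cannot be negative")
    if bytes_already_downloaded == 0:
        return {}  # nothing saved yet, so issue a normal GET
    return {"Range": f"bytes={bytes_already_downloaded}-"}


print(resume_headers(2048000))  # {'Range': 'bytes=2048000-'}
```

A server that honors the request answers with a 206 Partial Content status and a Content-Range header describing which bytes it sent.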

A few key points about resumable downloads and range requests:

  • Range requests allow the user to pause and resume the download inside their browser’s downloads window.
  • Resumable downloads can only work for static resources. They cannot be used with dynamically generated responses that change with each request or where you do not know the size of the resource ahead of time.
  • To indicate to the client that you allow range requests, you must send both a Content-Length response header and an Accept-Ranges response header.
  • IIS and Apache include the appropriate headers to support range requests by default when serving static files from the file system. These web servers will also automatically handle an incoming Range header and serve the appropriate bytes.
  • If you are using PHP, ASP.NET, Ruby, or some other application logic to process the client’s request and deliver the downloaded file (as in /download.php?ID=123), you will need to manually add logic into your application code to handle range requests. This is not an easy task. Consider getting rid of your download handling application logic entirely and serving the file directly from the file system. This allows you to leverage the web server’s built-in support for range requests.
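To give a feel for what that application logic involves, here is a simplified sketch of a handler for a single byte range. The function name and its return shape (status, headers, body) are assumptions made for this example, and it leaves out multi-range requests, If-Range validation, and streaming from disk, all of which a production handler would need.

```python
import re
from typing import Dict, Optional, Tuple

# Matches "bytes=first-last", "bytes=first-" and "bytes=-suffix_length".
_SINGLE_RANGE = re.compile(r"^bytes=(\d*)-(\d*)$")

Response = Tuple[int, Dict[str, str], bytes]


def serve_range(data: bytes, range_header: Optional[str]) -> Response:
    """Serve all of `data`, or the single byte range the client asked for."""
    total = len(data)
    full: Response = (200, {"Accept-Ranges": "bytes", "Content-Length": str(total)}, data)
    if not range_header:
        return full
    match = _SINGLE_RANGE.match(range_header.strip())
    if not match or (match.group(1) == "" and match.group(2) == ""):
        return full  # malformed or unsupported Range: ignore it
    first, last = match.group(1), match.group(2)
    if first == "":
        # Suffix form: the final N bytes of the resource.
        start = max(total - int(last), 0)
        end = total - 1
    else:
        start = int(first)
        if last and int(last) < start:
            return full  # invalid range spec: ignore it
        end = min(int(last), total - 1) if last else total - 1
    if start >= total:
        # Nothing to send for this range.
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    body = data[start:end + 1]
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


status, headers, body = serve_range(b"0123456789", "bytes=4-")
print(status, headers["Content-Range"], body)  # 206 bytes 4-9/10 b'456789'
```

Even this stripped-down version has several edge cases, which is a good argument for letting the web server do the work when it can.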

Conclusions

Supporting resumable downloads is an easy way for you to save bandwidth while providing your users with a better browsing experience. You can view the HTTP headers for your downloadable resources to determine if you support resumable downloads using this service, making sure the “Show all server header fields” option is enabled. You can also use a browser plug-in like Live Headers to view the HTTP headers coming from the server. Zoompf checks for this issue by looking for HTTP responses that have a Content-Disposition: attachment; header (indicating a downloadable file) but don’t have an Accept-Ranges header.
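If you would rather script the check yourself, the logic can be approximated in a few lines. This is an illustrative sketch, not Zoompf’s actual code, and the function name is made up. It also treats Accept-Ranges: none as unsupported, since that value tells clients the server does not accept range requests.

```python
from typing import Dict


def is_non_resumable_download(response_headers: Dict[str, str]) -> bool:
    """Flag a downloadable file that does not advertise range support."""
    lowered = {k.lower(): v.strip().lower() for k, v in response_headers.items()}
    is_download = lowered.get("content-disposition", "").startswith("attachment")
    accept_ranges = lowered.get("accept-ranges", "")
    supports_ranges = accept_ranges not in ("", "none")
    return is_download and not supports_ranges


print(is_non_resumable_download({
    "Content-Disposition": "attachment; filename=report.pdf",
    "Content-Length": "9437184",
}))  # True
```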

Want to see what performance problems your website has? Non-Resumable HTTP Download is just one of the 300+ web performance issues Zoompf detects when scanning your web application for performance issues. Get your instant free web performance assessment at Zoompf.com today!

Useless Duplicate Cookies

Posted: March 8, 2010 at 1:58 pm

In our last post, where we described the 300 issues Zoompf checks your website for during its web performance assessment, we said that the #1 way we discover new web performance issues is simply looking at web responses. This story is a perfect example of how that actually happens. Today (in fact, about 2 hours ago) we were helping a client optimize their site when we noticed a rather long HTTP Set-Cookie header. This is what we saw:

Now that is rather difficult to look at. So we cleaned up the code, trimmed out the expires and path information for each cookie declaration, and aligned each cookie name/value pair on its own line. This is the clean version:

Set-Cookie:
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],

As you can see, the web application is setting the cisession cookie 9 separate times! And every time it gets set to the very same value. Now each distinct cookie name can only have one value. The web browser will use the last declaration. So this response needlessly sets the cookie 8 times. The original Set-Cookie header’s value was 3681 bytes long. But when you remove the first 8 cisession cookie declarations and instead only have 1 cisession cookie, that size is reduced to 409 bytes, a reduction of 89%.
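The “last declaration wins” rule is easy to express in code. The sketch below assumes the cookie declarations have already been split into a list, one string per declaration, and that duplicates share the same path and domain as they did on this site. The function name is made up for illustration.

```python
from typing import Dict, List


def collapse_duplicate_cookies(declarations: List[str]) -> List[str]:
    """Keep only the last declaration for each cookie name, preserving order."""
    last_index: Dict[str, int] = {}
    for index, declaration in enumerate(declarations):
        # The cookie name is everything before the first "=".
        name = declaration.split("=", 1)[0].strip()
        last_index[name] = index
    keep = set(last_index.values())
    return [d for i, d in enumerate(declarations) if i in keep]


# Shortened placeholder value, not the client's real cookie.
declarations = ["cisession=placeholder-value; path=/"] * 9
print(len(declarations), len(collapse_duplicate_cookies(declarations)))  # 9 1
```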

Well, that’s a nice find. But then things got worse. This site used rotating cookie values, where the value of the cookie changes on each and every page (this is often done in banking and e-commerce applications to mitigate session hijacking). In this case that meant every page generated by PHP had these 9 cookie declarations. By identifying and resolving this problem we helped the client take 3 kilobytes off every HTML response! Now that’s a really nice performance optimization!

Cause of the Issue

This client had an online store. To uniquely identify each visitor and provide them with a shopping cart the application code had to set a session identifier for the visitor. They had a single function which would verify the client had a session identifier and set the new appropriate value. This function was called 9 separate times in different parts of the code during page generation. However the function did not check to see if the session identifier had already been set for this cycle. It just appended on a new cookie declaration. So every time a page was generated, 9 cookie declarations would be added on to the HTTP response.
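One way to avoid this class of bug is to record cookies by name while the page is being built and only write the headers once at the end. The sketch below is a language-neutral illustration in Python, not the client’s PHP code. The PageResponse class and its methods are hypothetical.

```python
from typing import Dict, List


class PageResponse:
    """Hypothetical response object that collects cookies until the page is sent."""

    def __init__(self) -> None:
        self._cookies: Dict[str, str] = {}

    def set_cookie(self, name: str, value: str) -> None:
        # Overwrite instead of append: repeated calls leave a single entry.
        self._cookies[name] = value

    def set_cookie_headers(self) -> List[str]:
        # One Set-Cookie line per distinct cookie name.
        return [f"Set-Cookie: {name}={value}; Path=/"
                for name, value in self._cookies.items()]


response = PageResponse()
for _ in range(9):  # the session function runs 9 times per page
    response.set_cookie("cisession", "example-session-value")
print(len(response.set_cookie_headers()))  # 1
```

Because later calls replace earlier ones, this also mirrors what the browser would have done with the 9 declarations anyway.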

This issue was hard to detect. Since the browser only uses the last declaration, HTTP requests back to the server only contain 1 cookie, not 9. For the same reason if you use a browser add-on to examine the stored cookies you will only see 1 cookie and not 9. In fact, we had to modify Zoompf’s code to detect this. The System.Net classes in Microsoft .NET were automatically collapsing the 9 redundant cookies into a single cookie. This means our code only saw one cookie as well.

One-off Issue or Plague?

We wanted to see how prevalent the issue of Duplicate Cookies is. So we wrote some quick code and re-analyzed approximately 700 web performance scans we had already performed on other websites to see who else had the issue. We found 16 other websites, or around 2.5% of the websites we had assessed, with this issue. While it is by no means as common an issue as, say, Images without any caching information (Check #172), we were surprised at how common the issue is. Spot checking those 16 websites shows the same fundamental issue: the same cookie getting set to the same value multiple times in a single HTTP response. Again, this is most likely caused by repeated execution of the same function or code path which sets the cookie value.

Since it is a fairly easy mistake to make and is not a one-off issue, we decided to promote this to a full-fledged performance check. So we wrote Zoompf check #316: Duplicate Cookies to detect this issue.

Want to see what performance problems your website has? Duplicate Cookies is just one of the 300+ web performance issues Zoompf detects when scanning your web applications. Get your instant free web performance assessment at Zoompf.com today!