March 30, 2010

The Big Performance Improvement in IE9 No One is Talking About

Microsoft released a preview of Internet Explorer 9 last week. Much attention has been paid to its increased standards support as well as performance improvements such as GPU accelerated page rendering and a faster JavaScript engine. However a small blog post that has received virtually no commentary discusses a change with IE9 that may well be the biggest web performance improvement we will see with the new browser.


In a post on the IE blog last week, Marc Silbey shares the structure of IE9’s new User-Agent string. In it he writes:

An important change for site developers to know is that IE9 will send the short UA string by default. This change improves overall performance, interoperability and compatibility. IE9 will no longer send additions to the UA string made by other software installed on the machine such as .NET and many others.

Specifically, IE9’s User-Agent string will look like this: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0). Contrast that with my IE8 User-Agent, which is: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.21022; .NET CLR 3.5.30729; MDDC; InfoPath.2; .NET CLR 1.1.4322; .NET CLR 3.0.30729). (I have no idea why there is a “Media Center PC 5.0” identifier in my User-Agent string. I have a Dell laptop using Vista Home.) The obvious benefit of IE9’s User-Agent string is that it is shorter. In fact, it’s over 70% smaller (63 bytes as opposed to 216 bytes). Why the difference?
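
The size difference is easy to check for yourself. The snippet below is a small sketch that measures the two User-Agent strings quoted in this post; the helper names are made up for illustration.

```python
# The two User-Agent strings quoted in this post.
IE9_UA = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"
IE8_UA = (
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64; Trident/4.0; "
    "SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.21022; "
    ".NET CLR 3.5.30729; MDDC; InfoPath.2; .NET CLR 1.1.4322; "
    ".NET CLR 3.0.30729)"
)


def bytes_on_wire(user_agent: str) -> int:
    """Size of the header value once it is encoded for the request."""
    return len(user_agent.encode("ascii"))


def percent_smaller(new: int, old: int) -> float:
    """How much smaller `new` is than `old`, as a percentage."""
    return 100.0 * (old - new) / old


ie9_size = bytes_on_wire(IE9_UA)
ie8_size = bytes_on_wire(IE8_UA)
print(ie9_size, ie8_size, round(percent_smaller(ie9_size, ie8_size), 1))
```

Keep in mind that this counts only the header value. The "User-Agent: " prefix and the trailing line break add the same few bytes to both strings.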

The difference is all the superfluous junk on the end of the User-Agent string. From IE4 until IE8, other programs installed on your machine could append an identifier string about themselves to the User-Agent string IE would send. Junk such as different .NET runtimes and version numbers, toolbars and browser add-ons, media center strings, tablet strings, Office plug-ins, and more would all appear in the User-Agent. Yes, moving to a shorter User-Agent string will reduce the number of bytes IE9 must send with each request, and that will improve performance. We’ve known about the benefits of shrinking an HTTP request so that it fits in a single packet. In fact, Yahoo has a performance rule around excessively sized cookies.

Of course, there is no way we would write an entire blog post about how the User-Agent string for IE9 is only 63 bytes and try to claim with a straight face that the reduced request size is the most impactful web performance improvement that IE9 brings to the table. You will see that the real benefit of IE9’s shorter User-Agent string, and perhaps the biggest web performance optimization in all of IE9, has to do with HTTP compression and caching.

Caching Compressed Responses

HTTP compression is one of the most impactful performance improvements you can deploy for a web application, resulting in bandwidth savings of 50-80% for text resources. However, compression makes things more complicated for shared caches like caching proxies. Consider what happens if a browser that supports compression requests a URL and the caching proxy receives a compressed version of the response. Then the caching proxy receives a request for the same URL from a different user. Can it send the compressed version? To solve this, HTTP/1.1 added the Vary header. The Vary header allows web servers to essentially say “To generate this response, we used the URL and the value of the following HTTP request headers.” Caching proxies could then store a copy of a response and know that “This copy is for this URL when requested with this HTTP request header value.” You can think of a caching proxy as having an internal table of what resources are cached and under what conditions to serve each resource. If the incoming request URL matches the URL for a cached resource, and if the value of each incoming HTTP header that was specified by the Vary response header matches the value of the same header in the original request for the cached resource, then the cache can service the new request locally by returning the cached resource.

This method allows the caching proxy to properly handle both compressed and uncompressed versions of a resource. This means that caching proxies could store two different copies of a distinct URL (one compressed, one uncompressed) based on the Accept-Encoding header. Unfortunately there are bugs in certain older web browsers where these browsers would send an Accept-Encoding header telling the web server they supported HTTP compression when they really didn’t. The biggest culprit was IE6 before SP1. This browser had a bug where it would internally cache the compressed version of a CSS or JavaScript response and would try (and fail) to directly process the compressed bytes. There are also certain versions of Netscape 4.x that did not properly handle HTTP compression despite using Accept-Encoding to tell the server that they did.

What was the solution? Web servers would first look for the presence of an Accept-Encoding header to determine if the browser making the request thinks it can accept compressed resources. If so, the web server would next examine the User-Agent header to determine if the requesting browser is known to have HTTP compression bugs. Only if the browser says it can accept compressed responses and it is not one of the browsers that has compression bugs will the web server serve a compressed version of the resource. (In retrospect, modifying a perfectly working web server to work around a buggy web browser seems silly. However, in the age before automatic updating software this was a reasonable approach.)
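
The two-step decision described above can be sketched roughly as follows. This is an illustration only, not how any particular web server implements it: the function name and the list of "buggy" patterns are assumptions (the single pattern shown is borrowed from the Netscape 4.06-4.08 rule in Apache's sample configuration), and the Accept-Encoding parsing ignores quality values for brevity.

```python
import re

# Hypothetical list of browsers with known compression bugs. The one pattern
# here mirrors the Netscape 4.06-4.08 rule from Apache's sample configuration.
BUGGY_BROWSERS = [
    re.compile(r"^Mozilla/4\.0[678]"),
]


def should_compress(request_headers: dict) -> bool:
    """Return True only if the browser claims gzip support AND is not known-buggy."""
    # Step 1: does the browser say it accepts compressed responses?
    accept = request_headers.get("Accept-Encoding", "")
    codings = [token.split(";")[0].strip().lower() for token in accept.split(",")]
    if "gzip" not in codings:
        return False

    # Step 2: is the browser on the list of ones that lie about it?
    user_agent = request_headers.get("User-Agent", "")
    return not any(pattern.search(user_agent) for pattern in BUGGY_BROWSERS)


# Because both request headers influenced the decision, the response has to
# advertise both of them to downstream caches.
VARY_HEADER = "Accept-Encoding, User-Agent"
```

The last line is the important one for what follows: once the User-Agent takes part in the decision, it has to show up in Vary.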

Unfortunately, this seemingly small change has enormous ramifications.

Different Strokes For Different Folks

Since the web server is varying the response based on both the Accept-Encoding header and the User-Agent header, it needs to indicate this to any downstream caching proxies by using a Vary: Accept-Encoding, User-Agent header. Since the web server could potentially serve a different response based on the User-Agent string, the caching proxy cannot serve the same cached response to web browsers with different User-Agent strings. So instead of matching the URL and the value of the Accept-Encoding header on an incoming request, the caching proxy must now match the URL, the value of the Accept-Encoding header, and the value of the User-Agent header to be able to return a cached response. To understand the impact of this change, consider the following scenario.

Tom and Nick work at the same company and their web traffic passes through a shared caching proxy. Tom is using Apple’s Safari web browser and Nick is using Mozilla’s Firefox web browser. Both of these browsers support HTTP compression and neither has any known compression bugs. Tom visits http://example.com/index.html. The web server returns a compressed response with Cache-Control and Expires headers to enable caching of the resource, and a Vary: Accept-Encoding, User-Agent header to let downstream caches know what values were used to determine the response. The caching proxy stores a copy of the response using the URL http://example.com/index.html, the Accept-Encoding request header value of gzip, and Safari’s User-Agent string as the key.

Nick then visits http://example.com/index.html. The caching proxy sees the request and attempts to service it. The caching proxy does have a cached copy for the URL Nick is requesting and Nick’s request also has the same Accept-Encoding header value as the cached copy. However, Nick’s request was made with Firefox’s User-Agent string and the cached copy was stored for a request with Safari’s User-Agent string (due to the Vary: User-Agent value from the originating web server). So even though both web browsers support compression, neither has any caching or compression bugs, and the caching proxy already has a cached response that is perfectly usable by any modern web browser, the caching proxy cannot serve Nick the cached response. Instead Nick’s request is passed on to the web server and that response is also cached. The shared cache now contains two separately cached yet identical responses.

The problem gets worse. When Tim, who works at the same company as Tom and Nick, uses Google’s Chrome web browser to request http://example.com/index.html, the caching proxy again is not able to service the request locally. This is because Chrome has a different User-Agent string from Firefox or Safari. Thus the shared cache ends up with 3 separately cached yet identical responses. Caching proxies went from having to store 2 copies of each distinct URL (one compressed, one uncompressed) based on the Accept-Encoding header, to storing 2 * X copies of each distinct URL, where X is the number of distinct User-Agents.
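
To make the scenario concrete, here is a toy model of a shared cache that honors Vary. It is a sketch under simplifying assumptions, not how a real proxy is built: the class and function names are invented, header names are matched with exact casing, expiry is ignored, and the User-Agent values are short placeholders rather than real strings.

```python
class SharedCache:
    """Toy caching proxy that keys entries on the URL plus the Vary'd headers."""

    def __init__(self):
        self._entries = {}  # cache key -> response body
        self._vary = {}     # URL -> header names listed in the Vary response header

    def _key(self, url, request_headers, vary_names):
        varied = tuple((name.lower(), request_headers.get(name, "")) for name in vary_names)
        return (url, varied)

    def lookup(self, url, request_headers):
        vary_names = self._vary.get(url)
        if vary_names is None:
            return None  # nothing stored for this URL yet
        return self._entries.get(self._key(url, request_headers, vary_names))

    def store(self, url, request_headers, vary_header, body):
        vary_names = [name.strip() for name in vary_header.split(",") if name.strip()]
        self._vary[url] = vary_names
        self._entries[self._key(url, request_headers, vary_names)] = body

    def __len__(self):
        return len(self._entries)


def simulate(vary_header):
    """Replay Tom, Nick, and Tim's requests; return (origin fetches, cached copies)."""
    cache = SharedCache()
    url = "http://example.com/index.html"
    misses = 0
    for user_agent in ("Safari (Tom)", "Firefox (Nick)", "Chrome (Tim)"):
        request = {"Accept-Encoding": "gzip", "User-Agent": user_agent}
        if cache.lookup(url, request) is None:
            misses += 1  # the proxy has to go back to the origin server
            cache.store(url, request, vary_header, b"<compressed page>")
    return misses, len(cache)


print(simulate("Accept-Encoding, User-Agent"))  # (3, 3)
print(simulate("Accept-Encoding"))              # (1, 1)
```

With User-Agent in the Vary header every one of the three coworkers misses the cache. Without it, only the first request reaches the origin server.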

500 Different Ways To Say The Same Thing

It’s clear from this example that the addition of Vary: User-Agent to a response can significantly reduce the performance of shared caching. How much depends on how many distinct User-Agent strings there are. Obviously different web browsers will have different User-Agent strings and there is nothing we can do about that. But what about different User-Agent strings for the same version of the same browser? Sadly, there can be multiple different User-Agent strings for the same version of the same browser. Often this occurs when the operating system is included in the User-Agent string. This means the same version of the same browser can often have half a dozen different User-Agent strings. While this exacerbates the Vary: User-Agent shared caching problem, it is still manageable.

Unfortunately Internet Explorer proceeds to destroy any chance of managing the problem. This is because IE does not have a half dozen or so different User-Agent strings. It has hundreds! Here is a list of over 480 different IE8 User-Agent strings. All of those browsers are Internet Explorer 8 but each has a different User-Agent string because of all those external programs, toolbars, and browser add-ons that append junk.

In short, due to the hundreds of different User-Agent variations for the same fundamental versions of Internet Explorer, and the (shrinking) majority market share of IE, currently the use of a Vary: User-Agent HTTP header effectively nullifies shared caching.

Starting To Fix The Problem

IE9’s adoption of a shorter User-Agent string without any inclusion of 3rd party identification banners will significantly help matters. This change brings IE9’s User-Agent string in line with other web browser User-Agent strings, which only have a few variable components. These variables are:

  • Computer Architecture of the computer
  • Version number of the web browser
  • Operating System of the computer
  • Language of Operating System

Computer Architecture is largely stabilizing on a few distinct values for each browser vendor such as i686 or x64. A changing version number of the web browser, including minor build numbers or patch numbers, is only included by a few web browsers (most notably Google Chrome). This can make for a large number of different User-Agent strings for the same major browser version. However, automatic updates have largely marginalized this problem: the majority of users receive and use an updated version of a browser within 3 weeks of its release. The operating system of the computer also has only a few distinct values, except in the Linux world where different distributions tend to insert their name into the User-Agent string (Ubuntu and Debian being the worst offenders). Finally, there is the OS language. Ideally this should not be included in the User-Agent string at all, but in the Accept-Language header. The impact of including the language in the User-Agent string is largely a non-issue, as users behind a shared caching proxy (either inside an ISP or a corporation) tend to speak the same language.

The end result is IE9, and other major browsers, tend to only have 5 or 10 distinct User-Agent strings for each major version.

Why Even Use Vary: User-Agent?

This is perhaps the best question. There are no modern browsers that have HTTP compression issues. The biggest problem browser, IE6, had its HTTP compression issues solved nearly 8 years ago with the release of IE6 SP1 in spring of 2002. All major web browsers that send an Accept-Encoding header do legitimately support compression. There is simply no reason to ever include a Vary: User-Agent header when dealing with a cachable resource. However, it’s easy to see why we have this problem. There are 10 years’ worth of web servers out there that are configured to inspect User-Agent strings. Just look at Apache’s documentation and you can see why. Below is the sample configuration for Apache’s compression module. It uses regular expressions on the User-Agent to detect bad browsers and then includes a Vary: User-Agent header in the recommended configuration. This effectively nullifies shared caching of compressible and cachable resources.

<Location / >
  # Insert filter
  SetOutputFilter DEFLATE
  # Netscape 4.x has some problems...
  BrowserMatch ^Mozilla/4 gzip-only-text/html
  # Netscape 4.06-4.08 have some more problems
  BrowserMatch ^Mozilla/4\.0[678] no-gzip
  # MSIE masquerades as Netscape, but it is fine
  # BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
  # Make sure proxies don't deliver the wrong content
  Header append Vary User-Agent env=!dont-vary
</Location>

There is one situation where it is appropriate to have a Vary: User-Agent response header. Consider a dynamic HTML page where the web server might return different content to a mobile browser than it would for a desktop browser when serving the same URL. For those pages you would want to include a Vary: User-Agent header. However these dynamic pages are, by definition, not cachable and thus don’t harm shared caches like Caching Proxies. The rule is any resource that is cachable should not have a Vary: User-Agent header.
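
The rule is simple enough to turn into a quick check. The sketch below is our illustration of the idea, not the actual logic of any scanner: the function names are invented, and the test for "cachable" is a deliberately simplified heuristic that only looks at Cache-Control and Expires.

```python
def is_cacheable(response_headers: dict) -> bool:
    """Simplified heuristic: explicit freshness info and no opt-out directive."""
    cache_control = response_headers.get("Cache-Control", "").lower()
    if any(word in cache_control for word in ("no-store", "no-cache", "private")):
        return False
    return "max-age" in cache_control or "Expires" in response_headers


def varies_on_user_agent(response_headers: dict) -> bool:
    """True if User-Agent is one of the header names listed in Vary."""
    vary = response_headers.get("Vary", "")
    return "user-agent" in [name.strip().lower() for name in vary.split(",")]


def breaks_shared_caching(response_headers: dict) -> bool:
    """Flag cachable responses that also carry Vary: User-Agent."""
    return is_cacheable(response_headers) and varies_on_user_agent(response_headers)
```

A response with Cache-Control: max-age=3600 and Vary: Accept-Encoding, User-Agent gets flagged. The same response with only Vary: Accept-Encoding does not.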

The Biggest Performance Improvement for IE9?

With widespread adoption of IE9, we move from a world where the major browsers (IE, Firefox, Chrome, Safari, Opera) are represented by several thousand distinct User-Agent strings to a world of roughly 50-100 distinct User-Agent strings. That is a 10 to 30 times reduction, over an order of magnitude. We expect that shared caches like caching proxies should get roughly a 10-20 times improvement in hit rate for resources from web servers returning a Vary: User-Agent header.

This is why we say that Microsoft’s decision to remove 3rd party program identifiers from IE9’s User-Agent string could be the most impactful web performance optimization present in IE9. Other performance improvements in IE9 like the improved JavaScript engine or GPU rendering certainly are impressive. However, even 100%, 200%, or even 1000% improvements to those systems will result in microseconds of improvement. A clean User-Agent string which drastically increases the hit ratio of shared caches will yield performance gains measured in milliseconds or even seconds, while decreasing bandwidth costs and server load for the origin web server.

Want to see what performance problems your website has? Cachable Resource with Vary:User-Agent is just one of the 300+ web performance issues Zoompf can detect when scanning your web applications. Get your instant free web performance assessment at Zoompf.com today or try our free performance scanning bookmarklet.

March 24, 2010

Performance Tip for HTTP Downloads

HTTPWatch has an interesting article today on their blog entitled “Four Tips for Setting up HTTP File Downloads.” They offer some great advice to make sure your downloadable files work across all browsers and are saved using the appropriate name. However they didn’t include a very important feature that all websites offering large file downloads should have: support for resumable downloads! As we will see this is an essential performance feature that improves user experience while reducing bandwidth costs.

Partial Downloads and the Range Header

HTTP/1.1 added many exciting features over HTTP/1.0. And while people are familiar with the more popular HTTP/1.1 performance enhancements such as default persistent connections or chunked encoding, most are unaware of another performance enhancement: partial responses. HTTP/1.1 allows a client to request a certain piece of a resource. The client can use the Range header to tell the web server to serve only a subset of bytes from the total resource.

This seemingly odd and esoteric feature is actually quite powerful because it allows for browsers to resume HTTP downloads! Consider this scenario:

Diagram showing how a browser can use the Range header to resume an interrupted download

Here we see the browser is trying to download a large PDF file named “report.pdf” which is approximately 9 megabytes. The client issues an HTTP GET request and starts downloading the response. The web server used the Accept-Ranges header to indicate to the client that it allows GET requests with the Range header to download pieces of this resource. After the client has downloaded 2 megabytes, the browser experiences a problem. This could have been caused by a number of issues. For example, the computer might have momentarily lost its network connection (common on wireless networks) or another program on the computer might have locked up and caused the browser’s connection to time out. Whatever the cause, the client had to close the HTTP connection it had with the web server. However, instead of having to redownload the entire PDF file, the client can use a range request to skip what it has already downloaded and only fetch the remaining data. The client does so by using a Range header to tell the web server to serve the contents of the PDF starting at the offset 2048000.
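
From the client side, resuming comes down to one extra request header. Here is a minimal sketch using Python's standard urllib; the function name and the example URL are placeholders, and the request is only built, not sent.

```python
import urllib.request


def build_resume_request(url: str, bytes_already_downloaded: int) -> urllib.request.Request:
    """Build a GET request that asks only for the bytes we do not have yet."""
    request = urllib.request.Request(url)
    if bytes_already_downloaded > 0:
        # "bytes=N-" means: everything from offset N to the end of the resource.
        request.add_header("Range", f"bytes={bytes_already_downloaded}-")
    return request


# Resume the download from the scenario above at offset 2048000.
resume = build_resume_request("http://example.com/report.pdf", 2048000)
print(resume.get_header("Range"))  # bytes=2048000-
```

A server that honors the request answers with 206 Partial Content and a Content-Range header. If it answers with a plain 200 instead, it ignored the Range header and is sending the whole file again, so the client should start over rather than append.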

A few key points about resumable downloads and range requests:

  • Range requests allow the user to pause and resume the download inside their browser’s downloads window.
  • Resumable downloads can only work for static resources. They cannot be used with dynamically generated responses that change with each request or where you do not know the size of the resource ahead of time.
  • To indicate to the client that you allow range requests, you must send both a Content-Length response header and an Accept-Ranges response header.
  • IIS and Apache include the appropriate headers to support range requests by default when serving static files from the file system. These web servers will also automatically handle incoming Range headers and serve the appropriate bytes.
  • If you are using PHP, ASP.NET, Ruby, or some other application logic to process the client’s request and deliver the downloaded file (as in /download.php?ID=123), you will need to manually add logic into your application code to handle range requests. This is not an easy task. Consider getting rid of your download handling application logic entirely and serving the file directly from the file system. This allows you to leverage the web server’s built-in support for range requests.
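
To give a feel for what "manually add logic" means in the last point, here is a rough sketch of the core of a range handler. It is deliberately incomplete and its names are invented: it supports only a single byte range, it holds the whole file in memory, and it answers 416 for anything it cannot parse, whereas a real server may choose to ignore a malformed Range header and send the full file.

```python
import re

_SINGLE_RANGE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(range_header: str, size: int):
    """Return inclusive (start, end) offsets, or None if the range is unusable."""
    match = _SINGLE_RANGE.match(range_header.strip())
    if not match:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the resource.
        length = int(last)
        if length == 0:
            return None
        start, end = max(size - length, 0), size - 1
    else:
        start = int(first)
        end = min(int(last), size - 1) if last else size - 1
    if start > end or start >= size:
        return None
    return start, end


def partial_response(data: bytes, range_header: str):
    """Return (status, headers, body) for a single-range request."""
    resolved = resolve_range(range_header, len(data))
    if resolved is None:
        return 416, {"Content-Range": f"bytes */{len(data)}"}, b""
    start, end = resolved
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{len(data)}",
        "Content-Length": str(end - start + 1),
    }
    return 206, headers, data[start:end + 1]
```

Even this stripped-down version has a handful of edge cases, and it leaves out multiple ranges, If-Range validation, and streaming from disk. That is a good argument for letting the web server do the work.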

Conclusions

Supporting resumable downloads is an easy way for you to save bandwidth while providing your users with a better browsing experience. You can view the HTTP headers for your downloadable resources to determine if you support resumable downloads using this service and making sure the “Show all server header fields” option is enabled. You can also use a browser plug-in like Live Headers to view the HTTP headers coming from the server. Zoompf checks for this issue by looking for HTTP responses that have a Content-Disposition: attachment; header (indicating a downloadable file) but don’t have an Accept-Ranges header.
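
If you would rather script the check, the idea fits in a few lines. This is our own approximation of the check as described above, with an invented function name, and should not be read as Zoompf's actual implementation.

```python
def is_non_resumable_download(response_headers: dict) -> bool:
    """True for a file download that does not advertise byte-range support."""
    # HTTP header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    disposition = headers.get("content-disposition", "").lower()
    is_download = disposition.split(";")[0].strip() == "attachment"
    accepts_ranges = headers.get("accept-ranges", "").strip().lower() == "bytes"
    return is_download and not accepts_ranges
```

Note that a server can also send Accept-Ranges: none to state explicitly that it does not support ranges, which is why the sketch looks for the value bytes rather than just the presence of the header.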

Want to see what performance problems your website has? Non-Resumable HTTP Download is just one of the 300+ web performance issues Zoompf detects when scanning your web application for performance issues. Get your instant free web performance assessment at Zoompf.com today!

March 23, 2010

Optimizing Gradients

Gradient backgrounds are a common visual element on modern websites. And while you can use browser specific CSS extensions to directly create gradient backgrounds with pure CSS, the much more common method is to use an image. Traditionally, when a web designer wants to add a gradient to a webpage, they use a program like Photoshop or The Gimp to create a small image containing the gradient they want. They then use CSS to set this image as the background for a specific element and then use repeat-x or repeat-y to spread the gradient image along the entire length of the element. This provides us with the chance for a new performance optimization technique!

Consider the image above that Sven, a long time Zoompf user, sent in. This image is 38 pixels wide and 91 pixels tall. As you can see there is a gradient that goes from dark gray at the top of the image to white at the bottom. This image is used on the website to provide a background gradient for section titles. The website’s CSS uses the repeat-x directive to fill the section title along the X-axis with the gradient. Which raises the question “Why is the image 38 pixels wide?” After all, the change in colors is along the Y-axis (from the top of the image to the bottom). The width of the image is meaningless. It adds nothing to the image. If you were to open this image in a program like Photoshop you would see that every pixel has the same color as the pixel to its left or right. It cannot be any other way. Otherwise when the image is repeated one after another along the X-axis you would see spots or seams where the colors were different. Since CSS is used to repeat this image in the X-axis direction, there is no reason for the image to be larger than 1 pixel wide! Converting the 38×91 pixel image to a 1×91 image reduces the size of the file by over 65%!
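
The "every pixel matches its neighbor" property is easy to verify before cropping. The sketch below works on a plain grid of color values (a list of rows) so that it stays independent of any imaging library; the function names are ours, and loading the pixels from an actual image file is left out.

```python
def rows_are_uniform(pixels) -> bool:
    """True if every row holds a single color, i.e. the gradient runs top to bottom."""
    return all(len(set(row)) == 1 for row in pixels)


def crop_to_one_column(pixels):
    """Reduce a horizontally repeating gradient to a 1 pixel wide image."""
    if not rows_are_uniform(pixels):
        raise ValueError("colors change along the X-axis; cropping would alter the image")
    return [[row[0]] for row in pixels]


# A tiny 3 pixel wide, 2 pixel tall gradient: dark gray on top, white below.
gradient = [
    ["#333333", "#333333", "#333333"],
    ["#ffffff", "#ffffff", "#ffffff"],
]
print(crop_to_one_column(gradient))  # [['#333333'], ['#ffffff']]
```

For a gradient that repeats along the Y-axis the same check applies to columns instead of rows.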

Let’s try a real example. Consider this gradient from Yahoo. This image is actually a sprite of several different gradients! It is 5 pixels wide and 3302 pixels tall and is 3466 bytes. Again, we can confirm that the extra 4 pixels in width provide no value. All the pixels in this image have the same color value as the pixels to the right or left. Using Photoshop, we cropped the image to be 1×3302 pixels with a size of 1920 bytes. This is a savings of 45% with absolutely no loss to image quality or user experience!

We performed this optimization on a sample of 300 or so gradient images. On average we reduced file size by 30-70%. PNG images tended to see an improvement of 10-30%.

Browser Performance Impact

Are there any negative effects to using gradient images that are only 1 pixel in length along the axis they repeat? This is what Sven wanted to know when he emailed the image. Consider an image that’s 32 x 300 pixels. It is repeated along the X-axis using CSS and is used as a background for a <DIV> that is 96 pixels wide. To “draw” an image on the screen, the fundamental operation the browser is doing is copying the bytes of the image into the buffer of bytes that display the screen. That means to draw the gradient background the browser has to copy the bytes of the 32 x 300 pixel image into the video memory 3 times (because 3 times 32 is 96). Consider if we had a 1 x 300 pixel image. The browser has to copy the bytes of the 1 x 300 pixel image into the video memory 96 times (because 1 times 96 is 96). So, it would seem that using a 1 x 300 pixel image is 32 times slower than using a 32 x 300 pixel image. Remember we aren’t talking about browser repaint or reflow events (where the browser has to recalculate where all the text and images go and then redraw). We are talking purely about how quickly the computer can draw an image.
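
The arithmetic behind that worry can be written down in a couple of lines. This is only a back-of-the-envelope model with made-up function names; it counts copy operations and says nothing about how a real browser actually tiles backgrounds.

```python
def copies_needed(element_width: int, image_width: int) -> int:
    """How many times the image must be tiled to span the element (rounded up)."""
    return -(-element_width // image_width)


def pixels_written(element_width: int, image_width: int, image_height: int) -> int:
    """Total pixels copied into the screen buffer across all tiles."""
    return copies_needed(element_width, image_width) * image_width * image_height


print(copies_needed(96, 32), copies_needed(96, 1))              # 3 96
print(pixels_written(96, 32, 300), pixels_written(96, 1, 300))  # 28800 28800
```

The narrow image needs 32 times as many copy operations, but in this simple model the total number of pixels written is identical. The extra cost is per-copy overhead, not extra data.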

It sounds like we might have a performance problem.

Only we don’t. Computers can move data around very quickly. We are talking nanoseconds (10^-9). Meanwhile, transferring data over Internet connections is typically measured in milliseconds or microseconds (10^-3 or 10^-6). So even though smaller images may make the browser do more work, even 32 times more work, the difference is so infinitesimal that it is not noticeable to the user. The time savings you will get by reducing the amount of data you have to send over the slow network, however, are impactful and beneficial.

Conclusions

The fact that gradients using images rely on CSS to repeat the image allows us to reduce the width or height of the gradient image without any visual difference. Performance is improved because we have reduced the size of the image. This technique can be applied to all images that are used as gradients. If a gradient image repeats along the X-axis it should only be 1 pixel wide. If a gradient image repeats along the Y-axis it should only be 1 pixel tall. As we saw with the example gradient from Yahoo, even gradient images that are already quite small in terms of dimensions can drastically reduce their size even further by reducing the width or height to 1 pixel.

Want to see what performance problems your website has? Oversized Gradient Image is just one of the 300+ web performance issues Zoompf detects when scanning your web applications. Get your instant free web performance assessment at Zoompf.com today!

March 22, 2010

Upcoming Conferences

Front-end Web performance is a growing space and several conferences are providing a forum for presentations and discussions about making the web fast. I’m excited to be a part of this trend and I’ll be presenting at a number of very cool conferences over the next two months. While two of these are physical conferences held in the United States, one is an online conference that anyone in the world can attend. Here is some information about upcoming conferences where I am presenting:

DevNation Atlanta

I’ll be giving a presentation at the touring DevNation conference’s stop in Atlanta on April 3rd entitled “Making the Web Fast.” It will serve as an introduction to frontend performance with a twist. Instead of just explaining optimizations such as caching, domain sharding, or image crunching, we are going to do some live performance autopsies of real sites. We’ll see what they are doing right and what they are doing wrong and how much they could save by implementing some front-end optimization techniques. I will also reference some free tools and services, such as Zoompf’s free web performance scanning service, that attendees can use to speed up their own websites.

DevNation is a day long conference with some other great speakers focusing on web development. The conference’s venue is the Georgia Tech Research Institute, just off of Georgia Tech’s campus, which happens to be my alma mater. DevNation is an intimate conference with extremely cheap prices and is highly recommended to anyone in the Atlanta area who designs, develops, or maintains web applications.

JSConf US 2010

For reasons I don’t entirely understand the awesome folks at JSConf asked me to present at JSConf 2010 about all the nasty stuff you can do with JavaScript. I will be sharing the stage with luminaries like Douglas Crockford (whose window.onerror is always defined, because nobody throws anything at Douglas Crockford and lives), Steve Souders, and John Resig. While they are all going to talk about important stuff I am there to talk about how to destroy the Internet using the evil side of JavaScript.

JSConf runs April 17th and 18th, with a hefty dose of partying and a whole other conference, ScurvyCon, on April 16th. The speaker lineup is literally the best of the best. JSConf is already sold out and the tickets for ScurvyCon go on sale on April 4th. If you can’t make it, JSConf is nice enough to post videos of all the talks after the conference so anyone can learn and enjoy.

Web Optimization Summit

The Web Optimization Summit is an especially cool conference. It physically takes place in Austin, Texas on May 12th where you can attend and watch the speakers. But it is also a virtual conference, so anyone in the world can connect and attend the conference! I’ll be speaking on a more advanced performance topic with a presentation entitled: Implementing a Web Performance Program without killing yourself. It addresses a problem we see time and time again at Zoompf: A developer or IT admin reads one of Steve’s performance books, or downloads YSlow, and takes it upon themselves (and only themselves) to start optimizing the website. Suddenly the app stops working because IT implemented HTTP caching without having proper change control or file versioning. Or all the JavaScript and CSS in the development branch of source control has been minified. Or the design cannot be modified because all the original images have been crunched and indexed. Even worse, these optimizations are haphazard and not repeatable, so site performance whips violently from fast to slow with each new publish. The talk explores how to add performance optimizations into your existing web development and deployment process. This allows for automated, repeatable performance optimizations that don’t add time to your development cycles.

The Web Optimization Summit has some other excellent speakers, such as Paul Irish and Kyle Simpson from Getify. The conference takes place over a single day, is reasonably priced, and can be attended virtually from around the globe.

Bribery

I’m very happy to be presenting at these conferences. If you are attending please stop on by and say hello. In fact, anyone who meets me and shows me a printed copy of their free Zoompf Web Performance Report with some feedback about the findings will get a special gift. More details on this soon. Hope to see you all at the conferences!

March 16, 2010

Free Performance Scanning Service and New Web Design

We have two big announcements for you all today.

The first announcement is that we have finished rolling out a complete web design for Zoompf.com. The awesome folks over at Zero G Creative did the design and we worked with them to get everything tweaked (including the Sisyphean task of making everything look pretty even in dead browsers like IE6) and to hook it into our new backend systems. We are very pleased with the results and we will be implementing additional changes in the coming weeks. If you run into any issues please let us know.

We are very proud of our second announcement. Today we are unveiling Zoompf’s new automated web performance scanning service. Simply plug in a URL and we scan your website for over 300 performance issues and then email you a pretty report in minutes. It’s completely free, so please use it as much as you’d like and as often as you’d like. We are experimenting with new features, such as PDF reports, load graphs, resource hierarchy graphs, and other cool features and will keep you posted.

For now enjoy the new Zoompf website and take your favorite websites for a spin with our free web performance scanning service!

March 8, 2010

META Refresh Nullifies Caching for IE6 and IE7

There has been some interesting discussion recently on the mailing list for Google’s Page Speed performance tool. Brian Brophy rediscovered a critical performance bug in Internet Explorer that Joseph Smarr had found nearly 3 years ago. Both Internet Explorer 6 and 7 are affected by this bug. IE8 is not affected.

To summarize, the bug is this: When a site uses a <META> refresh tag to send the visitor to a URL, IE6 and IE7 treat that as if the user had clicked the “Refresh” or “Reload” button on the browser. This means IE does not use any items that are in the cache and instead re-requests everything on that page. In short, for IE6 and IE7, a <META> refresh will nullify any HTTP caching.


It’s best to see an example. Let’s say we have a page, start.html, which contains a <META> refresh tag that redirects to main.html. The <META> refresh tag looks like this: <META http-equiv="Refresh" content="0;main.html"> Let’s say main.html has 3 images on it. All of those images are served with a far future Expires header. This means repeat visitors should have all 3 images referenced by main.html cached. Here is what happens:

  • The visitor clicks a link to start.html.
  • start.html uses a <META> refresh to send the visitor to main.html.
  • Visitor’s IE browser fetches main.html.
  • Visitor’s IE browser does not use the cached images. Instead it sends 3 conditional GET requests to the web server for the 3 images with If-Modified-Since headers.

There were already several reasons not to use a <META> tag to perform a refresh. Zoompf Check #99 (one of the first checks we wrote) flags web pages that use a <META> tag for redirects. Originally we flagged META refreshes because they are a bloated and oversized solution, and because of all the problems <META> refreshes cause with web crawlers and accessibility. Zoompf’s remediation advice was to use an HTTP redirect and we flagged this as a low severity issue. In light of these IE performance problems, we have changed the severity to high (which is the same severity as not using caching at all).
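To make the idea concrete, here is a minimal sketch of how a page could be scanned for <META> refresh tags. It is an illustration only, not Zoompf’s actual check: the class and function names are hypothetical, and it relies solely on Python’s standard html.parser module.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects the content attribute of every <META http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("http-equiv", "").strip().lower() == "refresh":
            self.refreshes.append(attr_map.get("content", ""))


def find_meta_refreshes(html):
    """Return the content values of all META refresh tags found in html."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.refreshes
```

Running find_meta_refreshes on the start.html example above would return the single value "0;main.html", which is the cue to replace the tag with an HTTP redirect.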

Want to see what performance problems your website has? META Refresh Tag Used As Redirect is just one of the 300+ web performance issues Zoompf detects when scanning your web applications. Get your instant free web performance assessment at Zoompf.com today!

Useless Duplicate Cookies

In our last post, where we described the 300 issues Zoompf checks your website for during its web performance assessment, we said that the #1 way we discover new web performance issues is simply looking at web responses. This story is a perfect example of how that actually happens. Today (in fact, about 2 hours ago) we were helping a client optimize their site when we noticed a rather long HTTP Set-Cookie header. This is what we saw:

Now that is rather difficult to look at. So we cleaned up the code, trimmed out the expires and path information for each cookie declaration, and aligned each cookie name/value pair on its own line. This is the clean version:

Set-Cookie:
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],
cisession=a%3A4%3A%7Bs%3A10%3A%22session_id... [snip],

As you can see, the web application is setting the cisession cookie 9 separate times! And every time it gets set to the very same value. Now each distinct cookie name can only have one value. The web browser will use the last declaration. So this response needlessly sets the cookie 8 times. The original Set-Cookie header’s value was 3681 bytes long. But when you remove the first 8 cisession cookie declarations and instead only have 1 cisession cookie that size is reduced to 409 bytes, a reduction of 89%.

Well that’s a nice find. But then things got worse. This site used rotating cookie values where the value of the cookie changes on each and every page (this is often done in banking and e-commerce applications to mitigate session hijacking). In this case that meant every page generated by PHP had these 9 cookie declarations. By identifying and resolving this problem we helped the client take 3 kilobytes off every HTML response! Now that’s a really nice performance optimization!

Cause of the Issue

This client had an online store. To uniquely identify each visitor and provide them with a shopping cart the application code had to set a session identifier for the visitor. They had a single function which would verify the client had a session identifier and set the new appropriate value. This function was called 9 separate times in different parts of the code during page generation. However the function did not check to see if the session identifier had already been set for this cycle. It just appended on a new cookie declaration. So every time a page was generated, 9 cookie declarations would be added on to the HTTP response.
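One way to avoid this class of bug is to store pending cookies by name, so that a repeated call overwrites the earlier declaration instead of appending another one. The sketch below is hypothetical and is not the client’s code (their application was written in PHP); the Response class and its method names are assumptions made purely for illustration.

```python
class Response:
    """Hypothetical response object that keeps at most one declaration per cookie name."""

    def __init__(self):
        # Maps cookie name -> value. Setting a name again replaces the old value.
        self._cookies = {}

    def set_cookie(self, name, value):
        # Overwrite rather than append, so repeated calls cannot add declarations.
        self._cookies[name] = value

    def set_cookie_headers(self):
        """Return one Set-Cookie header line per distinct cookie name."""
        return [f"Set-Cookie: {name}={value}" for name, value in self._cookies.items()]
```

With this design, calling set_cookie("cisession", ...) 9 times during page generation still produces a single Set-Cookie line.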

This issue was hard to detect. Since the browser only uses the last declaration, HTTP requests back to the server only contain 1 cookie, not 9. For the same reason if you use a browser add-on to examine the stored cookies you will only see 1 cookie and not 9. In fact, we had to modify Zoompf’s code to detect this. The System.Net classes in Microsoft .NET were automatically collapsing the 9 redundant cookies into a single cookie. This means our code only saw one cookie as well.
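Because HTTP libraries may merge the duplicates, detection has to work on the raw header values. Here is a minimal sketch, assuming you can get a list of raw Set-Cookie values (one per cookie declaration) before any library collapses them. The function name is hypothetical and this is not Zoompf’s actual implementation.

```python
from collections import Counter


def duplicate_cookie_names(set_cookie_values):
    """Return {cookie name: count} for every name declared more than once.

    set_cookie_values is a list of raw Set-Cookie values, one per declaration.
    """
    names = Counter()
    for value in set_cookie_values:
        # The name=value pair is everything before the first ';'.
        pair = value.split(";", 1)[0]
        name = pair.split("=", 1)[0].strip()
        if name:
            names[name] += 1
    return {name: count for name, count in names.items() if count > 1}
```

For the response shown earlier, this would report cisession with a count of 9.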

One-off Issue or Plague?

We wanted to see how prevalent the issue of Duplicate Cookies is. So we wrote some quick code and then re-analyzed approximately 700 web performance scans we had already performed on other websites to see who else had the issue. We found that 16 other websites, or around 2.5% of the websites we had assessed, had this issue. While it is by no means as common an issue as, say, Images without any caching information (Check #172), we were surprised at how common the issue is. Spot checking those 16 websites shows the same fundamental issue: the same cookie getting set to the same value multiple times in a single HTTP response. Again, this is most likely caused by repeated execution of the same function or code path which sets the cookie value.

Since it is a fairly easy mistake to make and is not a one-off issue, we decided to promote this to a full fledged performance check. So we wrote Zoompf Check #316: Duplicate Cookies to detect this issue.

Want to see what performance problems your website has? Duplicate Cookies is just one of the 300+ web performance issues Zoompf detects when scanning your web applications. Get your instant free web performance assessment at Zoompf.com today!

March 5, 2010

Details of Zoompf Performance Issues

When talking with web developers and front end designers we almost always get asked these two questions: “Do you really have 300 Checks?” and “What performance issues does Zoompf look for?” (The answer to the first question is: No, we actually have 315. However not all of them are specifically for front-end performance issues). In this post we will detail what issues Zoompf detects while assessing your web applications and help illustrate how Zoompf’s deep and broad analysis compares with other front-end performance tools.

(Of course, you can see how awesome Zoompf’s performance analysis is right now! Anyone can receive a free performance scan of their website)

To understand and appreciate the scope of Zoompf’s analysis it is helpful to create categories of different performance issues. This way we can discuss the typical performance issues that people test websites for and showcase all the additional issues Zoompf detects. In our reports, Zoompf groups performance issues into 4 broad categories based on the desired goal of solving each problem. The four performance categories Zoompf uses are:

  • Reducing Response Size
  • Reducing Request Count
  • Maximizing Browser Performance
  • Server and Miscellaneous Issues

Let’s examine what these categories mean and list examples of the performance issues that fall into each category.

Reducing Response Size

Reducing response size is all about minimizing the number of bytes that have to be pushed down the network pipe to the client. Typical examples of issues in this category are things like using HTTP compression, minifying CSS and JavaScript files, and crunching images. Other tools tend to check only for these obvious issues.

Zoompf however goes further to find more bloated content and unnecessary data that can be removed to reduce the size of a web response. Zoompf not only has standard HTML minification checks (#38, #44) but also tells you about HTML content that should be completely removed such as: <TABLE> tags that are used for layout purposes (#111, #299, #301); unnecessary or redundant content such as <META> tags used for caching, character set info, or other meta data, and multiple page elements like DOCTYPES (#170, #302-305, #97); common style attributes or onX event attributes that can be consolidated into single declarations (#283, #28-#30); excessive ASP.NET ViewState (#212); and more.

We find ways to reduce the size of content in other types of files by finding issues like: unused CSS rules (#33); Flash or Silverlight applications that have not been compressed, or were compiled with debugging symbols, or contain uncrunched images, or aren’t using assembly caching (#148, #149, #231, #232, #256, #257); content that can be rezipped (#230); already compressed content using HTTP compression (#58); PNG8 candidate images (#284, #285); and more.

As of today Zoompf detects 103 different issues so you can optimize your web content to be as small as possible without sacrificing features or compatibility. Reducing response size provides a good improvement to page load times and a larger impact on operational resources like bandwidth and server load.

Reducing Request Count

Reducing request count issues are all about how to reduce the number of HTTP requests needed to render the page. Typical examples for this category include things like combining CSS or JavaScript files, CSS sprites, and using HTTP caching.

Again Zoompf goes beyond the status quo and detects even more ways that you can reduce the request count for a web page. This includes things like: hyperlinks and images that can be converted to client-side image maps (#185); server–side image maps (#169); wasteful redirects due to no trailing slash, or to a default page, or to the WWW/non-WWW or SSL version of the site (#129, #204, #247, #248); resources that should be cached by caching proxies but aren’t due to query strings, URL contents, or conflicting and misconfigured cache headers (#68, #191-#197, #36); news feeds that aren’t using caching, or blackout periods, or Last-modified support (#225-229, #233, #235); Style sheets that only import other style sheets (#264, #269); external JavaScript files with no executable content (#37); and many more.

As of today Zoompf checks for 70 different issues that will reduce the number of requests your web server must handle per page. These issues have an enormous effect on page load times and an equally massive effect on bandwidth consumption and network usage.

Maximizing Browser Performance

Maximizing browser performance is all about using correct features and organizing content to allow the browsers to render the page and execute the content as fast as possible. This includes obvious things like domain sharding and cookie-free domains, avoiding CSS expressions or AlphaImageLoader, and properly placing references to external JavaScript or CSS files.

Again Zoompf goes beyond typical front-end tools and detects issues such as: <SCRIPT> that blocks rendering (#286, #152); images or objects without dimensions (#237, #238, #262); <CANVAS> issues (#291); out-of-date and poorly performing JavaScript libraries like older jQuery or Google Analytics’ urchin.js (#272, #293); downgrades to HTTP/1.0 (#56); JavaScript code performance issues (#158, #160, #221, #222); premature persistent connection closure (#177, #178); and more.

Zoompf currently checks your web application for 41 different issues which decrease the performance of your visitors’ browser and which directly lead to slower page load times and application functionality.

Server and Miscellaneous Issues

The “Server and Miscellaneous Issues” category contains issues which reduce the performance of the web server or which waste the server’s resources. While front-end optimization techniques can offer enormous performance gains there are many easy to fix server issues that can be detected by simply crawling the web server. These include: missing or misconfigured Robots.txt files with suboptimal rules or no crawl-delay (#7, #289, #102, #91); broken or incorrect content given the way it was referenced (#40-43, #253-255); application specific issues like misconfigured server-side object caches like memcached or WP Super Cache, and CMS or PHP op-code caching systems (#239-#245); and a few more.

As of today Zoompf scans for 36 different issues that decrease the performance of the server or needlessly waste its resources.

Other Issues Zoompf Detects

There are 2 other categories of issues that Zoompf looks for. The first category consists of quality issues. Zoompf classifies quality issues as blatant errors or critical problems with your website’s functionality. We include quality issues because you really shouldn’t be trying to find and fix performance issues while your web application is fundamentally not working properly. We mentioned quality issues before in our post about supporting other languages. These quality checks detect problems such as web server errors (broken status codes, misconfigured modules, SSL certificate issues, etc); application tier issues (stack traces, framework exceptions, unexecuted server-side source code, etc); and database errors and exceptions. In all Zoompf detects 37 different quality issues. While Zoompf is not meant to replace QA testers we want to alert you to any critical or broken functionality on your website that we detect.

We call the final group “Prototype Checks.” These look for all different kinds of issues, such as obscure or developing performance issues, search engine optimization best practices, usability and accessibility issues, browser compatibility issues, and even web security issues. We have written these checks so we can gather different pieces of data or statistics about these issues in all the web assessments we do. We do not include any of these issues in our web performance reports or in any other public reports (though this might change). Some of these issues do get promoted to performance or quality checks, while others allow us to play with new ideas or technologies.

Follow the Leader

So now you know what performance issues Zoompf checks for and how we provide a richer and deeper analysis of your web application than other tools. In fact, other people are starting to take cues from Zoompf on new performance issues to include in their tools. When Google released version 1.6 of PageSpeed in February they added support for finding a new performance issue: Specify a Character Set Early. This is something we researched and blogged about late last year and Google even references Zoompf’s work as source material for including the new check!

Front-end performance is a very young and exciting space. We will continue to discover and publicize new techniques to optimize your website’s performance. It’s going to be a fun ride!

Experience the Matrix

Much like The Matrix, reading about what performance issues Zoompf detects is nothing compared to experiencing for yourself what performance issues Zoompf can detect. Right now you can go to Zoompf.com and get a free performance assessment of your website. What things will we find that other tools have missed? Think you have a fully optimized website that cannot possibly be improved? Try Zoompf’s free performance assessment today and find out!

March 4, 2010

Zoompf Check #300! Or: Gateway’s got a problem…

People often ask how we discover new performance issues. Without a doubt the #1 way we discover new issues is simply by looking at websites and seeing how they work. Not a customer engagement goes by where we don’t find at least one new web performance issue that we add to our growing database of web performance issues. This is why Zoompf has added over 150 performance checks since we went public with our offerings. In this blog post we are going to give you the back story around a cool performance problem we found. In fact, it was so interesting and impactful that it became the 300th check in our database of web performance issues!

Picture of a cork popping out of a bottle

The Strange Case of CSS Resources

We recently made a few changes to our CSS parser and we were testing the new features against a few dozen pages to ensure we hadn’t broken anything. While testing, we noticed something odd on Gateway’s website. All of the background images in one of the style sheets for gateway.com were served over an encrypted SSL connection. Specifically, the style sheet http://www.gateway.com/css/cms/styles.css is served over a plain unencrypted HTTP connection. However it includes links to background images like https://cdn.gateway.com/media/cphp/themes/default_bg.gif which are served over an encrypted SSL connection. What’s odd is the web server will serve those background images just fine if you request them using HTTP instead of HTTPS. It’s even weirder since very little of Gateway’s website even uses SSL. In fact, trying to access the root of Gateway’s website using HTTPS will redirect you to the HTTP version.

So, certainly something weird, but is this a performance issue? After all, the developers just added a few extra “s” characters to their CSS. So what if maybe a little bit of SSL gets used. What are the performance implications of that?

Actually, they are huge!

SSL Primer

HTTPS is HTTP traffic sent over an encrypted SSL tunnel. SSL is expensive for 2 reasons (documented here and here). The first is the SSL handshake, where the client and the server negotiate which protocols will be used and exchange keys. This involves the use of asymmetric key encryption which is extremely math heavy and quite slow. Then there is bulk data encryption, where the server uses symmetric key encryption. Symmetric key encryption is much faster than asymmetric key encryption, but can be much slower than sending unencrypted data. (How much slower is beyond the scope of this article. Suffice it to say that the performance impact of SSL is sufficiently large that there is an entire market for SSL acceleration products).

So what’s the impact in this situation? Well, the extremely expensive initial SSL negotiation must be done for the two initial connections to the web server. Each of these negotiations takes between 250 and 350 milliseconds. This is in addition to the overhead of making the TCP connection and the negative impact of TCP’s slow start and congestion control. Even reusing pre-negotiated keys after a connection close is expensive, typically costing between 60 and 100 milliseconds.

The Performance Impact

That doesn’t sound too bad but see how it affects Gateway. Examining this waterfall graph for www.gateway.com shows that 1.5 seconds, or over 25% of the page load time for Gateway.com, is due to the overhead of SSL! It’s hard to believe but the simple addition of 14 “s” characters to the page caused 1.5 seconds of delay! Furthermore Gateway’s web server is going to be working about twice as hard to encrypt and serve those resources! Server load, power, cooling, and availability are all adversely affected.

(Notice there are a lot of other problems with Gateway’s website that contributed to the damage the SSL issue caused. Specifically they were not using persistent connections, which caused a dozen of the smaller re-use negotiations to occur. Luckily Zoompf Check #177 and #178 detected all these closed persistent connections!).
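As a rough sanity check on that 1.5 second figure, the per-negotiation costs quoted earlier can simply be added up. This is a back-of-the-envelope sketch, not a reading of the waterfall graph: it assumes exactly 2 full handshakes and 12 resumed negotiations ("a dozen"), and it assumes none of them overlap in time, which real browsers do not guarantee. The function and constant names are hypothetical.

```python
# Cost ranges in milliseconds, taken from the figures quoted in this post.
FULL_HANDSHAKE_MS = (250, 350)
RESUMED_HANDSHAKE_MS = (60, 100)


def ssl_overhead_ms(full_handshakes, resumed_handshakes):
    """Return the (low, high) estimate of total SSL negotiation time in ms."""
    low = full_handshakes * FULL_HANDSHAKE_MS[0] + resumed_handshakes * RESUMED_HANDSHAKE_MS[0]
    high = full_handshakes * FULL_HANDSHAKE_MS[1] + resumed_handshakes * RESUMED_HANDSHAKE_MS[1]
    return low, high


# ssl_overhead_ms(2, 12) returns (1220, 1900)
```

Under those assumptions the estimate is between roughly 1.2 and 1.9 seconds, so the 1.5 seconds of SSL overhead is in the range you would expect.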

Wow. So how did this happen? It could have happened for a lot of reasons, all of them simple or innocent mistakes. It could have been a simple typo. It could have been left over from a redesign where that style sheet was used exclusively inside of the SSL portion of the site. It could be that this style sheet is used in both the SSL and non-SSL portions of the site, but to avoid a mixed content warning, the resources are referenced using SSL. The solution to this is simple. The style sheet should use protocol relative URLs to reference the images. That way if the style sheet is in the SSL portion of the website (and referenced using an https URL) the background images will be requested using HTTPS. And when the style sheet is in the non-SSL portion of the website (and referenced using an http URL) the background images will be requested using HTTP.
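Here is a minimal sketch of what that fix could look like as a quick script: it finds url(...) references in a style sheet that hard-code https and rewrites them to protocol relative form. The function names are hypothetical, the regular expression only covers the common url(...) syntax, and it assumes the host serves the same resource over both HTTP and HTTPS (as Gateway’s server did).

```python
import re

# Matches the start of a CSS url(...) reference that hard-codes the https scheme.
# Group 1 captures an optional opening quote so it can be preserved.
HTTPS_URL = re.compile(r"""url\(\s*(['"]?)https://""", re.IGNORECASE)


def count_https_urls(css):
    """Return how many url(...) references in css hard-code https."""
    return len(HTTPS_URL.findall(css))


def to_protocol_relative(css):
    """Rewrite url(https://host/...) references to url(//host/...)."""
    return HTTPS_URL.sub(r"url(\1//", css)
```

For example, a rule that references url(https://cdn.gateway.com/media/cphp/themes/default_bg.gif) becomes url(//cdn.gateway.com/media/cphp/themes/default_bg.gif), and the browser then picks the scheme of the style sheet that contains it.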

It is both cool and scary that such a simple issue can have huge performance implications. And unfortunately none of the free tools like YSlow or PageSpeed would have alerted you to the issue. Until now.

It is with great pleasure and pride that we announce Zoompf Check #300: SSL Resources on Non-SSL Page. That this is such a simple mistake to make and at the same time causes such a massively negative impact makes it worthy of being our 300th check.

Want to see what performance problems your website has? Finding SSL Resources on Non-SSL pages is just one of the 300 issues Zoompf detects when analyzing your web applications. Get your instant free mini web performance assessment at Zoompf.com today!