html5shiv and Serving Content From Code Repositories
Today I want to focus on a 3rd party library which wasn’t using compression because it leads to an important and often unmentioned performance lesson: Don’t link to resources inside of source code repositories.
html5shiv is a small JavaScript library that lets old versions of Internet Explorer recognize and style the new HTML5 semantic elements, such as <header>, <section>, and <aside>. This is incredibly helpful because it allows websites to use these new semantic elements while retaining backwards compatibility. While html5shiv is open source and free to be copied and used anywhere, websites often link to a specific URL:
In fact, the html5shiv article on Wikipedia includes the following source code snippet, which references this URL.
<!--[if lt IE 9]> <script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script> <![endif]-->
I would estimate that 95% of the time I see a site using html5shiv, it is loading the file from
http://html5shiv.googlecode.com/svn/trunk/html5.js. Unfortunately, this is bad for performance for several reasons.
It’s Not Served Using HTTP Compression.
The reason I noticed this in the first place is that html5.js is served without any HTTP compression, even though gzipping it would cut its size by more than two-thirds.
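To get a feel for what compression buys on a file of this size, here is a minimal sketch. The payload is a stand-in roughly the size of the unminified html5shiv (the real savings depend on the actual file contents, and the article's figures come from the real file, not this demo):

```python
import gzip

# A repetitive, JS-flavored stand-in payload, sized near the ~3854 bytes
# the article cites for the unminified html5.js. This is an illustration,
# not the real library.
payload = b"document.createElement('header');" * 120

compressed = gzip.compress(payload)

print(len(payload), "bytes uncompressed")
print(len(compressed), "bytes gzipped")
```

Any server worth its salt can do this transparently with gzip or deflate, and every browser in wide use advertises support for it via the Accept-Encoding request header.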
It’s Not Cached.
Google’s web server instructs the browser (and all shared caches) to only cache this file for 180 seconds. 180 seconds! By the time you are done reading this article, a copy of html5shiv sitting in your cache would have expired.
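To put that 180-second lifetime in perspective, here is a small sketch. The max-age value is the one described above; the parsing helper is my own illustration, not anything Google's server provides:

```python
import re

def max_age(cache_control):
    """Pull the freshness lifetime, in seconds, out of a Cache-Control
    header value. Returns None if no max-age directive is present."""
    m = re.search(r"max-age=(\d+)", cache_control)
    return int(m.group(1)) if m else None

# The lifetime Google's server hands out for html5.js:
lifetime = max_age("public, max-age=180")
print("cacheable for", lifetime, "seconds")

# Compare with a far-future lifetime of one year:
one_year = 365 * 24 * 60 * 60
print("a far-future policy would allow", one_year, "seconds")
```

Three minutes versus a year: the file gets re-requested on virtually every visit instead of being fetched once and kept.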
It’s Not the Most Recent Version.
This branch of html5shiv is not actively developed. As Paul Irish mentions, development for html5shiv has shifted to github, under the care of Alexander Farkas and others. You can find the most recent version here:
(And no, don’t link to that either! It’s not minified, and the same fundamental problem applies: it’s still a file inside someone else’s source code repository.)
This new version of the html5shiv has had numerous bug fixes and performance improvements. In addition, if you actually minified this new version of the html5shiv, you’d find it’s also smaller than the older one on Google Code. In short, the code at
html5shiv.googlecode.com is obsolete.
It Says It Supports Range Requests When It Really Doesn’t.
Since we are on the subject, I might as well bring out all the issues. When you request
html5shiv.js, the web server responds with an
Accept-Ranges header, indicating that it supports HTTP byte ranges (also known as partial responses), as shown in the screenshot below:
This header means the server claims to support range requests, which let a client download just a small portion of the file. This is helpful when the client has already downloaded part of the response, such as when recovering from a network error. As I wrote in a previous post, supporting partial responses is good for performance. Only
googlecode.com doesn’t support partial responses. I used REDBot (which is awesome) to detect this issue, and you can see the results here. If you send a request and specify a byte range using the
Range header, the web server still gives you the full response.
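For contrast, here is a sketch of what a server that actually honors Range requests does: parse the bytes=start-end value and return only that slice with a 206 Partial Content status. The helper and its behavior are my own illustration of the spec'd behavior, which is exactly what googlecode.com fails to do:

```python
import re

def serve_range(body, range_header):
    """Return (status, bytes) for a request with the given Range header.
    Handles only the simple single-range form 'bytes=start-end'."""
    m = re.fullmatch(r"bytes=(\d+)-(\d+)", range_header)
    if not m:
        return 200, body  # no usable range: full response
    start, end = int(m.group(1)), int(m.group(2))
    return 206, body[start:end + 1]  # 206 Partial Content; end is inclusive

body = b"x" * 3854  # the size the article quotes for html5.js
status, chunk = serve_range(body, "bytes=0-499")
print(status, len(chunk))  # 206 500
```

A server advertising Accept-Ranges but then ignoring the Range header, as googlecode.com does, forces clients to re-download the entire file even when they only need the tail end of it.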
The Biggest Problem of All
While these are important issues, they are really just a symptom of a much bigger problem: You’re linking to a source code repository.
Think about that for a second. You are directly linking to a resource in an external source code repository, which you do not control! Giant sirens and flashing lights should be going off in your head.
First of all, because this is a source code repository, the contents could be updated at any time. This is why there is not a far-future
Expires header. To function properly, this URL can never have a far future
Expires header. This is most likely the reason that byte range requests fail as well. The content may change, and, even though
ETags could be used, the web server may not be able to determine whether it can return a subset of the (potentially new) response body. So, by the very nature of being a code repo, you have two performance optimizations that simply can never happen.
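For contrast, a self-hosted, versioned copy of the file could safely send exactly the headers a repo checkout never can. A minimal sketch of generating them (the one-year lifetime is my own example value, not something the article prescribes):

```python
import time
from email.utils import formatdate

# A versioned, immutable asset can promise it will never change,
# so a far-future Expires and a long max-age are safe.
ONE_YEAR = 365 * 24 * 60 * 60
expires = formatdate(time.time() + ONE_YEAR, usegmt=True)

print("Expires:", expires)
print("Cache-Control: public, max-age=%d" % ONE_YEAR)
```

The point is not the specific syntax; it is that only content you control, at a URL whose contents are frozen, can make this promise at all.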
Secondly, this is a source code repository! How many times have you accidentally checked in code to a repository that broke the build? I’ve done it lots of times. In fact, it’s so common that many departments have funny forms of punishment for a developer who breaks the build, like forcing them to wear an embarrassing hat or shooting them with automatic NERF machine guns (Skynet is nigh!).
(And the real reason this is an embarrassing hat? Comic Sans MS.)
Marco Arment, the creator of Instapaper, has a great podcast, Build and Analyze, over at 5by5. Sometimes he and Dan even talk about development, so I highly recommend it. One of the things he has advocated repeatedly on the show (including the most recent episode) is to always avoid external dependencies you cannot control. Directly linking to a resource in someone else’s code repository is a perfect example of an external dependency that can easily be, and should be, avoided.
Finally, at the end of the day, we are dealing with a library that is 3854 bytes in length. The new version is 2337 bytes. When served with compression it’s only 1166 bytes. Why on earth are you linking to a 3rd party to serve 1166 bytes? In the time it takes your browser to do the DNS lookup on the
html5shiv.googlecode.com hostname, you could have already transmitted the file! Creating an HTTP connection to a 3rd party to download 1166 bytes is just silly and wasteful.
The Moral of the Story
The moral of the story is: “never never never link to a resource inside a source code repository”. This includes your own source code repository and 3rd party ones like googlecode or github. The characteristics inherent in a source code repository are fundamentally at odds with frontend performance best practices. Breaking code in a source code repository or version control system is much easier due to its dynamic nature. Additionally, you should not be exposing internal infrastructure systems like code repositories, version control systems, or continuous integration systems to the public internet. This is highly dangerous and increases your attack surface.
The Moral of the Story (part 2)
The other moral of the story is one I’ve written about before: “Trust, but verify”.
Top search results will tell you to link to the copy of
html5shiv.js hosted on Google’s servers. Wikipedia includes a code snippet which links to this file. And you might not think that’s a bad idea. After all, this is Google we are talking about! “Let’s make the Web Faster” Google! The creator of awesome performance technologies like PageSpeed and mod_pagespeed and SPDY. The employer of frontend performance greats like Steve Souders, Patrick Meenan, and Mike Belshe. That Google should know what they are talking about.
Believe nothing, no matter where you read it, or who has said it, not even if I have said it, unless it agrees with your own reason and your own common sense.
You can trust, but verify.