Using real life collaboration to explain HTTP Caching
Do you still “collaborate” with your team or project team using individual Microsoft Word and Excel files? Let’s hope not, but if you still do, or you remember when you did, then you already have a solid understanding of how HTTP caching – done correctly – can dramatically, and simply improve your website’s performance.
Here’s how we used to collaborate on a document or spreadsheet:
- Me (project manager): Do you have the latest spreadsheet for this mornings meeting?
- You (team member): Uh, let me see. Mine’s dated May 17th. Is that the latest one?
- Me: No, it’s been updated since then.
- You: OK, thanks. Can you email me the latest copy?
- Me: Sure, is your email address still firstname.lastname@example.org?
- You: Yes, that’s correct.
- Me: OK, sending it over now.
- You: OK, thanks.
- Me: Did you get the file?
- You: No, not yet. Are you sure you sent it to the right email address? email@example.com?
- Me: Yes, that’s the address I used.
- You: I still don’t have it. I wonder what’s wrong.
- Me: Check your spam folder. Sometimes attachments get marked as spam.
- You: Nope, not in spam. Wait, here it comes. OK, I got the email. Now I’m downloading the file. Wow, that file is big, and my connection seems a little flaky at the moment. Says “5 minutes remaining.”
- [five minutes later]
- You: Yup! There it is. OK, I’ve got it. It says May 18th. Can I still use that?
- Me: Yes!
- Me: Great, we can start the meeting now.
- [after the meeting] You: Hey, can I have the spreadsheet for the meeting this afternoon?
- Me: Yes! You already have it. It’s the same one. Just keep using that one!
OMG. Ouch. So painful, but that’s how we used to do it, before everything was “in the cloud” (**waves both hands around and speaks in a mystical voice**). Today, it’s much different, since we can share access to a Google Doc that everyone can see in their browser, and that’s always the latest version. The first conversation above simply does not have to happen.
Nor do we have to experience this pain on our websites if we implement proper HTTP caching. Here’s a layman’s version of how HTTP caching works, using a similar (but far less painful) conversation:
- You: Can I please get a copy of that file?
- Me: Yes, here’s the file. It’s valid until tomorrow at noon.
- You: Great, thanks. I’ll see you at the morning and afternoon meetings, and I know I can just use the same one.
- Me: Correct.
Much better! Now let’s apply this conversation to HTTP caching. First, it’s the user’s (client) browser that does the act of caching, and it’s the server that determines (a) if it will allow caching and (b) how that caching will work.
Instead, the server sends over what we call an
Cache-Control header, which tells the user’s browser “here’s how long that file is valid. You don’t need to download it again until after that date & time”. Then the user’s browser can just quickly check the header when it needs that file. If the file is expired, then the browser asks “Is this the latest file version?” If the server’s answer is “yes”, then the browser still doesn’t have to download the file. If the server’s answer is “no”, then the browser needs to download the file again.
The beautify of caching is that, as long as a file has not expired, the browser never has to check with the server to make sure it is still OK to use. This “conversation” never needs to happen! This means that HTTP caching can eliminate the need to request and download files from your web server, speeding up the user’s experience on your website.
And that’s what web performance is all about: making the user experience (aka UX) better through faster performance. However, just “turning on” HTTP caching to cause problems where client use stale content. Ideally, HTTP caching has to be designed fin rom the beginning, and written into the software code. For example, your code could automatically embed time-date stamps in the URL for a resource, allowing you to use HTTP caching without worrying about client using stale content. Performance, including the use of HTTP caching, is at its best when performance is part of the design process.