Bandwidth, Latency, and the “Size of your Pipe”
I talk about performance a lot, and more often than not people say something like "I have a really good Internet connection, so websites are fast for me." What was a little unsettling was when a company told me "Our datacenter has a really fat pipe, so our website is fast." My reply, delivered with a bit of a smile, is "The size of your pipe doesn’t guarantee a satisfying user experience."
These conversations reveal a fundamental misunderstanding about how data is transferred on the Internet. Technical and non-technical people alike confuse or conflate the separate concepts of bandwidth and latency. Understanding these two concepts is critical to understanding the core ideas behind front-end web performance and optimization techniques.
If you think of the Internet as a series of tubes, latency is the length of the tube between two points. Bandwidth is how wide the tube is. Indeed, it’s named bandwidth because it describes the width of the communications band. The wider the tube the more data you can send in parallel. The key point here that gets missed is that, regardless of how much data you are sending, you still have to move it the distance from point A to point B. That takes time and that is the latency.
To better illustrate how latency is not affected by bandwidth, I will use an example from Stuart Cheshire’s excellent 1996 essay, "It’s the Latency, Stupid." The physical distance from Boston, Massachusetts to Stanford University in California is 4320 kilometers. The speed of light traveling through a fiber optic cable is 200,000 km/s (compared to 300,000 km/s in a vacuum). So the time for a single photon of light to travel across a direct fiber optic connection from Boston to Stanford is 4320 km / 200,000 km/s = 21.6 milliseconds. The round-trip time to Boston and back is 43.2 milliseconds. These are fundamental laws of nature. You will never get something to travel faster than that. There will always be a delay of at least 43.2 ms when Boston communicates with Stanford.
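A quick back-of-the-envelope sketch in Python, using the distance and fiber speed above, confirms these numbers:

```python
# Propagation delay for light through fiber, Boston to Stanford.
distance_km = 4320          # physical distance, Boston to Stanford
fiber_speed_km_s = 200_000  # speed of light in fiber, roughly 2/3 of c

one_way_ms = distance_km / fiber_speed_km_s * 1000
round_trip_ms = one_way_ms * 2

print(f"one-way: {one_way_ms:.1f} ms")        # 21.6 ms
print(f"round trip: {round_trip_ms:.1f} ms")  # 43.2 ms
```

No amount of bandwidth appears anywhere in this calculation; the delay depends only on distance and the speed of the signal.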
In practice, however, the latency delay is more than 43.2 ms. This is because we do not have a single continuous piece of fiber optic cable connecting Boston and Stanford. Instead, the path goes through several segments along the way where dozens of pieces of networking equipment add to the delay. Regardless of how big and powerful that Cisco router is, it is not functioning at the speed of light! Typically, the Boston to Stanford route is on the order of 75-85 milliseconds.
Remember, we have yet to talk about bandwidth, or connection speeds. You need to wrap your head around the fact that to send even a single bit of data you will always have a delay, caused by latency, while the data travels the physical distance to its destination. Having a 25 Mbps connection does not somehow allow a single bit of data to travel that distance any faster. A large bandwidth connection simply allows you to send or receive more data in parallel. The data still needs to travel to and from your computer. The diagram below illustrates this concept:
Here we see two connections: one with low bandwidth and one with high bandwidth. When transmitting a JPEG across both connections, the latency for the first byte of data to travel from the source to the destination is the same. The high bandwidth connection downloads the file faster than the low bandwidth connection because more data can travel in parallel. The transmission is faster, but the latency is still there.
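The two cases in the diagram can be sketched with a simple model. The connection speeds and the 80 ms latency below are hypothetical, and real transfers add more overhead, but the model shows how both connections pay the same fixed latency cost:

```python
def transfer_time_ms(size_kb, latency_ms, bandwidth_mbps):
    """Total time for a file to arrive: fixed latency plus time on the wire."""
    size_bits = size_kb * 1024 * 8  # treating 1 KB as 1024 bytes
    wire_ms = size_bits / (bandwidth_mbps * 1_000_000) * 1000
    return latency_ms + wire_ms

# Same 100 KB JPEG, same 80 ms latency, different bandwidth:
slow = transfer_time_ms(100, 80, 5)    # hypothetical 5 Mbps connection
fast = transfer_time_ms(100, 80, 100)  # hypothetical 100 Mbps connection

print(f"low bandwidth:  {slow:.1f} ms")   # 243.8 ms
print(f"high bandwidth: {fast:.1f} ms")   # 88.2 ms
```

Twenty times the bandwidth shrinks the wire time from roughly 164 ms to 8 ms, but the 80 ms latency term is untouched.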
In the late 1990s and early 2000s, the impact of latency was less visible because personal Internet connections were quite slow compared to today. The latency was masked because the delay in sending a request and waiting for a response was much smaller than the total time it took to download the response. However, that is no longer the case. It is not uncommon for a browser requesting a small image to wait 100-150 milliseconds before spending 5 milliseconds to download the image contents. That means latency accounts for over 95% of the total time to request and download the resource. This is tremendously inefficient!
So why should a front-end performance advocate care about latency? Because it’s the basis for many of your performance optimizations! In fact, the entire class of "reduce the number of HTTP requests" optimizations are all about avoiding latency.
Imagine downloading ten 10-kilobyte files. The total download time would be 10a + 10b, where a is the delay due to latency and b is the amount of time it takes to download one 10-kilobyte file. Now consider downloading a single 100-kilobyte file. The total download time would be a + 10b, where a is the overhead of a single request and response and 10b is the amount of time spent downloading the 100 kilobytes. In this example, the time difference between downloading ten 10-kilobyte files and a single 100-kilobyte file is 9a. If we were transmitting these files using HTTP between Stanford and Boston, that would be 85 ms * 9 = 765 milliseconds, or over 3/4 of a second saved! And that’s a best case scenario. We haven’t even started talking about congestion, transmission errors, QoS throttling, packets taking separate paths, or any of the other things that introduce even more latency. And we haven’t even begun to talk about CDNs and latency. I’ll save that for another post.
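The arithmetic above can be sketched directly. The value of b below is a hypothetical stand-in; notice that it cancels out of the savings entirely, which is the whole point:

```python
a = 85  # latency overhead per HTTP request/response, ms (Boston-Stanford)
b = 5   # time to download one 10 KB chunk, ms (hypothetical value)

ten_small_files = 10 * a + 10 * b  # ten separate requests
one_big_file = a + 10 * b          # one combined request

savings = ten_small_files - one_big_file
print(savings)  # 9 * a = 765 ms
```

The same number of bytes moves across the wire either way; consolidating requests saves time purely by paying the latency cost once instead of ten times.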
Remember, latency is the amount of time required to travel the path from one location to another. Bandwidth is how much data can be moved in parallel along that path.