I talk about performance a lot, and more often than not people say something like "I have a really good Internet connection, so websites are fast for me." What was a little unsettling was when a company told me, "Our datacenter has a really fat pipe, so our website is fast." My reply, delivered with a bit of a smile, is "The size of your pipe doesn’t guarantee a satisfying user experience."
These conversations show that there is a fundamental misunderstanding about how data is transferred on the Internet. Technical and non-technical people alike confuse or conflate the separate concepts of bandwidth and latency. Understanding these two concepts is critical to understanding the core ideas behind front-end web performance and optimization techniques.

If you think of the Internet as a series of tubes, latency is the length of the tube between two points. Bandwidth is how wide the tube is. Indeed, it’s named bandwidth because it describes the width of the communications band. The wider the tube the more data you can send in parallel. The key point here that gets missed is that, regardless of how much data you are sending, you still have to move it the distance from point A to point B. That takes time and that is the latency.
To better illustrate how latency is not affected by bandwidth, I will use an example from Stuart Cheshire’s excellent 1996 essay, "It’s the Latency, Stupid." The physical distance from Boston, Massachusetts to Stanford University in California is 4320 kilometers. The speed of light traveling through a fiber optic data cable is roughly 200,000 km/s (compared to 300,000 km/s in a vacuum). So the time for a single photon of light to travel across a direct fiber optic connection from Boston to Stanford is 4320 km / 200,000 km/s = 21.6 milliseconds. The round-trip time from Boston to Stanford and back is 43.2 milliseconds. These are fundamental laws of nature. You will never get something to travel faster than that. There will always be a delay of at least 43.2 ms when Boston communicates with Stanford.
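For readers who want to try other distances, here is a minimal Python sketch of that arithmetic. The function names are my own for illustration, and the constants are simply the figures quoted above; real fiber routes are longer than the straight-line distance, so treat the result as a lower bound.

```python
# Figures quoted in the text above; both are approximations.
DISTANCE_KM = 4320            # Boston to Stanford
FIBER_SPEED_KM_S = 200_000    # speed of light in fiber optic cable

def one_way_delay_ms(distance_km, speed_km_s=FIBER_SPEED_KM_S):
    """Minimum one-way propagation delay in milliseconds."""
    return distance_km / speed_km_s * 1000

def round_trip_ms(distance_km, speed_km_s=FIBER_SPEED_KM_S):
    """Minimum round-trip delay: there and back again."""
    return 2 * one_way_delay_ms(distance_km, speed_km_s)

print(f"one way:    {one_way_delay_ms(DISTANCE_KM):.1f} ms")
print(f"round trip: {round_trip_ms(DISTANCE_KM):.1f} ms")
```

Notice that bandwidth does not appear anywhere in the calculation.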
In practice, however, the delay is more than 43.2 ms. This is because we do not have a single continuous piece of fiber optic cable connecting Boston and Stanford. Instead, the path goes through several segments along the way where dozens of pieces of networking equipment add to the delay. Regardless of how big and powerful that Cisco router is, it is not functioning at the speed of light! Typically, the Boston to Stanford route is on the order of 75-85 milliseconds.
Remember, we have yet to talk about bandwidth, or connection speeds. You need to wrap your head around the fact that to send a single bit of data you will always have a delay caused by latency for the data to travel the physical distance to its destination. Having a 25 Mbps connection does not somehow allow a single bit of data to travel that distance any faster. A large bandwidth connection simply allows you to send or receive more data in parallel. The data still needs to travel to and from your computer. The diagram below illustrates this concept:

Here we see two connections: one which is low bandwidth and one which is high bandwidth. When transmitting a JPEG across both connections, the latency for the first byte of data to travel from the source to the destination is the same. The high bandwidth connection downloads the file faster than the low bandwidth connection because more data can travel in parallel. So the transmission is faster, but the latency is still there.
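The same idea can be sketched as a toy model in Python: total time is the latency plus the time needed to push all the bits through the pipe. The file size, the 40 ms latency, and the 1.5 Mbps connection are hypothetical values I picked for illustration, and the model ignores protocol overhead entirely.

```python
def transfer_time_ms(size_bytes, bandwidth_mbps, latency_ms):
    """Time until the last byte arrives: latency plus time to send the bits."""
    bits = size_bytes * 8
    send_ms = bits / (bandwidth_mbps * 1_000_000) * 1000
    return latency_ms + send_ms

JPEG_BYTES = 100_000   # hypothetical 100 KB image
LATENCY_MS = 40        # hypothetical one-way latency, same for both pipes

for mbps in (1.5, 25):
    total = transfer_time_ms(JPEG_BYTES, mbps, LATENCY_MS)
    print(f"{mbps:>4} Mbps: first byte after {LATENCY_MS} ms, "
          f"last byte after {total:.1f} ms")
```

The wider pipe finishes sooner, but the first byte shows up at the same moment on both connections.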
In the late 1990s and early 2000s, the impact of latency was less visible because personal internet connections were quite slow compared to today. The latency was masked because the delay in sending a request and waiting for a response was much smaller than the total time it took to download all of the response. However, that is no longer the case. It is not uncommon for a browser requesting a small image to wait 100-150 milliseconds before spending 5 milliseconds to download the image contents. This means that latency accounts for roughly 95% or more of the total time to request and download the resource. This is tremendously inefficient!
So why should a front-end performance advocate care about latency? Because it’s the basis for many of your performance optimizations! In fact, the entire class of "reduce the number of HTTP requests" optimizations is all about avoiding latency.
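To see how lopsided that split is, here is a small Python sketch that computes the fraction of total time spent waiting, using the wait and download figures from the example above. The function name is my own, and the exact share will of course depend on the numbers you plug in.

```python
def latency_share(wait_ms, download_ms):
    """Fraction of the total request time spent waiting on latency."""
    return wait_ms / (wait_ms + download_ms)

DOWNLOAD_MS = 5  # time spent actually downloading the small image

for wait_ms in (100, 150):
    share = latency_share(wait_ms, DOWNLOAD_MS)
    print(f"{wait_ms} ms wait + {DOWNLOAD_MS} ms download: "
          f"{share:.1%} of the time is latency")
```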
Imagine downloading ten 10 kilobyte files. The total download time would be 10a + 10b, where a is the delay due to latency and b is the amount of time it takes to download one 10 kilobyte file. Now consider downloading a single 100 kilobyte file. The total download time would be a + 10b. Here a is the overhead of a single request and response, while 10b is the amount of time spent downloading the 100 kilobyte file. In this example, the time difference between downloading ten 10 kilobyte files and a single 100 kilobyte file is 9a. If we were transmitting these files using HTTP between Stanford and Boston, that would be 85 ms * 9 = 765 milliseconds, or over 3/4 of a second saved! And that’s a best case scenario. We haven’t even started talking about congestion, transmission errors, QoS throttling, taking separate paths, or any of the other things that introduce even more latency. And we haven’t even begun to talk about CDNs and latency. I’ll save that for another post.
Remember, latency is the amount of time required to travel the path from one location to another. Bandwidth is how much data can be moved in parallel along that path.
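Here is a minimal Python sketch of the 10a + 10b versus a + 10b comparison. It assumes the requests happen one after another on a single connection, so the latency is paid once per request. The 5 ms value for b is a hypothetical number chosen for illustration; the 85 ms latency is the Boston to Stanford figure from above.

```python
def sequential_time_ms(num_requests, latency_ms, download_ms_each):
    """Total time when requests run one at a time on a single connection."""
    return num_requests * (latency_ms + download_ms_each)

A_MS = 85   # latency per request (a), Boston to Stanford
B_MS = 5    # hypothetical time to download one 10 KB file (b)

ten_small = sequential_time_ms(10, A_MS, B_MS)      # 10a + 10b
one_large = sequential_time_ms(1, A_MS, 10 * B_MS)  # a + 10b

print(f"ten 10 KB files: {ten_small} ms")
print(f"one 100 KB file: {one_large} ms")
print(f"saved:           {ten_small - one_large} ms")  # 9a
```

Whatever value you pick for b, it cancels out of the difference, which is always 9a.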


Hi Billy, nice article! I agree about the common misconception that fat pipe == fast connection. Still, the statement that you will save '3/4 of a second' is in most cases not true. Sure, there is latency for every image you download, but if you download them in parallel, then you still won’t have a higher overall latency than when you downloaded only one file. In other words: if the 10a occur simultaneously, you will still have to wait only 1a, the same as with the combined larger file.
Willem,
Great point. I should have been clearer in the article. To keep things simple, the 10a vs. 1a figures apply to a single HTTP connection. In other words, if you have a single connection to the web server (and thus parallel downloading is not possible), the latency is additive. You are right though, browsers will download anywhere from 2-6 resources in parallel from a single site, so the benefit is less than 3/4 of a second.
Perhaps you could edit this line: “over 3/4 of a second saved! And that’s a best case scenario. We haven’t even started talking about congestion…” and add that you only gain 3/4 of a second if you originally used only one HTTP connection? Again, still a great article.
good starter article, thanks!
for folks looking for more reading: it is impossible to understand how long something will take without discussing parallelism (including domain sharding multipliers), CWND and IW (even in non-congested networks), pipelines, TCP handshakes, DNS, SSL handshakes (and their resumed varieties not to mention false start enabled ones), what SPDY can do for helping parallelism, and CWND-throttle.
You’re so right that there are lots of balls in the air before even considering the bandwidth constraints. And latency (i.e., RTT) is generally only getting worse because of the growth of mobile connectivity and the fact that radios run much slower than the speed of light, so this is an important topic.
Had to laugh at the big pipe/user experience analogy!
[...] user experience” Posted on February 12, 2012 This is how Billy Hoffman opens his article on bandwidth and latency. The article focuses on the perception of bandwidth in connection with [...]