Zoompf's Web Performance Blog

Note: Archived Content

This is the archived version of the Zoompf blog. Since our acquisition by Rigor, all our new research and posts on web performance are being published on The Rigor Blog.

From Alert Email to Backup Fail to Performance Problem

 Billy Hoffman on July 21, 2015. Category: Zoompf Alerts

If you got an alert email from your performance monitoring solution telling you your site was now loading 600 ms slower, what would you do? What is your process for figuring out what happened and what you need to do to fix it?


This is the story of how one of our customers came to use Zoompf. And it all started with an alert email from Pingdom to their development team. The alert email notified them that the page load time for their website had increased by around 600 ms. The lead engineer (name changed to protect the innocent, let’s call him “Han”) was on duty that weekend, so it was his job to figure it out.

The first thing Han checked was the backend services. Perhaps there was a database query or a code path that was taking longer than normal. Han’s company used an agile development process and published new versions of the site several times a week. Maybe something bad slipped past QA. However, the application tier and database tier were operating normally and there was nothing in the logs. To be sure, Han looked through the commit log in source control. None of the changes stood out as something that would affect page load time significantly.

Next, Han checked the web servers. CPU load was normal. Traffic levels were normal. However, the network I/O measurements were higher than normal. In fact, the web server cluster was transmitting about 35% more data than normal. That’s strange. Why were traffic levels the same when the bytes out had increased significantly? It was this key piece of data that led Han to the problem.
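Han's observation comes down to simple arithmetic: if the number of requests is flat but the bytes sent go up, the average response has grown. Here is a minimal Python sketch of that check. The function names, the sample byte and request counts, and the 20% threshold are all illustrative assumptions, not figures from the customer's monitoring setup; only the 35% increase comes from the story.

```python
def bytes_per_request(bytes_out: int, requests: int) -> float:
    """Average response size over one measurement window."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return bytes_out / requests


def payload_growth(baseline_bytes: int, baseline_requests: int,
                   current_bytes: int, current_requests: int) -> float:
    """Fractional change in average response size versus the baseline window."""
    base = bytes_per_request(baseline_bytes, baseline_requests)
    current = bytes_per_request(current_bytes, current_requests)
    return (current - base) / base


def looks_like_payload_regression(growth: float, threshold: float = 0.20) -> bool:
    """Flag windows where responses grew by more than the (assumed) threshold."""
    return growth > threshold


# Hypothetical numbers: the same request count, but 35% more bytes out.
growth = payload_growth(
    baseline_bytes=1_000_000_000, baseline_requests=100_000,
    current_bytes=1_350_000_000, current_requests=100_000,
)
flagged = looks_like_payload_regression(growth)
```

Normalizing by request count is what separates "we got more visitors" from "each response got bigger", which is the distinction that pointed Han away from the backend.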

It turns out that earlier that day, there had been a hard drive failure on one of the nginx servers that handled static content. The IT/Ops team provisioned a new drive and restored from backup. After some digging and research, Han was able to determine that the restored web server configuration file was actually out of date. It was an older version whose HTTP compression settings were not optimized, so the web site was serving all its CSS, JavaScript, and JSON responses without HTTP compression. This is why the bandwidth usage was up, way up. Since more content had to be transmitted, it took longer to download and render the page, which is what was slowing down the page load and what ultimately triggered the alert email Han received.
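A missing compression setting like this is easy to spot from the outside: request an asset while advertising gzip support, then look at the `Content-Encoding` response header. The sketch below uses only Python's standard library. The helper names and the set of encodings treated as "compressed" are assumptions made for illustration; this is not how Han or Zoompf actually performed the check.

```python
import urllib.request

# Encodings treated as "compressed" in this sketch (an assumption).
COMPRESSED_ENCODINGS = {"gzip", "deflate", "br"}


def is_compressed(headers) -> bool:
    """True if the Content-Encoding header names a known compression scheme."""
    value = ""
    for name, header_value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "content-encoding":
            value = header_value
    tokens = {token.strip().lower() for token in value.split(",") if token.strip()}
    return bool(tokens & COMPRESSED_ENCODINGS)


def fetch_headers(url: str, timeout: float = 10.0) -> dict:
    """Fetch a URL while advertising compression support; return response headers."""
    request = urllib.request.Request(
        url, headers={"Accept-Encoding": "gzip, deflate"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())


def uncompressed_assets(urls):
    """Return the URLs whose responses came back without compression."""
    return [url for url in urls if not is_compressed(fetch_headers(url))]
```

Running `uncompressed_assets` against a handful of the site's CSS, JavaScript, and JSON URLs would list the ones being served uncompressed. Sending the `Accept-Encoding` header matters, because a server is only expected to compress when the client says it can handle it.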

Within a few minutes, Han updated the config file, and the site began serving static assets with HTTP compression again. Page load time dropped back to normal levels, and the site was fast again.

This was a perfect example of how something beyond your control, like IT restoring from a backup, can impact your website’s performance. How would IT know if their change impacted performance? If you get an alert, how do you figure out the source of the problem? Han did an admirable job in a tough situation, but it still took several hours to find and fix the problem. You and your team may be performance experts, but you have better things to do with your time. And it’s hard to do manual work efficiently under the pressure of a site that has just become slow.

This is where Zoompf can help. Zoompf automates the analysis of your website to tell you the specific problems that are affecting a page. Even better, our free Zoompf Alerts product continuously scans your website throughout the day, looking for specific front-end performance issues, and alerts you when new problems are introduced. If Han had been using Zoompf Alerts, he would have received an email as soon as the old server configuration had been deployed. Sounds like something that could help you? You can join Zoompf Alerts now for free!

