From Alert Email to Backup Fail to Performance Problem
If you got an alert email from your performance monitoring solution telling you your site was now loading 600 ms slower, what would you do? What is your process for figuring out what happened and what you need to do to fix it?
This is the story of how one of our customers came to use Zoompf. And it all started with an alert email from Pingdom to their development team. The alert email notified them that the page load time for their website had increased by around 600 ms. The lead engineer (name changed to protect the innocent, let’s call him “Han”) was on duty that weekend, so it was his job to figure it out.
The first thing Han checked was the backend services. Perhaps there was a database query or a code path that was taking longer than normal. Han’s company used an agile development process and published new versions of the site several times a week. Maybe something bad slipped past QA. However, the application tier and database tier were operating normally and there was nothing in the logs. To be sure, Han looked through the commit log in source control. None of the changes stood out as something that would affect page load time significantly.
Next, Han checked the web servers. CPU load was normal. Traffic levels were normal. However, the network I/O measurements were higher than normal. In fact, the web server cluster was transmitting about 35% more data than normal. That’s strange. Why were traffic levels the same when the bytes out had increased significantly? It was this key piece of data that led Han to the problem.
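The comparison Han made by eye (same traffic, more bytes out) can be sketched as a small check on average response size. This is a minimal illustration, not Han's actual tooling: the function names, the 20% threshold, and the sample numbers in the usage note are all assumptions chosen for the example.

```python
def bytes_per_request(total_bytes_out: int, request_count: int) -> float:
    """Average response size over one measurement window."""
    if request_count <= 0:
        raise ValueError("request_count must be positive")
    return total_bytes_out / request_count


def payload_growth(baseline: float, current: float) -> float:
    """Fractional change in bytes per request relative to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline


def looks_like_payload_regression(growth: float, threshold: float = 0.2) -> bool:
    """Flag the window when responses grew by more than `threshold`.

    The 0.2 default is an arbitrary illustrative value, not a recommendation.
    """
    return growth > threshold
```

With hypothetical numbers, a window of 1,000 requests that sent 1,000,000 bytes last week and 1,350,000 bytes today gives a growth of 0.35, which would be flagged. Dividing by request count is what separates "we are serving more visitors" from "each response got bigger".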
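The cause turned out to be the web server configuration. IT had restored from a backup, and the old server configuration that came back with it did not enable HTTP compression for static assets. Within a few minutes, Han updated the config file, and the site began serving static assets with HTTP compression again. Page load time dropped back to its normal level, and the site was fast again.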
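One way to confirm that a static asset is being served with HTTP compression is to request it with an Accept-Encoding header and look at the Content-Encoding header of the response. The sketch below uses only the Python standard library. The helper names are hypothetical, and the set of encodings treated as "compressed" is an assumption for the example.

```python
import urllib.request
from typing import Optional

# Content-Encoding values treated as compressed in this sketch (an assumption).
COMPRESSED_ENCODINGS = {"gzip", "deflate", "br"}


def is_compressed(content_encoding: Optional[str]) -> bool:
    """Return True if a Content-Encoding header value names a compression scheme."""
    if not content_encoding:
        return False
    # The header may list several encodings separated by commas.
    encodings = {token.strip().lower() for token in content_encoding.split(",")}
    return bool(encodings & COMPRESSED_ENCODINGS)


def fetch_content_encoding(url: str, timeout: float = 10.0) -> Optional[str]:
    """Request `url` while advertising gzip/deflate support and return the
    Content-Encoding response header, or None if the server sent none."""
    request = urllib.request.Request(
        url, headers={"Accept-Encoding": "gzip, deflate"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Encoding")
```

Calling is_compressed(fetch_content_encoding(url)) on a stylesheet or script URL from your own site returns False when the server ignores the Accept-Encoding header, which is the symptom in this story.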
This is a perfect example of how something beyond your control, like IT restoring from a backup, can affect your website’s performance. How would IT know if their change impacted performance? If you get an alert, how do you figure out the source of the problem? Han did an admirable job in a tough situation, but it still took several hours to find and fix the problem. You and your team may be performance experts, but you have better things to do with your time. And it’s hard to do manual work efficiently when you are under the pressure of a site that has just become slow.
This is where Zoompf can help. Zoompf automates the analysis of your website to tell you the specific problems that are affecting a page. Even better, our free Zoompf Alerts product continuously scans your website throughout the day, looking for specific front-end performance issues, and alerts you when new problems are introduced. If Han had been using Zoompf Alerts, he would have received an email as soon as the old server configuration had been deployed. Does that sound like something that could help you? You can join Zoompf Alerts now for free!