Visualizing image optimizations with hex editors and strings
I’m a big fan of image optimizations, and have written several posts about them. Images dominate the web in terms of both byte size and request count. Luckily, they are super easy to optimize with free and/or open source tools. For lossless image optimization, you can expect to consistently reduce the size of your image files by 5%-20% without impacting image quality. And if images have embedded thumbnails, you can see savings upwards of 50%-70%. This is a great way to lose page load wait by losing page content weight.
While it’s great to tell someone that an image can be reduced by 20%, I have found that it sometimes helps people to actually visualize what that savings looks like: to show them all the waste and bloat and unneeded crap that is sitting inside their JPEG or PNG image. Is there a way to do this? Yes! Today I will show you 2 ways: with a hex editor, and with the strings command.
Seeing image bloat with a hex editor
Hex editors let you examine the contents of binary files. They can also show you the bloat in images that have not been losslessly optimized. There are desktop hex editors like HxD or 0xED, and there are also web-based hex editors. Any hex editor will do, provided that it can display the ASCII output of the bytes.
Let’s use this background image from Weather.com. Right now the size of this image is 218 KB, which seems a little high considering its dimensions are 1 x 620. Let’s load this image into a hex editor:
Most hex editors feature a column on the right, which shows the ASCII output of the bytes in the file. Immediately we can see English text, talking about Adobe Photoshop CS 5. Needless to say, you shouldn’t be seeing large strings of English text inside of an image file.
As we scroll down, we start to see more English text, and then we see embedded XML!
All of this junk is exactly the type of stuff you want to remove from the images on your website. This data has nothing to do with the graphical data for the image. It simply increases the size of the file, wasting bandwidth and needlessly increasing download times.
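If you don't have a hex editor handy, the two-column view is easy to approximate. Here is a minimal Python sketch of it: offsets on the left, hex bytes in the middle, and a printable-ASCII column on the right. The function name `hex_dump` and the sample bytes are made up for illustration; they are not taken from the Weather.com image.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex pairs, and a printable-ASCII column."""
    pad = width * 3 - 1  # width of a full row of hex pairs separated by spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII (0x20-0x7e) as-is and everything else as '.'
        ascii_part = "".join(chr(b) if 32 <= b <= 126 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  |{ascii_part}|")
    return "\n".join(lines)


# Hypothetical sample: JPEG marker bytes followed by some editor metadata text.
dump_sample = b"\xff\xd8\xff\xe1Adobe Photoshop CS5"
print(hex_dump(dump_sample))
```

To try it on a real file, read the file in binary mode and pass in the first few hundred bytes. Readable text in the right-hand column is the same warning sign described above.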
We found the hex view so effective in showing people image bloat, we built it into our products. Our commercial product Zoompf WPO, and our newly launched free public beta of Zoompf Alerts, both include a “View optimization as hex” feature. In fact, the screen shots above are from the public Zoompf Alerts account for Weather.com!
Seeing image bloat with Strings
There is another way to visualize image bloat: the strings command. Strings is a utility that extracts and displays runs of printable text inside of a file. So if, say, at least 5 sequential bytes all represent printable ASCII characters, strings assumes they are valid text and displays them. Strings is in many ways a quick and dirty way to display that ASCII text column we see in hex editors.
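To make that rule concrete, here is a rough Python sketch of the idea behind strings: scan the bytes and keep every run of printable ASCII characters that is at least a minimum length. This is only an approximation, not how any particular strings implementation is written. The name `extract_strings` and the sample bytes are hypothetical, and real implementations differ in details such as the default minimum length and the encodings they support.

```python
import re


def extract_strings(data: bytes, min_len: int = 5) -> list:
    """Return runs of at least min_len printable ASCII bytes."""
    # Printable here means space (0x20) through tilde (0x7e), plus tab.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Hypothetical sample: binary bytes with two text runs and one run that is too short.
strings_sample = b"\xff\xd8\xff\xe1\x00\x10<x:xmpmeta>\x00\x01ab\x02Photoshop\xff"
for text in extract_strings(strings_sample):
    print(text)
```

The two-character run `ab` is dropped because it falls below the minimum length. That threshold is what keeps random byte noise out of the output.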
Let’s run Strings on the Weather.com background image we used above. Since the strings command can generate a lot of text, we will pipe the output to less so we can easily scroll through the output. We can do all of this via the command line, with the following command:
strings input.jpg | less
Here is the output. I have scrolled down and we can clearly see the same embedded XML text that is needlessly bloating this image:
strings was originally a Unix utility. It ships with most standard Linux and BSD distributions and with Mac OS X, and it is available on Windows through environments like Cygwin. I personally prefer a hex editor, but strings is usually already installed on most computers I use, so it is handy for quick demonstrations.
Optimizing Weather.com’s JPEG
Since we know this image from Weather.com has a lot of bloat in it, we should probably losslessly optimize it. We can do that using jpegtran. Specifically we run the following command:
jpegtran -copy none -optimize -outfile optimized.jpg input.jpg
The resulting file is only 508 bytes! Since the original image was 218 kilobytes in size, that is a savings of more than 99%!
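If you want a number before running the optimizer, you can estimate how much of a JPEG is metadata by walking its segments. The Python sketch below adds up the application (APPn) and comment (COM) segments that appear before the compressed image data, which is roughly the kind of data `-copy none` tells jpegtran not to carry over. Treat it as an approximation under stated assumptions: it expects a well-formed file whose header segments all carry a length field, and the helper names `metadata_bytes` and `make_segment` and the synthetic sample are invented for this example.

```python
import struct


def metadata_bytes(jpeg: bytes) -> int:
    """Sum the sizes of APPn and COM segments that precede the image data."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos, total = 2, 0
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker == 0xDA:  # SOS: compressed image data follows, stop here
            break
        # The segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if 0xE0 <= marker <= 0xEF or marker == 0xFE:  # APP0-APP15 or COM
            total += 2 + length  # two marker bytes plus the segment itself
        pos += 2 + length
    return total


def make_segment(marker: int, payload: bytes) -> bytes:
    """Build one JPEG segment (hypothetical helper for the demo below)."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic, non-displayable sample: SOI, an APP1 block, a comment, a
# quantization table stub, then SOS, two data bytes, and EOI.
jpeg_sample = (
    b"\xff\xd8"
    + make_segment(0xE1, b"Exif\x00\x00" + b"x" * 10)
    + make_segment(0xFE, b"hello")
    + make_segment(0xDB, b"\x00" * 4)
    + b"\xff\xda\x00\x02" + b"\x12\x34" + b"\xff\xd9"
)
print(metadata_bytes(jpeg_sample), "of", len(jpeg_sample), "bytes are metadata")
```

On a real image, compare the result with the total file size. A large ratio suggests that stripping metadata alone will account for most of the savings.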
I have left something rather important out of my explanation above. Hex editors and the strings command can help you visualize bloat in images, but they mainly help you visualize bloat from embedded metadata and other text content. Missing palette entries, bad DEFLATE settings or JPEG quantization options, even embedded thumbnails are all examples of ways images can be bloated that don’t appear as English text inside of the image file. This means that hex editors and the strings command can’t always show you all of the sources of image bloat. However, large blocks of embedded text are a common source of image bloat, and so I still find these tools very useful.
If you care about web performance improvements like lossless image optimizations, you’ll love our new Zoompf Alerts beta. Zoompf Alerts continuously scans your website throughout the day, looking for specific front-end performance issues, and alerts you when new problems are introduced. We just launched the public beta of Zoompf Alerts and you can join Zoompf Alerts now for free!