Cloudinary Blog

One pixel is worth three thousand words

How various image formats compress one-pixel images

A couple of months ago, while taking a break from implementing cool new features like q_auto and g_auto, I was joking in our team chat about how well various image formats “compress” one-pixel images. In response, Orly, who runs the blog, asked me if I’d write a post about single-pixel images. I said: “Sure, why not. But it will be a very short blog post. After all, there’s not much you can say about a single pixel.”

Looks like I was wrong. Very wrong.

What can you do with one pixel?

Back in the early days of the web, one-pixel images were widely used as a poor man’s solution to do things we now do with CSS. Spacing, creating lines or rectangles, semi-transparent backgrounds: there’s quite a lot you can do by simply scaling one pixel to arbitrary dimensions. Another use of one-pixel images, still a common practice today, is as a web beacon, for tracking or analytics.

In responsive web design, one-pixel images are often used as temporary placeholders while the page is loading. Since most browsers do not support client-hints, some responsive image solutions wait for the page to fully load in order to determine the actual rendered image sizes, and then replace a one-pixel image with the right breakpoint image using JavaScript.

Broken image example

There is one other use of single-pixel images: they can be used as ‘default’ images. If for whatever reason the actual image that you want to show cannot be found, it might in some cases be better to hide that fact (by showing one transparent pixel) than to return a “404 - Not Found” error, which will usually be rendered by browsers as a “broken image” icon. In both cases, you don’t get to see the intended image, but it might look a bit more professional if you don’t ‘rub it in’ by showing a broken image icon.

OK, it looks like one-pixel images do have some uses. So, what’s the best way to encode a 1x1 image?

Obviously, this is a fringe case for image compression formats. If the “image” only consists of a single pixel, there sure is not a lot of data to compress. In fact, the uncompressed data is just one bit to four bytes – depending on how you interpret the data: black & white (1 bit), grayscale (1 byte), grayscale + alpha (2 bytes), RGB (3 bytes), or RGBA (4 bytes).

But you can’t encode just the data. In any image format, you need to specify how to interpret the data. At the very least, you need to know the width and height of the image, and the number of bits or bytes per pixel.

Headers

Typically, to encode the width and height, four bytes are used: two bytes per number (if it were only one byte, the maximum image dimension would be 255x255). Let’s say that we need another byte to encode the color type of the image (e.g. grayscale, RGB or RGBA). In this minimalistic image format, a single-pixel image would take at least 6 bytes (e.g. for a white pixel) and at most 9 bytes (for a semi-transparent, arbitrary color pixel).
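To make the arithmetic concrete, here is a minimal Python sketch of that imaginary format. Everything in it is an assumption made for illustration: the name `encode_minimal`, the color type ids, and the choice to round a single bit up to a whole byte. No real format works exactly like this.

```python
import struct

# Hypothetical color types for the imaginary format: name -> (type id, bits per pixel)
COLOR_TYPES = {
    "bw": (0, 1),
    "gray": (1, 8),
    "gray_alpha": (2, 16),
    "rgb": (3, 24),
    "rgba": (4, 32),
}

def encode_minimal(width, height, color_type, pixel_data):
    """Pack a 5-byte header (width, height, color type) followed by raw pixel data."""
    type_id, bits_per_pixel = COLOR_TYPES[color_type]
    expected = (width * height * bits_per_pixel + 7) // 8  # round up to whole bytes
    if len(pixel_data) != expected:
        raise ValueError(f"expected {expected} data bytes, got {len(pixel_data)}")
    # Two bytes for each dimension, one byte for the color type, big-endian.
    return struct.pack(">HHB", width, height, type_id) + pixel_data

white = encode_minimal(1, 1, "bw", b"\x00")
semitransparent = encode_minimal(1, 1, "rgba", bytes.fromhex("1337babe"))
print(len(white), len(semitransparent))  # 6 9
```

The 5-byte header is the whole overhead here, which is why the total stays between 6 and 9 bytes.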

However, actual image formats tend to have a “header” that contains quite a bit more information. First of all, the first few bytes of any image format contain a fixed identifier that is only there to say “Hey! I’m a file in this particular file format!”. This fixed sequence of bytes is also known as the magic number. For example, a GIF file always starts with either GIF87a or GIF89a (depending on which version of the GIF spec is used), a PNG file always starts with an 8-byte sequence that includes PNG, JPEG files have a header that contains the string JFIF or Exif, and so on.

Headers can contain all sorts of meta-information about an image. Some of it is format-specific information to indicate what kind of subformat is used, and is necessary to decode the pixels correctly. Some of it might not be necessary to decode the pixels, but is still useful to know how to render them – e.g. color profiles, orientation, gamma, or pixel density (dots per inch). Some of it might be arbitrary metadata, like comments, timestamps, copyright notices, or GPS coordinates. These things might be optional, or they might be obligatory; it depends on the format specification. Of course all of this metadata has some cost in terms of file size. So let’s focus on “minimal” files, where all of the non-obligatory metadata has been stripped. Otherwise we might be wasting precious bytes on silly things.

Besides headers, image formats may have other kinds of “overhead”. They may contain all kinds of markers and checksums, intended to make the format more robust in case of transmission errors or other forms of corruption. Also, sometimes some kind of padding is required, to ensure that the data gets aligned properly.

One-pixel images – the smallest possible images – reveal exactly how much “overhead” there is in an image format. Let’s take a look.

Here is a hexdump of a 67-byte PNG file, representing a 1x1 white pixel:

00000000  89 50 4e 47 0d 0a 1a 0a  00 00 00 0d 49 48 44 52  |.PNG........IHDR|
00000010  00 00 00 01 00 00 00 01  01 00 00 00 00 37 6e f9  |.............7n.|
00000020  24 00 00 00 0a 49 44 41  54 78 01 63 68 00 00 00  |$....IDATx.ch...|
00000030  82 00 81 4c 17 d7 df 00  00 00 00 49 45 4e 44 ae  |...L.......IEND.|
00000040  42 60 82                                          |B`.|

This file consists of the 8-byte PNG magic number, followed by a header chunk (IHDR) which contains 13 bytes, an image data chunk (IDAT) with 10 bytes of “compressed” image data, and an end marker (IEND). Every chunk starts with a 4-byte chunk length and a 4-byte chunk identifier and ends with a 4-byte chunk checksum, and these three chunks are obligatory, so that’s another 36 bytes, for a total file size of 67 bytes.
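The byte count can be checked with a short Python sketch that walks the chunks of the file from the hexdump. The helper name `png_chunks` is made up for this example, and the chunk checksums are skipped over rather than verified.

```python
import struct
import zlib

# The 67-byte white pixel PNG from the hexdump above.
PNG_BYTES = bytes.fromhex(
    "89504e470d0a1a0a0000000d49484452"
    "00000001000000010100000000376ef9"
    "240000000a4944415478016368000000"
    "8200814c17d7df0000000049454e44ae"
    "426082"
)

def png_chunks(data):
    """Yield (type, body) for each chunk after the 8-byte PNG signature."""
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8].decode("ascii")
        yield chunk_type, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC (CRC not verified here)

chunks = list(png_chunks(PNG_BYTES))
for chunk_type, body in chunks:
    print(chunk_type, len(body))  # IHDR 13, IDAT 10, IEND 0

# 8 signature bytes plus 12 bytes of framing per chunk plus the chunk bodies.
total = 8 + sum(12 + len(body) for _, body in chunks)
print("total:", total)  # total: 67

# IHDR starts with width, height, bit depth and color type.
width, height, bit_depth, color_type = struct.unpack(">IIBB", chunks[0][1][:10])
# The IDAT body is a zlib stream holding one filter byte and one data byte.
scanline = zlib.decompress(chunks[1][1])
print(width, height, bit_depth, color_type, scanline.hex())  # 1 1 1 0 0080
```

The 10 “compressed” bytes decode to just two: a filter type byte (0) and a data byte whose top bit is the white pixel. In other words, compression made the data five times larger.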

A black pixel is also 67 bytes in PNG; a fully transparent pixel is 68 bytes, and an arbitrary RGBA color will be between 67 and 70 bytes.

JPEG has a longer header. The smallest one-pixel JPEG is 160 bytes (Update: 141 bytes). And it cannot be transparent, because JPEG does not support an alpha channel.

GIF is the most compact (in terms of headers) amongst the three universally supported image formats. A white pixel can be encoded as a valid GIF file in just 35 bytes:

00000000  47 49 46 38 37 61 01 00  01 00 80 01 00 00 00 00  |GIF87a..........|
00000010  ff ff ff 2c 00 00 00 00  01 00 01 00 00 02 02 4c  |...,...........L|
00000020  01 00 3b                                          |..;|

and a fully transparent pixel can be done in 43 bytes:

00000000  47 49 46 38 39 61 01 00  01 00 80 01 00 00 00 00  |GIF89a..........|
00000010  ff ff ff 21 f9 04 01 0a  00 01 00 2c 00 00 00 00  |...!.......,....|
00000020  01 00 01 00 00 02 02 4c  01 00 3b                 |.......L..;|
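
For the curious, the two GIF files above can be rebuilt field by field. The following Python sketch is only an illustration: the function names are made up, and the field breakdown in the comments is my reading of the hexdumps against the GIF block layout. It shows that the 8 extra bytes in the transparent version are a graphic control extension, which only exists in GIF89a.

```python
import struct

def pack_lzw_codes(codes, code_size):
    """Pack fixed-width LZW codes into bytes, least significant bit first."""
    acc = 0
    nbits = 0
    for code in codes:
        acc |= code << nbits
        nbits += code_size
    return acc.to_bytes((nbits + 7) // 8, "little")

def one_pixel_gif(transparent=False):
    """Build the 1x1 GIF from the hexdumps: white, or fully transparent."""
    version = b"GIF89a" if transparent else b"GIF87a"
    # Logical screen descriptor: 1x1, global color table with 2 entries,
    # background color index 1, no aspect ratio information.
    screen = struct.pack("<HHBBB", 1, 1, 0x80, 1, 0)
    palette = bytes([0, 0, 0, 255, 255, 255])  # index 0 = black, index 1 = white
    # Graphic control extension: marks palette index 1 as transparent.
    gce = bytes.fromhex("21f904010a000100") if transparent else b""
    # Image descriptor: a 1x1 frame at offset (0, 0), no local color table.
    descriptor = b"," + struct.pack("<HHHHB", 0, 0, 1, 1, 0)
    # Minimum code size 2 means 3-bit codes: clear (4), palette index 1, end (5).
    lzw = pack_lzw_codes([4, 1, 5], code_size=3)
    image_data = bytes([2, len(lzw)]) + lzw + b"\x00"
    return version + screen + palette + gce + descriptor + image_data + b";"

print(len(one_pixel_gif()), len(one_pixel_gif(transparent=True)))  # 35 43
```

The actual pixel takes up 9 bits of LZW codes, stored in the two bytes `4c 01`; everything else is structure.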

Note that for all of the above formats, you can come up with even smaller files that will still decode to a one-pixel image in all or most browsers, but they are not valid with respect to the format specifications, which means that an image decoder might at any time complain (rightfully) that the file is corrupt, and show the broken image icon which we were trying to avoid.

So what’s the best format for a one-pixel image on the web? That depends. If it’s an opaque pixel, then the answer is GIF. If it’s a fully transparent pixel, then the answer is also GIF. But if it’s a semi-transparent pixel, then the answer is PNG, since GIF only supports all-or-nothing transparency.

Not that all of this matters very much. All of these files fit easily in a single network packet, so in practice, there is no real speed difference – and the storage needed for this is negligible anyway. But still, it’s an amusing thing to look at, at least for image format geeks like me.

What about other, more exotic file formats?

If you use WebP for one-pixel images, be sure to use lossless WebP. A single-pixel lossless WebP image is between 34 and 38 bytes. A single-pixel lossy WebP image is between 44 and 104 bytes, depending mostly on whether there’s an alpha channel or not. For example, this is a fully transparent pixel as a 34-byte lossless WebP:

00000000  52 49 46 46 1a 00 00 00  57 45 42 50 56 50 38 4c  |RIFF....WEBPVP8L|
00000010  0d 00 00 00 2f 00 00 00  10 07 10 11 11 88 88 fe  |..../...........|
00000020  07 00                                             |..|

and here is the same pixel as a lossy (default) WebP of 82 bytes:

00000000  52 49 46 46 4a 00 00 00  57 45 42 50 56 50 38 58  |RIFFJ...WEBPVP8X|
00000010  0a 00 00 00 10 00 00 00  00 00 00 00 00 00 41 4c  |..............AL|
00000020  50 48 0b 00 00 00 01 07  10 11 11 88 88 fe 07 00  |PH..............|
00000030  00 00 56 50 38 20 18 00  00 00 30 01 00 9d 01 2a  |..VP8 ....0....*|
00000040  01 00 01 00 02 00 34 25  a4 00 03 70 00 fe fb fd  |......4%...p....|
00000050  50 00                                             |P.|

The main difference between the two is that a lossy WebP with transparency is actually stored internally as two images, thrown together into one container file: one lossy image for the RGB values, and one lossless image for the alpha values.
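
You can see this container structure by walking the RIFF chunks of both files. The Python sketch below is a rough illustration (the helper name `riff_chunks` is made up); it lists each chunk's four-character code and body size, assuming the usual RIFF rule that chunk bodies are padded to an even length.

```python
import struct

# The 34-byte lossless and 82-byte lossy WebP files from the hexdumps above.
LOSSLESS = bytes.fromhex(
    "524946461a000000574542505650384c"
    "0d0000002f00000010071011118888fe"
    "0700"
)
LOSSY = bytes.fromhex(
    "524946464a0000005745425056503858"
    "0a00000010000000000000000000414c"
    "50480b00000001071011118888fe0700"
    "000056503820180000003001009d012a"
    "0100010002003425a400037000fefbfd"
    "5000"
)

def riff_chunks(data):
    """Return (fourcc, body size) for every chunk inside a RIFF/WEBP container."""
    magic, riff_size, form = struct.unpack("<4sI4s", data[:12])
    if magic != b"RIFF" or form != b"WEBP":
        raise ValueError("not a WebP file")
    if riff_size + 8 != len(data):
        raise ValueError("RIFF size field does not match the file length")
    chunks = []
    pos = 12
    while pos < len(data):
        fourcc, size = struct.unpack("<4sI", data[pos:pos + 8])
        chunks.append((fourcc.decode("ascii"), size))
        pos += 8 + size + (size & 1)  # odd-sized bodies get one padding byte
    return chunks

print(riff_chunks(LOSSLESS))  # [('VP8L', 13)]
print(riff_chunks(LOSSY))     # [('VP8X', 10), ('ALPH', 11), ('VP8 ', 24)]
```

The lossless file holds a single `VP8L` chunk, while the lossy file needs an extended header chunk, an alpha chunk and the lossy image chunk, each with its own 8 bytes of framing.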

BPG

For Bellard’s BPG format, which also has a lossless and a lossy mode, it’s the other way around: here the lossy mode gives the smaller files. The lossy BPG encoding of a single white pixel is 31 bytes, the smallest we’ve seen so far:

00000000  42 50 47 fb 00 00 01 01  00 03 92 47 40 44 01 c1  |BPG........G@D..|
00000010  71 81 12 00 00 01 26 01  af c0 b6 20 bc b6 fc     |q.....&.... ...|

The lossless BPG for the same white pixel is 59 bytes. However, a fully transparent pixel is 57 or 113 bytes as a lossy or lossless BPG, respectively. Interestingly, for a single white pixel, BPG wins versus WebP (31 byte BPG vs 38 byte WebP), but for a single transparent pixel, WebP wins versus BPG (34 byte WebP vs 57 byte BPG).

FLIF

And then there’s FLIF. As the main creator of the Free Lossless Image Format, obviously I cannot forget about that one. Here’s a 15 byte FLIF file for one white pixel:

00000000  46 4c 49 46 31 31 00 01  00 01 18 44 c6 19 c3     |FLIF11.....D...|

And here’s a 14 byte file for a black pixel:

00000000  46 4c 49 46 31 31 00 01  00 01 1e 18 b7 ff        |FLIF11........|

The black pixel file is one byte smaller because the number zero happens to compress better than the number 255. The header is pretty simple: the first four bytes are always “FLIF”, the next byte is a human-readable indication of the color and interlacing type. In this case it is “1”, which means we have just one color channel (i.e. it’s a grayscale image). The next byte indicates the color depth: “1” means one byte per channel. And the next four bytes are the image dimensions, in this case 0x0001 by 0x0001. The last four or five bytes are the actual compressed data.

One fully transparent pixel is also 14 bytes in FLIF:

00000000  46 4c 49 46 34 31 00 01  00 01 4f fd 72 80        |FLIF41....O.r.|

In this case, we have 4 color channels (RGBA) instead of just one. You might expect the data section to be longer in this file (after all, there are four times as many color channels), but that’s not the case: since the alpha value happens to be zero (it’s a fully transparent pixel), the RGB values are considered irrelevant so they don’t end up being encoded at all.
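
Here is a small Python sketch that splits the three FLIF files above into header and data, following only the header layout described in this post. The function name and the returned field names are made up for illustration, and this layout should not be assumed to hold for other versions of the format.

```python
import struct

# The three FLIF files from the hexdumps above.
FILES = {
    "white": bytes.fromhex("464c49463131000100011844c619c3"),
    "black": bytes.fromhex("464c49463131000100011e18b7ff"),
    "transparent": bytes.fromhex("464c49463431000100014ffd7280"),
}

def parse_flif_header(data):
    """Split a file into the header fields described above plus the payload."""
    if data[:4] != b"FLIF":
        raise ValueError("not a FLIF file")
    # Bytes 4 and 5 are ASCII digits; they are read here as plain numbers.
    channels = int(chr(data[4]))           # '1' = grayscale, '4' = RGBA
    bytes_per_channel = int(chr(data[5]))  # '1' = one byte per channel
    width, height = struct.unpack(">HH", data[6:10])
    return {
        "channels": channels,
        "bytes_per_channel": bytes_per_channel,
        "width": width,
        "height": height,
        "payload_size": len(data) - 10,
    }

for name, data in FILES.items():
    print(name, len(data), parse_flif_header(data))
```

All three files share the same 10-byte header shape; only the 4 or 5 bytes of compressed data differ in length.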

For an arbitrary RGBA color, the FLIF file can be up to 20 bytes.

OK, so FLIF is the clear winner in the “one pixel” category of some weird image encoding competition. If only this were an important thing to compete at :)

Actually, no. FLIF isn’t the winner. Remember the minimalistic (and non-existent) image format I mentioned in the beginning? The one that would encode single-pixel images in 6 to 9 bytes? Well that format doesn’t exist, so I suppose it doesn’t count. But there is an image format that does exist, and which gets quite close to that.

It’s called the Portable Bitmap format (PBM), and it’s an uncompressed image format from the 1980s. Here’s how you could encode a single white pixel as a PBM file in just 8 bytes:

00000000  50 31 0a 31 20 31 0a 30                           |P1.1 1.0|

Actually, forget about the hexdump, this is a human-readable file format. You can open it in a text editor if you want (at least this particular subformat):

P1
1 1
0

The first line (“P1”) indicates that this is a black & white image. Not grayscale; there are only two colors: black (which confusingly gets the number 1) and white (0). The second line indicates the image dimensions. And then it’s just a whitespace-delimited list of numbers, one number per pixel. So in this case just the number 0.

If you need something other than pure white or black, you can use the PGM format to get one pixel in any other shade of gray in just 12 bytes, or the PPM format to get any RGB color in just 14 bytes. This is always smaller than the corresponding FLIF file (or any other compressed format, for that matter).
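
As a quick check of those numbers, the Python sketch below builds the three files and counts the bytes. It assumes the 12 and 14 byte figures refer to the binary subformats (P5 and P6) with a maximum value of 255; the function names are made up for this example.

```python
def pbm_ascii(bit):
    """Plain-text PBM (P1): 0 is white, 1 is black."""
    return b"P1\n1 1\n" + (b"1" if bit else b"0")

def pgm_binary(gray):
    """Binary PGM (P5) with maxval 255: one raw byte for the pixel."""
    return b"P5\n1 1\n255\n" + bytes([gray])

def ppm_binary(red, green, blue):
    """Binary PPM (P6) with maxval 255: three raw bytes for the pixel."""
    return b"P6\n1 1\n255\n" + bytes([red, green, blue])

white = pbm_ascii(0)
gray = pgm_binary(128)
yellow = ppm_binary(255, 255, 0)
print(len(white), len(gray), len(yellow))  # 8 12 14
```

In the grayscale and RGB cases the header is 11 bytes, so the pixel data itself is the smaller part of the file even here.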

The traditional PNM family (PBM, PGM and PPM) does not support transparency. There is an extension of PNM though, called Portable Arbitrary Map (PAM), which does support images with transparency. Unfortunately for our current purposes, its syntax is quite a bit more verbose. The smallest valid PAM file that encodes a fully transparent pixel is the following:

P7
WIDTH 1
HEIGHT 1
DEPTH 4
MAXVAL 1
TUPLTYPE RGB_ALPHA
ENDHDR
\0\0\0\0

On the last line there are four zero (NULL) bytes. The above file is 67 bytes. You might be tempted to use grayscale+alpha instead of RGBA, because that would save two bytes in the data section. But that results in a 71 byte file, since you have to change the TUPLTYPE from RGB_ALPHA to GRAYSCALE_ALPHA. Oh and by the way, your image software might not like the use of MAXVAL 1, so you might need to change that to MAXVAL 255 (which takes two more bytes).
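
The three byte counts can be reproduced with a short Python sketch. The helper name `pam_one_pixel` is made up for this example; it just writes the header lines shown above and appends one raw byte per sample.

```python
def pam_one_pixel(tupltype, samples, maxval):
    """Build a 1x1 PAM file; samples holds one byte per channel."""
    header = (
        f"P7\nWIDTH 1\nHEIGHT 1\nDEPTH {len(samples)}\n"
        f"MAXVAL {maxval}\nTUPLTYPE {tupltype}\nENDHDR\n"
    )
    return header.encode("ascii") + samples

rgba = pam_one_pixel("RGB_ALPHA", b"\x00\x00\x00\x00", 1)
gray_alpha = pam_one_pixel("GRAYSCALE_ALPHA", b"\x00\x00", 1)
rgba_255 = pam_one_pixel("RGB_ALPHA", bytes.fromhex("1337babe"), 255)
print(len(rgba), len(gray_alpha), len(rgba_255))  # 67 71 69
```

Switching to grayscale+alpha saves 2 data bytes but the longer TUPLTYPE name costs 6, which is where the net 4 extra bytes come from.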

So all in all, for one-pixel images, when there’s no transparency involved, PNM is the smallest (8 to 14 bytes for PNM vs 14 to 18 bytes for FLIF), but when there is transparency, FLIF is smallest (14 to 20 bytes for FLIF vs 67 to 69 bytes for PAM).

Here is a summary table that gives the (optimal) file sizes for various one-pixel images:

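| Format | white | black | gray | yellow (#FFFF00) | transparent | semitransparent (#1337BABE) |
|---|---|---|---|---|---|---|
| PNG | 67 | 67 | 67 | 69 | 68 | 70 |
| GIF | 35 | 35 | 43 | 35 | 43 | / |
| JPEG | 160 | 160 | 159 | 288 | / | / |
| Lossy WebP | 44 | 44 | 44 | 64 | 82 | 92 |
| Lossless WebP | 38 | 34 | 38 | 36 | 34 | 38 |
| Lossy BPG | 31 | 31 | 29 | 36 | 57 | 62 |
| Lossless BPG | 59 | 59 | 37 | 124 | 113 | 160 |
| FLIF | 15 | 14 | 15 | 18 | 14 | 20 |
| PNM/PAM | 8 | 8 | 12 | 14 | 67 | 69 |

All sizes are in bytes; “/” means the format cannot represent that pixel.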

It might seem a bit surprising that an uncompressed image format actually beats most of the compressed formats at this particular task. But it’s not that surprising if you think about it. One-pixel images are in a sense the worst-case scenario for image compression: they’re all headers and overhead, and very little data. And the very little data there is cannot really be compressed because compression depends on predictability, and how are you supposed to predict one single pixel?

In part two of this blog post I will discuss the other extreme. How well do extremely predictable single-color images perform in various formats? Stay tuned….

Update: Check out part two as well: A one-color image is worth two thousand words
