Cloudinary Blog

How JPEG XL Compares to Other Image Codecs

A year ago, I talked about JPEG XL at ImageCon 2019. It’s time for an update.

Brief Recap

JPEG XL is a next-generation image codec currently being standardized by the JPEG Committee. Based on Google’s PIK codec and Cloudinary’s Free Universal Image Format (FUIF) codec, JPEG XL has created a sum that’s greater than the parts by leveraging the best elements of Google PIK and FUIF:

JPEG XL evolution

  • From PIK, JPEG XL has inherited a focus on strong psychovisual modeling and on preserving detail and texture, as well as on decode speed, in particular through parallelization and efficient cropped decoding of huge (gigapixel or more) images.
  • From FUIF, JPEG XL has inherited the principles of being responsive by design and universal.
  • From PIK and FUIF, JPEG XL has learned to put a tremendous emphasis on being legacy friendly, delivering a smooth transition from existing file formats (notably JPEG, but also PNG, GIF, and TIFF) to JPEG XL.

The key features of JPEG XL are described in a recent white paper published by the JPEG Committee.

Advantages of JPEG XL

This section highlights the important features that distinguish JPEG XL from other state-of-the-art image codecs like HEIC and AVIF.

No Royalties

Troll Image attribution: Bitmovin

Although they can never make guarantees (patent trolls can always suddenly wake up), the contributors who created JPEG XL have agreed to license the reference implementation under the Apache 2.0 license. That means that, besides being Free and Open Source Software (FOSS), JPEG XL also comes with a royalty-free patent grant.

That’s not at all the case for HEIC, which combines the High Efficiency Image File Format (HEIF) container, on which Nokia claims patents, with the High Efficiency Video Coding (HEVC) codec, which is a complete patent mess. For the AV1 Image File Format (AVIF), the patent situation looks better since it’s based on AV1, and being royalty free was a major goal of the Alliance for Open Media, which created AV1. It is not clear, though, to what extent AV1 is actually royalty free. Moreover, AVIF is based on the HEIF container, so Nokia patents might apply as well.

Legacy Friendliness

You can transcode existing JPEG files effectively and reversibly to JPEG XL without any additional loss. Not so for previous attempts at creating “next-generation” image formats, such as JPEG 2000, JPEG XR, WebP, and now HEIC and AVIF. Transcoding to one of those other formats requires decoding the JPEG image to pixels and then re-encoding those pixels with the other format—an irreversible process that leads to generation loss.
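To make the difference concrete, here is a minimal Python sketch of generation loss. It is a toy model, not the JPEG XL algorithm: the list `original`, the two quantizer step sizes, and the helper names are all made up for illustration, and `zlib` merely stands in for "a better lossless packing of data that has already been quantized."

```python
import zlib

def quantize(values, step):
    """Lossy step: map each value to the index of its nearest multiple of `step`."""
    return [round(v / step) for v in values]

def dequantize(indices, step):
    """Reconstruct approximate values from quantization indices."""
    return [i * step for i in indices]

def total_error(a, b):
    """Sum of absolute differences between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical values standing in for the data behind an existing lossy file.
original = [3, 14, 27, 41, 58, 66, 73, 94]

# The legacy file stores indices produced by its own quantizer.
legacy_step = 10
legacy_indices = quantize(original, legacy_step)
legacy_decoded = dequantize(legacy_indices, legacy_step)

# Path 1: decode to values, then re-encode with a different quantizer.
# This is a second lossy step, so the result drifts further from the original.
new_step = 7
reencoded = dequantize(quantize(legacy_decoded, new_step), new_step)

# Path 2: keep the already-quantized indices and only repack them losslessly.
packed = zlib.compress(bytes(legacy_indices))
restored_indices = list(zlib.decompress(packed))

print("legacy decode error:   ", total_error(original, legacy_decoded))
print("after re-encoding:     ", total_error(original, reencoded))
print("repacked is reversible:", restored_indices == legacy_indices)
```

In this toy data, the second lossy pass changes the decoded values and increases the total error, whereas repacking the stored indices gives back exactly what the legacy file contained.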

To me, legacy friendliness is an important feature that facilitates a smooth transition from JPEG to a successor format without requiring a transition period in which two versions of every image—the old JPEG file and the new “successor-format” file—must be stored to satisfy the long tail of users who haven’t upgraded yet. Such a requirement completely defeats the purpose of improved image compression.

Encoder diagram

Responsive Design

Especially for web delivery, it would be desirable to avoid having to store and serve multiple variants of the same image according to the viewer’s viewport width. Equally desirable is an option to progressively decode images, showing a low-quality image placeholder when only a few hundred bytes have arrived and adding more detail as the rest of the data shows up. JPEG XL ably supports both nice-to-haves.

As a rule, image formats that are based on video codecs do not support those two worthwhile features because the concept doesn’t make much sense for a single video frame. WebP (based on VP8), HEIC (based on HEVC), and AVIF (based on AV1) only offer sequential decoding, i.e., the image loads at full detail from top to bottom, and you must wait until it has almost been completely transferred before getting an inkling of the image content.

Slow loading
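The following Python sketch illustrates why the order in which data arrives matters. It is a toy one-dimensional model under my own assumptions, not how any of these codecs work internally: the "image" is a simple ramp of 16 samples, `coarse_to_fine_order` is a hypothetical ordering that sends widely spaced samples first, and the preview simply repeats the nearest sample received so far.

```python
def coarse_to_fine_order(n):
    """Sample indices ordered so that widely spaced samples come first."""
    order, seen, stride = [], set(), n
    while stride >= 1:
        for i in range(0, n, stride):
            if i not in seen:
                seen.add(i)
                order.append(i)
        stride //= 2
    return order

def preview(signal, received):
    """Fill every position with the value of the nearest sample received so far."""
    return [signal[min(received, key=lambda j: abs(j - i))]
            for i in range(len(signal))]

def mean_abs_error(a, b):
    """Average absolute difference between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

signal = [10 * i for i in range(16)]   # toy "image": a smooth ramp
budget = 4                             # only a quarter of the data has arrived

sequential = list(range(len(signal)))[:budget]          # first samples only
progressive = coarse_to_fine_order(len(signal))[:budget]  # spread over the signal

seq_err = mean_abs_error(signal, preview(signal, sequential))
prog_err = mean_abs_error(signal, preview(signal, progressive))

print("sequential samples: ", sequential, "error:", seq_err)
print("progressive samples:", progressive, "error:", prog_err)
```

With the same number of samples received, the coarse-to-fine ordering already covers the whole signal, so its preview is much closer to the final result than the sequential one.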

High Fidelity

JPEG XL is particularly effective at perceptual qualities that range from visually near-lossless (in a side-by-side comparison), through visually lossless (in a flicker test, which is stricter than a side-by-side evaluation), to mathematically lossless. A lot of effort has gone into preserving subtle texture and other fine image details. Even though you can use the codec at lower bitrates, where the degradation becomes obvious, it really shines at relatively high bitrates.

In contrast, image formats based on video codecs tend to excel at very low bitrates: they can produce a nice image in just a few bytes. The image looks good at first but, on closer inspection, often seems weirdly “plasticky” (e.g., skin becomes very smooth, as if the compression had applied a huge amount of foundation cream) or “distilled,” like an oil painting. That’s acceptable for video codecs: you need a low bitrate to keep the file size or bandwidth reasonably low and, since frames are typically only shown for less than 40 milliseconds, the audience usually doesn’t notice such artifacts. For still images, however, a higher quality is often desired.

Illustration: low-bitrate HEIC can smooth out a lot of fine image details

Fool-Proof Default Quality

The JPEG XL reference encoder (cjpegxl) produces, by default, a well-compressed image that is indistinguishable from (or, in some cases, identical to) the original. In contrast, other image formats typically have an encoder with which you can select a quality setting, where quality is not really defined perceptually. Consequently, one image might look fine as a quality-60 JPEG while another might still contain annoying artifacts as a quality-90 JPEG.

Original PNG image (2.6 MB)

JPEG XL (default settings, 53 KB): indistinguishable from the original

WebP (53 KB): some mild but noticeable color banding along with blurry text

JPEG (53 KB): strong color banding, halos around the text, small text hard to read

Zooming in a bit, you can see how JPEG XL preserves the text even better than a five-times-larger quality-95 JPEG, which still shows some subtle discrete cosine transform (DCT) noise around the letters. At a similar compression rate, HEIC, WebP, and JPEG look significantly worse than JPEG XL for this image.

Original
JPEG XL (53 KB)
JPEG q95 (253 KB)
HEIC (55 KB)
WebP (53 KB)
JPEG (53 KB)

Internally, JPEG XL leverages a novel, perceptually-motivated color space called XYB. Most other codecs still use the YCbCr color space, usually with chroma subsampling. YCbCr, which is rooted in analog color television, is a relatively crude and somewhat dated attempt at modeling human color perception. Part of YCbCr’s problem is lack of precision, especially in the dark colors and in the blues and reds. That’s why dark video scenes are often a terrible blocky mess.

Thanks to its more accurate color handling, JPEG XL is better at avoiding color-banding issues—even in those difficult darks.

Original PNG (1.3 MB)
Original (brightened for clarity)
JPEG XL (4 KB, brightened for clarity)
HEIC (4 KB, brightened for clarity)
WebP (4 KB, brightened for clarity)
JPEG (5 KB, brightened for clarity)
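One way to get a feel for the precision problem is to count how many dark RGB colors survive a trip into 8-bit YCbCr. The Python sketch below is only an illustration under stated assumptions: it uses the full-range BT.601 conversion familiar from JFIF-style JPEG files, rounds to 8-bit integers, and looks at the 4,096 RGB colors whose channels are all below 16. It says nothing about XYB itself or about how any particular encoder handles color.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion (as in JFIF), rounded to 8-bit integers."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))

# All RGB colors in which every channel is in the darkest 16 levels.
dark = [(r, g, b) for r in range(16) for g in range(16) for b in range(16)]

# Distinct 8-bit YCbCr triples those colors map to.
distinct = {rgb_to_ycbcr(*rgb) for rgb in dark}

print("dark RGB colors:       ", len(dark))
print("distinct YCbCr triples:", len(distinct))
```

Because the conversion squeezes the RGB cube into a smaller volume before rounding, several different dark RGB colors end up on the same YCbCr triple, and that happens before any quantization or chroma subsampling is applied.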

Universality

JPEG XL handles numerous image types, including regular photographs; illustrations; cartoons; computer-generated images; logos; user-interface elements; screenshots; maps; medical imagery; images for printing, e.g., Cyan Magenta Yellow Black (CMYK) with additional spot colors; scientific images; satellite images; game graphics; huge images (gigapixel or even terapixel); tiny icons; images with alpha transparency, selection masks or depth information; layered images; and so on.

Apropos of workflows, you can leverage JPEG XL not only as a web-delivery format, but also as a local storage and an exchange format for authoring workflows, for which fast and effective lossless compression and high bit depth are important. In terms of functionality and compression, JPEG XL fully supersedes JPEG, PNG, GIF, WebP, and TIFF.

In contrast, video codec-based formats tend to have limitations that do not matter for video but that might impact still images in terms of dimensions, bit depth, number of channels, and types of image content.

Format | Maximum Image Dimensions (in a Single Code Stream) | Maximum Bit Depth, Maximum Number of Channels
JPEG | 4,294 megapixels (65,535 x 65,535) | 8-bit, three channels (or four for CMYK)
PNG | Theoretically 4 exapixels (but no way to efficiently decode crops) | 16-bit, four channels (RGBA)
WebP | 268 megapixels (16,383 x 16,383) | 8-bit, four channels (RGBA)
HEIC | 35 megapixels [1] (8,192 x 4,320) | 16-bit, three channels (alpha or depth as separate image)
AVIF | 9 megapixels [1] (3,840 x 2,160) | 12-bit, three channels (alpha or depth as separate image)
JPEG XL | 1,152,921,502,459 megapixels (1,073,741,823 x 1,073,741,823) | 24-bit (integer) or 32-bit (float), up to 4,100 channels

  1. HEIC and AVIF can handle larger images but not directly in a single code stream. You must decompose the image into a grid of independently encoded tiles, which could cause discontinuities at the grid boundaries. Illustration: grid boundary discontinuities in a HEIC-compressed image.

Grid boundary
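The megapixel figures in the table follow directly from the maximum width and height. The short Python check below recomputes them for the formats whose figure is a plain width-times-height product, rounding down to whole megapixels; the function name and the dictionary are my own. Note that the JPEG XL figure of 1,152,921,502,459 megapixels corresponds to 1,073,741,823 pixels on each side.

```python
def whole_megapixels(width, height):
    """Pixel count in millions, rounded down to a whole number."""
    return (width * height) // 1_000_000

# Maximum dimensions (width, height) in a single code stream.
limits = {
    "JPEG": (65_535, 65_535),
    "WebP": (16_383, 16_383),
    "HEIC": (8_192, 4_320),
    "JPEG XL": (1_073_741_823, 1_073_741_823),
}

for name, (width, height) in limits.items():
    print(f"{name}: {whole_megapixels(width, height):,} megapixels")
```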


Computational Complexity

You can run modern video codecs like AV1 and HEVC in software, but the computational cost is high, especially for well-optimized encoding. Dedicated hardware is desirable or even required to implement such codecs efficiently. In contrast, you can easily encode or decode JPEG XL in software on current hardware. The speed results in the table below, in megapixels per second (MP/s), are based on four CPU cores.

Codec | Encoding Speed (MP/s) | Decoding Speed (MP/s)
JPEG (libjpeg-turbo) | 49 | 108
HEVC (HM) | 0.014 | 5.3
HEVC (x265) | 3.7 | 14
JPEG XL | 50 | 132
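To put those speeds in perspective, the Python sketch below converts them into wall-clock time for a single image. The 12-megapixel photo is a hypothetical example of mine; the speeds are copied from the table, and the estimate simply divides image size by throughput, ignoring any fixed startup cost.

```python
# (encoding speed, decoding speed) in megapixels per second, from the table.
speeds = {
    "JPEG (libjpeg-turbo)": (49, 108),
    "HEVC (HM)": (0.014, 5.3),
    "HEVC (x265)": (3.7, 14),
    "JPEG XL": (50, 132),
}

def seconds_for(megapixels, speed_mp_per_s):
    """Estimated time to process an image of the given size at the given speed."""
    return megapixels / speed_mp_per_s

photo_mp = 12  # hypothetical 12-megapixel photo

for codec, (encode_speed, decode_speed) in speeds.items():
    encode_s = seconds_for(photo_mp, encode_speed)
    decode_s = seconds_for(photo_mp, decode_speed)
    print(f"{codec}: encode {encode_s:.2f} s, decode {decode_s:.2f} s")
```

At these rates, encoding the photo takes about a quarter of a second with JPEG XL, a few seconds with x265, and more than 14 minutes with the HM reference encoder.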

Advantages of AVIF and HEIC

You might conclude from the above that AVIF and HEIC are pointless. That’s not true; they have three important strengths.

Very Low Bitrates

Both AVIF and HEIC can reach very low bitrates yet still produce presentable images. Although a lot of the image information has obviously vanished at those bitrates, the compression artifacts are much less bothersome than those of JPEG.

For applications in which reducing bandwidth, storage, or both matters more than image fidelity, AVIF and HEIC might come in handy. On the other hand, if bandwidth is the major issue, then you might also desire progressive or responsive decoding, which AVIF and HEIC do not support.

Animation and Cinemagraphs

Even though you can create animation in JPEG XL, it offers no advanced video-codec features, such as motion estimation. JPEG XL compresses better than GIF, APNG, and animated WebP but cannot compete with actual video codecs for production of “natural” video. Even for a three-second looping video or cinemagraph, where most of the image is static, actual video codecs like AV1 and HEVC can compress much better than still-image codecs.

Support and Availability

HEIC already works well throughout the Apple ecosystem. Although Safari doesn’t yet support HEIC as a web image format, it does already support HEVC as a video codec.

AV1 shines as a video codec in the Google Chrome and Firefox ecosystems, and AVIF could follow suit. With the influential Alliance for Open Media as its sponsor, AV1 counts giant enterprises among its many proponents. Furthermore, hardware devices for AV1 are already available.

HEIC and AVIF are available now. JPEG XL is still in the final stages of standardization, however, and does not yet work in browsers.

Current Status of JPEG XL

The JPEG Committee is a working group of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). The standardization process takes time, involving multiple stages of balloting, in which the draft specification is scrutinized by various national-standards bodies, including the American National Standards Institute (ANSI) in the U.S., Deutsches Institut für Normung e.V. (DIN) in Germany, the Japanese Industrial Standards Committee (JISC) in Japan, etc.

The main stages of the standardization process are New Project (NP), Working Draft (WD), Committee Draft (CD), Draft International Standard (DIS), Final DIS (FDIS), and International Standard (IS).

ISO Process

The JPEG XL standard will consist of four parts:

  • Part 1 (the main part), which describes the codestream (the actual image codec), is currently in the DIS stage.
  • Part 2, which describes the file format (the container that wraps the codestream and additional metadata or extensions), has just proceeded to the CD stage.
  • Part 3, which describes the procedure for testing conformance of JPEG XL decoders, is in the WD stage.
  • Part 4, which is the reference implementation, is also in the WD stage.

If everything goes as planned, an International Standard for Part 1 will be available at the beginning of 2021; for the other parts, at the end of 2021.

In practice, once the process reaches the FDIS stage, the spec is “frozen” and you can use JPEG XL for real. Nonetheless, it’ll still take time and effort to garner software support for JPEG XL in applications and on platforms. Dethroning the old JPEG will not be a trivial task, as evidenced by several failed attempts in the past. We are hopeful that we’ll succeed this time around and that we’ve created a worthy successor to a 30-year-old image format that’s as old as the World Wide Web, that’s older than Google, that’s twice as old as Facebook and Twitter, and that’s three times as old as WhatsApp, Instagram, and Cloudinary.

Here’s hoping that, once it becomes a standard, JPEG XL will last for 30 years, too!

For a deep dive into the topics on the current visual web, see the Cloudinary 2020 State of the Visual Media Report, or download the full report below.
