If you’ve clicked this subsection you’re probably wondering something like “how can an image lose 75% of its color and still look good?” It’s an understandable question, so here we’ll explore chroma subsampling in finer detail.
As previously mentioned, our eyes are more sensitive to changes in luma (brightness) information than they are to chroma (color) information. Knowing this, engineers came up with a clever scheme for reducing image data size. If some pixels in an image discarded their color data, then more storage space could be devoted to things like resolution or framerate.
Of course, you can’t just throw away color in an image. People would notice. So, chroma subsampling doesn’t simply get rid of colors; it shares chroma values between pixels. If a pixel loses its color information, it borrows from a neighboring pixel. The two pixels don’t end up looking identical, because each retains its own luma information. Since most neighboring pixels differ at least slightly in recorded brightness, they still appear slightly different, which preserves visual detail for the viewer.
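The borrowing described above can be sketched in a few lines of code. This is a minimal illustration, not a real codec: `share_chroma_pairs` is a hypothetical helper, and pixels are assumed to be simple (Y, Cb, Cr) tuples.

```python
# A minimal sketch of horizontal chroma sharing between pixel pairs.
# Assumption (not from the article): pixels are (Y, Cb, Cr) tuples,
# and the row has an even number of pixels.

def share_chroma_pairs(row):
    """Each pair of pixels keeps its own luma (Y) but shares the
    chroma (Cb, Cr) of the first pixel in the pair."""
    out = []
    for i in range(0, len(row), 2):
        y0, cb, cr = row[i]
        out.append((y0, cb, cr))
        y1, _, _ = row[i + 1]          # this pixel's chroma is discarded...
        out.append((y1, cb, cr))       # ...and its neighbor's is borrowed
    return out

row = [(200, 110, 90), (180, 140, 60), (60, 120, 130), (75, 100, 150)]
print(share_chroma_pairs(row))
# → [(200, 110, 90), (180, 110, 90), (60, 120, 130), (75, 120, 130)]
```

Note that the luma values (200, 180, 60, 75) survive untouched, so the pairs still read as distinct pixels even though each pair now carries only one color.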
The dark green color visible in the edge pixels of the letter is not actually present on the real-life object. It is merely the result of chroma subsampling discarding the chroma data of the blue pixels and replacing it with the chroma of the green pixels, while keeping the original luma information. There is some data loss, obviously, but under normal viewing conditions (when you’re not looking at massively zoomed-in pixels), it will be hardly noticeable.
So how exactly does this work?
Chroma subsampling is usually expressed as a three-part ratio made up of the components J:a:b.
[Note: You might occasionally see a four-part ratio. In these instances, the fourth number represents the alpha (transparency) sample, and it will match the first number in the ratio. This annotation is less common, however.]
The J component indicates how many pixels wide the compression reference block is (in practice, almost always 4). This is also the number of luma samples in each row.
The a component indicates how many chroma samples are taken in the first row of the reference block.
The b component indicates how many additional chroma samples are taken in the second row of the reference block. When b is 0, the second row simply reuses the chroma samples from the first row.
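With those three components defined, you can do a back-of-the-envelope calculation of how much data each scheme needs relative to uncompressed 4:4:4. This is a sketch assuming the common case of a reference block J pixels wide and two rows tall; `relative_data_size` is a hypothetical helper, not part of any standard.

```python
# Rough data-size arithmetic for a J:a:b scheme, assuming a block
# J pixels wide and 2 rows tall (the common case).

def relative_data_size(j, a, b):
    """Samples stored per block, relative to 4:4:4.
    Luma: one sample per pixel across both rows.
    Chroma: (a + b) sampling positions, each a Cb/Cr pair."""
    luma = 2 * j
    chroma = 2 * (a + b)     # each chroma sample is a Cb/Cr pair
    full = 2 * j * 3         # 4:4:4 stores Y, Cb, and Cr for every pixel
    return (luma + chroma) / full

print(relative_data_size(4, 4, 4))   # 1.0   -> no savings
print(relative_data_size(4, 2, 2))   # ~0.667 -> roughly one-third smaller
print(relative_data_size(4, 2, 0))   # 0.5   -> half the data
```

These numbers line up with the savings quoted below: 4:2:2 trims the raw image data by about a third, and 4:2:0 cuts it in half.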
There are many different possible (and theoretical) chroma subsampling schemes, but we’ll only cover the three most common: 4:4:4, 4:2:2, and 4:2:0.
Technically, 4:4:4 means no chroma subsampling is taking place. Every pixel stores its own color information, so 100% of the captured color information is available in the image data.
This scheme is used in high-end film scanners and top-tier cinematic post-production workflows, where maximum image quality is necessary.
In this scheme, each horizontal pair of pixels shares chroma data. This means that 50% of all captured color information is replaced, which reduces the required bitrate by approximately one-third. Visually, however, the difference is barely noticeable.
Many high-end digital video formats and interfaces use this scheme, including the popular ProRes 422. As cameras capable of 10-bit image recording have entered the market, codecs that utilize 4:2:2 chroma subsampling have shifted from being a coveted, rare feature, to a standard for many professional productions.
For 4:2:0 chroma subsampling, only two pixels in the first row keep their chroma information, while the rest (two in the first row, and all four in the second row) copy these chroma values. This scheme replaces 75% of all captured chroma data, but allows for a 50% data savings in the bitstream.
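The 4:2:0 behavior just described can be sketched on a 2x4 block of pixels. As before, this is an illustration under assumed conventions (pixels as (Y, Cb, Cr) tuples, an even block width), and `subsample_420` is a hypothetical helper rather than how any real encoder is implemented.

```python
# A rough sketch of 4:2:0 on a block 4 pixels wide and 2 rows tall:
# chroma survives only at every other pixel of the top row, and is
# reused by that pixel's horizontal neighbor and by both pixels below.
# Assumption: pixels are (Y, Cb, Cr) tuples; block width is even.

def subsample_420(block):
    top, bottom = block
    out_top, out_bottom = [], []
    for i in range(0, len(top), 2):
        _, cb, cr = top[i]                 # the one surviving chroma sample
        for row_in, row_out in ((top, out_top), (bottom, out_bottom)):
            for j in (i, i + 1):
                y = row_in[j][0]           # luma is always kept per pixel
                row_out.append((y, cb, cr))
    return [out_top, out_bottom]

block = [
    [(50, 100, 120), (55, 130, 90), (60, 110, 140), (58, 105, 95)],
    [(52, 101, 119), (57, 131, 88), (61, 111, 141), (59, 104, 96)],
]
print(subsample_420(block))
```

Counting samples confirms the article’s figures: of the eight original chroma positions, only two survive (a 75% reduction in chroma data), while all eight luma values are kept, for half the total data of 4:4:4.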
4:2:0 is certainly a more aggressive compression scheme than the previous examples, but that doesn’t mean it’s bad. In fact, it’s far more common than the other two. Some of the most widely used codecs, like H.264, specify 4:2:0 subsampling. Even some high-quality delivery formats, like some Ultra HD Blu-ray discs, still use 4:2:0.
Your first thought might be that more color is always better. While this is true in terms of absolute visual quality, it may not hold for other considerations, like cost and performance. For many applications, 4:4:4 color is simply beyond the scope of technical necessity, and for some workflows it’s outright overkill. For example, most web video platforms (like YouTube) encode all content to 4:2:0. So, if you spent a lot of time and resources shooting, editing, and delivering in a 4:4:4 codec, your work will look almost the same as that of someone who worked entirely in 4:2:0.
However, high levels of chroma subsampling become an issue when you want to do large amounts of VFX work or chroma keying. If an image loses too much color information, these tasks become far more difficult, with far worse results. After all, your software can’t just invent extra chroma information to pull a clean key from a green screen. What’s in the image is all there is. So, if your workflow includes lots of compositing and color work, less chroma subsampling is definitely worth the extra data requirements. Just like with the other considerations for choosing a codec, the technical requirements of your workflow should be the deciding factor.