As you further develop your skills as an engineer you’ll eventually come across the word “dither.” It’s likely not a word you’ve ever heard before. At least other audio jargon like “bounce,” “compress,” and “sample” have homonyms used in everyday life. “Dither” does, in fact, have a standard English definition: to hesitate or waver. But the word is seldom used today. In audio terms, dithering means something else, though it relates to the English definition. So, what is audio dithering? And why is it an important step when mastering? Let’s find out.
What is Audio Dithering?
Audio dithering is the process by which low-level noise is added to a higher resolution digital audio signal before lowering its resolution for playback devices. Doing so keeps the rounding error (quantization error) from turning into audible truncation distortion. This might sound like gibberish, so let’s further explain what this all means. During recording and mixing, most DAWs will set you up at a 32- or 24-bit depth. These numbers refer to the audio resolution, or how precisely the level of each sample can be stored. Working at a higher bit depth gives you a lower noise floor and more usable dynamic range. So, you might wonder then why dithering would be necessary. The answer: exporting and mastering. In fact, dithering is the last step of the mastering process and should only be added when reducing the bit depth (for example, from 24-bit to 16-bit). Dithering at the wrong stage is frequently noted as a common mistake when mastering your own music.
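To put numbers on those bit depths, here is a small Python sketch. It assumes plain fixed-point (integer) samples, so it doesn't describe 32-bit floating-point audio, and the function names are made up for this illustration.

```python
import math

def quantization_levels(bits):
    """Number of distinct values a fixed-point sample of this bit depth can hold."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Theoretical ratio between full scale and a single step, in decibels."""
    return 20 * math.log10(quantization_levels(bits))

for bits in (16, 24):
    print(f"{bits}-bit: {quantization_levels(bits):,} levels, "
          f"about {dynamic_range_db(bits):.1f} dB")
# 16-bit: 65,536 levels, about 96.3 dB
# 24-bit: 16,777,216 levels, about 144.5 dB
```

Each extra bit doubles the number of levels, which works out to roughly 6 dB of range per bit. Going from 24-bit to 16-bit therefore means every sample has to land on a much coarser grid.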
Many digital audio playback formats (such as CDs) only offer a 16-bit resolution. This means your 24- or 32-bit mix has to cram itself into a smaller space. You have to export your digital audio to this format, but you don’t want to lose the detail and dynamics of your mix during conversion. Audio dithering when mastering adds a low-level noise to the mix. This noise randomizes how each sample gets rounded, so the rounding error no longer follows the music. In other words, it prevents truncation distortion, which occurs when audio data at a higher bit depth is cut down or rounded to a lower one. Dithering, then, allows the audio to keep its low-level detail during the translation from higher to lower resolution, at the cost of a faint, steady noise. Without dither, the song loses some of its dynamic range, and quieter moments (such as fade-ins and fade-outs) are noticeably crunched and digitized (see image below).
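If you’d like to see this in numbers, the following Python sketch may help. It is only an illustration under a few assumptions: samples are floats between -1.0 and 1.0, the target is 16-bit, and the dither is triangular noise spanning plus or minus one 16-bit step. The helper names are invented for this example and don’t come from any DAW or mastering tool.

```python
import math
import random

BITS = 16
SCALE = 2 ** (BITS - 1)   # 32768 steps on each side of zero
LSB = 1.0 / SCALE         # size of one 16-bit step

def quantize(x):
    """Round a sample in [-1.0, 1.0) to the nearest 16-bit step."""
    step = max(-SCALE, min(SCALE - 1, round(x * SCALE)))
    return step / SCALE

def add_tpdf_dither(x, rng):
    """Add triangular noise between -1 and +1 LSB (difference of two uniform draws)."""
    return x + (rng.random() - rng.random()) * LSB

def similarity(a, b):
    """Sum of products: clearly positive when a still follows the shape of b."""
    return sum(x * y for x, y in zip(a, b))

# A one-second 440 Hz tone whose peak is only 0.4 of a 16-bit step,
# like the tail end of a long fade-out.
tone = [0.4 * LSB * math.sin(2 * math.pi * 440 * n / 48000) for n in range(48000)]

rng = random.Random(0)  # fixed seed so the run is repeatable
plain = [quantize(x) for x in tone]
dithered = [quantize(add_tpdf_dither(x, rng)) for x in tone]

print(all(s == 0.0 for s in plain))     # True: the tone is rounded away entirely
print(similarity(dithered, tone) > 0)   # True: the tone survives underneath the noise
```

Without dither, every sample of this very quiet tone rounds to zero and the sound simply vanishes. With dither, individual samples are noisy, but on average they still trace the original waveform.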
A Bit on Noise Shaping
This all might seem counterintuitive at first. After all, dithering leaves you with some hissing background noise as a trade-off for the aforementioned truncation distortion. But the audibility of this noise can actually be reduced in the process via noise shaping. Noise shaping moves most of the noise’s energy into frequency ranges where our ears are less sensitive, reducing its audible presence. Dither options often come with names, such as Triangular, Rectangular, and POW-r. Triangular and Rectangular describe the statistical shape of the added noise, while POW-r is a family of noise-shaped dither algorithms, so each one offers a different level and shape of dither. Regardless of your choice, you won’t remove all of this noise, as it’s still necessary when converting to a lower resolution. Still, you can try different options and shapes to find the best compromise.
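To get a feel for what “shaping” means, here is a rough Python sketch of the simplest textbook approach, first-order error feedback, where each sample’s rounding error is subtracted from the next sample. This is an assumption chosen for illustration: it is not POW-r or the algorithm of any particular product, and real noise shapers use more elaborate filters tuned to human hearing. All names below are hypothetical.

```python
import random

BITS = 16
SCALE = 2 ** (BITS - 1)
LSB = 1.0 / SCALE

def tpdf(rng):
    """Triangular dither between -1 and +1 LSB."""
    return (rng.random() - rng.random()) * LSB

def quantize(x):
    """Round to the nearest 16-bit step (no clipping; inputs here are very quiet)."""
    return round(x * SCALE) / SCALE

def plain_dither(samples, seed=0):
    rng = random.Random(seed)
    return [quantize(x + tpdf(rng)) for x in samples]

def shaped_dither(samples, seed=0):
    rng = random.Random(seed)
    out = []
    previous_error = 0.0
    for x in samples:
        target = x - previous_error       # feed the last sample's error back in
        y = quantize(target + tpdf(rng))
        previous_error = y - target       # error made on this sample
        out.append(y)
    return out

def low_frequency_noise(original, processed, block=100):
    """Crude low-frequency measure: sum of squared block averages of the error."""
    errors = [p - o for o, p in zip(original, processed)]
    total = 0.0
    for start in range(0, len(errors) - block + 1, block):
        mean = sum(errors[start:start + block]) / block
        total += mean * mean
    return total

silence = [0.0] * 48000
plain_noise = low_frequency_noise(silence, plain_dither(silence))
shaped_noise = low_frequency_noise(silence, shaped_dither(silence))
print(shaped_noise < plain_noise)  # True: less slow-moving noise after shaping
```

The noise hasn’t gone anywhere. The feedback loop just makes consecutive errors cancel each other out over time, which pushes the noise toward higher frequencies and away from the lows.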
Understand Audio Dithering Using Images
Even with this explanation, audio dithering can be difficult to grasp conceptually. Fortunately, dither is used in other digital formats as well, such as visual media. These days, most of us are familiar with HD TVs, so we’ll use them as an example. Millions of pixels make up the image on your TV screen. Each pixel (like an audio sample) contains a piece of visual information. When all this information comes together, it creates a much larger, crisp image. Now, what if we took that detailed image information and crammed it into a smaller, lower-quality space (i.e. one where each pixel can show far fewer shades and colors)? Mathematically speaking, some of the information must get lost. The question is, which information, how much of it, and what does the resultant image look like?
This once high-quality image (left) now finds itself truncated into a smaller space (middle). Certain important pixels disappear while others blend together into a single blob. Suddenly, the image might be unrecognizable. This is where dithering saves the day. What if, instead of cramming the high-def image into a low-def screen, we altered or scrambled the data, adding random pixel variation to the image first? Instead of destroying chunks of pixels or blocking them into rigid sections, a random selection of pixels across the image gets left behind. As a result, much of the digital image (or what our brains could process as the image) remains intact (right). The same amount of data has been removed, but this variation allows enough important detail to remain.
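The same idea fits in a few lines of Python. This sketch assumes a patch of grayscale pixels with brightness between 0.0 (black) and 1.0 (white) that must be reduced to pure black or white. The function names are made up, and the random threshold is just the simplest form of dithering to show, not necessarily what a given image editor does.

```python
import random

def hard_threshold(pixels):
    """No dither: every pixel simply snaps to black (0) or white (1)."""
    return [1 if p >= 0.5 else 0 for p in pixels]

def random_dither(pixels, seed=0):
    """Compare each pixel against a random threshold instead of a fixed one."""
    rng = random.Random(seed)
    return [1 if p > rng.random() else 0 for p in pixels]

def average(pixels):
    return sum(pixels) / len(pixels)

dark_gray = [0.25] * 10000  # a flat patch of dark gray

print(average(hard_threshold(dark_gray)))  # 0.0: the whole patch turns black
print(average(random_dither(dark_gray)))   # close to 0.25: a sprinkle of white pixels
```

From a distance, the sprinkle of white pixels reads as dark gray, even though each pixel is only black or white. That is the visual counterpart of a quiet sound surviving beneath the dither noise.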
Conclusion: Simplifying the Complex
It’s challenging to define audio dithering without swiftly diving into the technical. If you’re proficient in the ins and outs of bit-depth and sample rates, then you shouldn’t struggle too much with understanding audio dithering. On the other hand, truly grasping the concept requires some abstraction. And most people, not so familiar with digital audio processing, will only understand dithering in these abstract terms. Ironically enough, the process of audio dithering essentially does what we’ve just attempted to do: to break down complex data into simpler, more digestible information.
So that’s audio dithering! If you’re still confused, take some time to think about the concept in the abstract and eventually it will click. It also helps to look at and listen to examples of dithering online to solidify your understanding. If you’re using MasteringBOX to master your tracks, remember that the algorithm adds dithering when rendering in 16-bit resolution. Don’t waver, dither!