Understanding Dithering in Audio & Its Importance in Mastering

As you further develop your skills as an engineer, you’ll eventually come across the word “dither.” It’s likely not a word you’ve ever heard before. At least other audio jargon like “bounce,” “compress,” and “sample” have homonyms used in everyday life. “Dither” does, in fact, have a standard English definition: to hesitate or waver. However, the word is seldom used today. So, what is dithering? And why is it an important step when mastering? Let’s find out.

What is Dithering in Audio?
Understand Dithering Using Images
Types of Dithering
Relationship Between Bit Depth and Sample Rate
Some Facts about Dithering
Common Myths About Dithering
Conclusion: Simplifying the Complex

What is Dithering in Audio?

Dithering adds low-level noise to a higher-resolution digital audio signal before its resolution is lowered for playback devices. This process reduces the audible effects of quantization error, also called truncation distortion. This might sound like gibberish, so let’s explain what it all means. During recording and mixing, most DAWs set you up at a 32- or 24-bit depth. These numbers refer to the audio resolution, or how much information can fit into each sample. Recording at a higher bit depth gives you more dynamic range and a lower noise floor. So, you might wonder why you need dithering. The answer lies in exporting and mastering. Dithering is the last step of the mastering process, and you should only add it when reducing bit depth. Adding it at the wrong stage, or more than once, is a common mistake when mastering your own music.

Many digital audio playback formats (such as CDs) offer only 16-bit resolution. This means your 24- or 32-bit mix has to fit into a smaller space. You must export your digital audio to this format, but you don’t want to lose the detail and dynamics of your mix during conversion. Dithering during mastering adds low-level noise to the mix. This noise randomizes the quantization error. In other words, it prevents truncation distortion, which occurs when audio data at a higher bit depth is rounded or cut down to a lower one. Dithering allows the audio to keep its low-level detail during the transition from higher to lower resolution. Without dither, the song loses some of that detail, and quieter moments (such as fade-ins and fade-outs) become noticeably crunched and digitized (see image below).
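To make the paragraph above concrete, here is a minimal Python sketch, not a production dither. It assumes the signal is expressed in units of one 16-bit step, that the converter rounds to the nearest step, and that the dither is triangular (TPDF) noise spanning plus or minus one step. The names `quantize` and `recovered_amplitude`, the 1 kHz tone, and the 0.4-step amplitude are all made up for illustration.

```python
import math
import random

def quantize(x, rng=None):
    """Round x (in units of one 16-bit step) to the nearest whole step.

    If rng is given, TPDF dither spanning +/-1 step is added before rounding.
    """
    if rng is not None:
        # The difference of two uniform values has a triangular distribution.
        x += rng.random() - rng.random()
    return math.floor(x + 0.5)

SAMPLE_RATE = 48000   # assumed values, chosen only for illustration
FREQ = 1000
AMPLITUDE = 0.4       # quieter than half a step of the 16-bit target

sine = [math.sin(2 * math.pi * FREQ * i / SAMPLE_RATE) for i in range(SAMPLE_RATE)]
signal = [AMPLITUDE * s for s in sine]

rng = random.Random(0)  # fixed seed so the run is repeatable
plain = [quantize(x) for x in signal]
dithered = [quantize(x, rng) for x in signal]

def recovered_amplitude(output):
    """Estimate how much of the original sine survives in the output."""
    return 2 * sum(o * s for o, s in zip(output, sine)) / len(output)

print("without dither:", recovered_amplitude(plain))
print("with dither:   ", recovered_amplitude(dithered))
```

In this sketch the undithered output is all zeros, so the quiet tone simply disappears. The dithered output is noisy, but on average it still carries the tone at close to its original 0.4-step level.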

A Bit on Noise Shaping

This all might seem counterintuitive at first. After all, dithering leaves you with some hissing background noise as a trade-off for the aforementioned truncation distortion. However, this noise can be made less audible via noise shaping. Noise shaping filters the noise so that more of its energy sits in frequency ranges where the ear is less sensitive, reducing its audible presence. Dither options sometimes come with names such as Triangular, Rectangular, and POW-r. The first two describe the statistical distribution of the dither noise, while POW-r is a family of noise-shaped dither algorithms. Regardless of your choice, you won’t remove all of this noise, as it’s still necessary when converting to a lower resolution. Still, you can try different options and shapes to find the best compromise.
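As a rough illustration of noise shaping, the sketch below uses first-order error feedback: the error made on one sample is subtracted from the next sample before it is quantized. This is the simplest textbook form and is only an assumption for demonstration. Commercial options such as POW-r use their own, more elaborate filters, which are not reproduced here. The helper names and the 440 Hz test tone are hypothetical.

```python
import math
import random

def requantize(samples, shape=False, seed=0):
    """Round samples (in units of one output step) with TPDF dither.

    With shape=True, the previous sample's error is subtracted from the
    next input (first-order error feedback).
    """
    rng = random.Random(seed)
    out, err = [], 0.0
    for x in samples:
        target = x - err if shape else x
        dither = rng.random() - rng.random()
        y = math.floor(target + dither + 0.5)
        err = y - target          # total error made on this sample
        out.append(y)
    return out

def band_powers(noise, block=32):
    """Return (total power, low-frequency power) of a noise sequence.

    Low-frequency power is estimated by averaging over blocks, which acts
    as a crude low-pass filter.
    """
    total = sum(v * v for v in noise) / len(noise)
    means = [sum(noise[i:i + block]) / block
             for i in range(0, len(noise) - block + 1, block)]
    low = sum(m * m for m in means) / len(means)
    return total, low

signal = [100 * math.sin(2 * math.pi * 440 * i / 44100) for i in range(32000)]
flat = [y - x for x, y in zip(signal, requantize(signal))]
shaped = [y - x for x, y in zip(signal, requantize(signal, shape=True))]

flat_total, flat_low = band_powers(flat)
shaped_total, shaped_low = band_powers(shaped)
print("plain dither: total", round(flat_total, 3), "low band", round(flat_low, 5))
print("noise shaped: total", round(shaped_total, 3), "low band", round(shaped_low, 5))
```

The shaped version carries more noise overall, but far less of it in the low band. That is the trade the text describes: the noise is not removed, only moved toward frequencies where it is harder to hear.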

Understand Dithering Using Images

Even with this explanation, many find it difficult to grasp dithering in audio conceptually. Fortunately, other digital formats use dither too, including visual media. Today, most of us are familiar with HD TVs, so let’s use them as an example. Millions of pixels make up the image on your TV screen. Each pixel (like an audio sample) holds a piece of visual information. When all this information combines, it creates a much larger, crisp image. Now, what happens if we take that detailed image information and cram it into a smaller, lower-quality space (i.e., one with fewer pixels)? Mathematically speaking, some of the information will get lost. The question is, which information, how much of it, and what does the resultant image look like?

This once high-quality image (left) now truncates into a smaller space (middle). Certain important pixels disappear while others blend together into a single blob. Suddenly, the image becomes unrecognizable. This is where dithering saves the day. What if, instead of cramming the high-def image straight into a low-def screen, we first added random pixel variation to the image? Instead of destroying chunks of pixels or blocking them into rigid sections, we leave behind a random selection of pixels across the image. As a result, much of the digital image (or what our brains could process as the image) remains intact (right). The same amount of data has been removed, but this variation allows enough important detail to remain.
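The same idea can be sketched in a few lines of Python. This example makes one assumption that differs slightly from the TV analogy: the quality loss comes from allowing fewer shades per pixel (here just black or white), which is a common form of image dithering, rather than from having fewer pixels. The function `to_one_bit` and the flat gray patch are invented for illustration.

```python
import random

def to_one_bit(pixels, dither=False, seed=0):
    """Reduce gray values in [0, 1] to pure black (0) or white (1)."""
    rng = random.Random(seed)
    out = []
    for gray in pixels:
        # Without dither every pixel faces the same fixed threshold.
        # With dither each pixel gets its own random threshold.
        threshold = rng.random() if dither else 0.5
        out.append(1 if gray > threshold else 0)
    return out

def mean(values):
    return sum(values) / len(values)

patch = [0.25] * 10000            # a flat, dark-gray area (hypothetical data)
plain = to_one_bit(patch)
dithered = to_one_bit(patch, dither=True)

print("original brightness:", mean(patch))
print("without dither:     ", mean(plain))
print("with dither:        ", mean(dithered))
```

Without dither the whole patch collapses to black. With dither roughly a quarter of the pixels turn white, so from a distance the patch still reads as dark gray even though every individual pixel is only black or white.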

Types of Dithering

Dithering comes in various forms, each tailored to address specific needs in digital audio processing. While the goal is always to minimize distortion and improve sound quality during bit depth reduction, the approach varies depending on the type of dithering used. Factors like noise distribution, frequency response, and application context determine the most suitable option. Below, we break down the most common types of dithering, highlighting their unique characteristics and ideal use cases:

Rectangular Probability Density Function (RPDF) Dither: This type uses uniformly distributed noise, where all values within a specific range have the same probability of occurrence. It’s simple to implement but may not be as effective in eliminating audible distortions compared to other types.
Triangular Probability Density Function (TPDF) Dither: Generated by summing two independent RPDF sources, this dithering produces a triangular distribution where central values have a higher probability. It’s widely used due to its ability to minimize noise modulation and more effectively eliminate harmonic distortion.
Gaussian Dither: Characterized by a normal or bell-shaped distribution, this type resembles noise generated by analog sources like microphone preamplifiers. While useful in certain applications, it generally requires higher levels of added noise to completely eliminate audible distortions.
Noise-Shaped Dither: This advanced form involves filtering the dithering noise to redistribute its spectral energy, typically shifting it towards frequencies less perceptible to the human ear. It’s especially beneficial in applications aiming to maintain perceived audio quality, though its implementation is more complex.
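The first three types in the list above differ only in how the random values are distributed, which is easy to see in code. The following is a minimal sketch using Python’s standard `random` module, with amplitudes in units of one quantization step. The widths chosen here (a one-step-wide RPDF, and a Gaussian with a standard deviation of half a step) are illustrative assumptions, not settings taken from any particular DAW.

```python
import random

def rpdf(rng):
    """Rectangular dither: every value in the range is equally likely."""
    return rng.uniform(-0.5, 0.5)

def tpdf(rng):
    """Triangular dither: the sum of two independent rectangular sources."""
    return rpdf(rng) + rpdf(rng)

def gaussian(rng, sigma=0.5):
    """Gaussian dither: bell-shaped, with no hard upper or lower limit."""
    return rng.gauss(0.0, sigma)

def variance(values):
    avg = sum(values) / len(values)
    return sum((v - avg) ** 2 for v in values) / len(values)

rng = random.Random(0)   # fixed seed so the run is repeatable
count = 100_000
samples = {
    "RPDF": [rpdf(rng) for _ in range(count)],
    "TPDF": [tpdf(rng) for _ in range(count)],
    "Gaussian": [gaussian(rng) for _ in range(count)],
}

for name, values in samples.items():
    peak = max(abs(v) for v in values)
    print(name, "noise power:", round(variance(values), 3), "peak:", round(peak, 2))
```

With these widths, TPDF carries about twice the noise power of RPDF (roughly 1/6 versus 1/12 of a step squared), which is the price paid for avoiding noise modulation. Noise-shaped dither is left out because it also needs a filter.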

Relationship Between Bit Depth and Sample Rate

The quality of digital audio is determined by two fundamental parameters: sample rate and bit depth. It’s important to note that while higher sample rates and bit depths can offer greater theoretical quality, they also lead to larger file sizes and increased processing demands. Therefore, finding an appropriate balance based on the specific needs of each project is essential.

Sample Rate

Sample rate refers to the number of audio samples taken per second, measured in hertz (Hz) and usually quoted in kilohertz (kHz). According to the Nyquist sampling theorem, to accurately reproduce a signal, the sample rate must be more than twice the highest frequency present in the signal. For instance, a 44.1 kHz sample rate is sufficient to capture the human audible spectrum, which extends up to approximately 20 kHz.
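A small worked example shows what the Nyquist limit means in practice. The sketch below assumes an ideal sampler with no anti-aliasing filter in front of it. The function names are made up for illustration.

```python
def nyquist_frequency(sample_rate):
    """Highest frequency a given sample rate can represent."""
    return sample_rate / 2

def alias_frequency(freq, sample_rate):
    """Frequency at which a pure tone shows up after sampling."""
    nearest_multiple = sample_rate * round(freq / sample_rate)
    return abs(freq - nearest_multiple)

print("Nyquist limit at 44.1 kHz:", nyquist_frequency(44100), "Hz")   # 22050.0 Hz
for tone in (1000, 20000, 30000):
    print(tone, "Hz is captured as", alias_frequency(tone, 44100), "Hz")
# 1000 Hz and 20000 Hz come through unchanged, while 30000 Hz folds down to 14100 Hz
```

Tones below the 22.05 kHz limit keep their frequency. A 30 kHz tone is above the limit, so it would be misread as a 14.1 kHz tone, which is why converters filter out such content before sampling.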

Bit Depth

Bit depth indicates the number of bits used to represent the amplitude of each sample, directly affecting the audio’s dynamic range. A higher bit depth allows for a more precise representation of amplitude variations, resulting in more detailed sound with less quantization noise. Each additional bit adds roughly 6 dB. For example, a 16-bit depth offers a theoretical dynamic range of about 96 dB, while 24 bits provide about 144 dB.
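The dynamic range figures above can be checked with a short calculation. This sketch assumes plain integer (fixed-point) samples and uses the common rule that dynamic range is the ratio between full scale and one quantization step, expressed in decibels. The function name is hypothetical.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of integer PCM audio at a given bit depth."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(bits, "bits:", round(dynamic_range_db(bits), 1), "dB")
# 16 bits: 96.3 dB
# 24 bits: 144.5 dB
```

The difference between the two, about 48 dB, is the extra resolution that has to be folded into 16 bits at export time.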

Some Facts about Dithering

Different Digital Audio Workstations (DAWs) utilize various dithering algorithms and offer distinct options for noise shaping. It’s crucial to understand how your specific DAW handles dithering and to select the appropriate settings that align with your project’s requirements.

Applying dithering multiple times can accumulate noise and degrade audio quality. It is best practice to apply dither only once, typically at the final stage of bit depth reduction during mastering, to preserve the integrity of the audio signal.
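The build-up of noise from repeated dithering can be estimated with a quick simulation. This is a sketch under simple assumptions: every pass uses TPDF dither spanning plus or minus one step, rounds to the nearest step, and targets the same bit depth. The helper names and the 440 Hz test tone are invented for illustration.

```python
import math
import random

def dither_pass(samples, rng):
    """One requantization pass: add TPDF dither, then round to whole steps."""
    return [math.floor(x + rng.random() - rng.random() + 0.5) for x in samples]

def noise_power(original, processed):
    """Mean squared difference between the processed and original signal."""
    return sum((p - o) ** 2 for o, p in zip(original, processed)) / len(original)

rng = random.Random(0)   # fixed seed so the run is repeatable
original = [100 * math.sin(2 * math.pi * 440 * i / 44100) for i in range(20000)]

once = dither_pass(original, rng)

four_times = original
for _ in range(4):
    four_times = dither_pass(four_times, rng)

power_once = noise_power(original, once)
power_four = noise_power(original, four_times)
print("noise power after 1 pass:  ", round(power_once, 2))
print("noise power after 4 passes:", round(power_four, 2))
print("increase in dB:", round(10 * math.log10(power_four / power_once), 1))
```

Each pass adds roughly the same amount of noise power, so four passes leave about four times the noise of one, an increase of around 6 dB.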

Even in high-resolution audio, dithering helps minimize the audible effects of quantization error. Any reduction in bit depth, for example from a 32-bit mix to a 24-bit master, requantizes the signal, and dither keeps the resulting error from turning into distortion, so the audio maintains its clarity and detail.

Common Myths About Dithering

Dithering is a fundamental technique in digital audio production, yet it is often surrounded by misconceptions that can lead to improper application and affect the quality of the final product. Understanding the truth behind these myths is essential for audio engineers and producers to make informed decisions during the mixing and mastering processes. Here are some common myths about dithering:

One myth is that dithering only matters when going from 24-bit to 16-bit. While dithering is commonly applied during that kind of bit depth reduction, it is also beneficial when processing audio at higher bit depths. Dithering helps distribute quantization errors more evenly, maintaining audio transparency even at higher resolutions.

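Another myth is that dithering degrades your audio by adding noise. In reality, dithering introduces controlled, low-level noise designed to replace quantization distortion with a steady, benign noise floor. When implemented correctly, this noise is practically inaudible at normal listening levels and serves to improve overall sound quality by preventing audible artifacts during bit depth reduction.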

Conclusion: Simplifying the Complex

It’s hard to define dithering in audio without quickly diving into the technical aspects. If you understand the ins and outs of bit depth and sample rates, you shouldn’t struggle too much with understanding audio dithering. However, truly grasping the concept requires some abstraction, and most people not familiar with digital audio processing will only comprehend dithering in these abstract terms. Ironically, the process of dithering essentially aims to break down complex data into simpler, more digestible information.

So that’s dithering in audio! If you still feel confused, take some time to contemplate the concept in the abstract, and eventually, it will click. It also helps to explore and listen to examples of dithering online to solidify your understanding. If you use MasteringBOX to master your tracks, remember that the algorithm adds dithering when rendering in 16-bit resolution. Don’t waver, dither!

About the Author

Ethan Keeley

Writer, Voice Talent, Musician, and Audio Editor

Ethan Keeley is a musician, voiceover talent, and writer from Rochester, New York. When he's not on tour with his band Unwill, he's working on new songs and stories.