We are all familiar with the two digital audio file descriptors, sample rate and bit depth. Though these specifications seem routine, I often get questions from producers and engineers about the optimum settings for a given project. This article covers the basics and best practices for setting sample rates. In another article, we cover the ins and outs of bit depth.
Sample Rate Defined
Sample rate defines how many times per second we sample, or take a measurement of, an analog audio signal as it is converted into a digital signal. The sample rate also defines the high-frequency response of an audio recording. The Nyquist Theorem states that the highest audio frequency we can record is half of the frequency of the sampling rate. This means that with a sample rate of 44.1 kHz, we can record audio signals up to 22.05 kHz. Likewise, a 96 kHz sample rate allows for 48 kHz of audio bandwidth.
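The arithmetic above is simple enough to check in a few lines. This is only an illustrative sketch; the function name `nyquist_frequency` is made up for this example and does not come from any audio library.

```python
def nyquist_frequency(sample_rate_hz: float) -> float:
    """Return the highest frequency (in Hz) a given sample rate can represent."""
    return sample_rate_hz / 2.0


# Print the theoretical audio bandwidth for a few common sample rates.
for rate in (44_100, 48_000, 96_000):
    bandwidth_khz = nyquist_frequency(rate) / 1000
    print(f"{rate / 1000:g} kHz sample rate -> {bandwidth_khz:g} kHz of audio bandwidth")
```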
If we attempt to record audio frequencies above half the sample rate (also called the Nyquist Frequency), audible artifacts called aliasing can occur. Audio-frequency aliasing is much like the wagon-wheel effect seen in videos. When we film a wheel with spokes that starts to spin and its rotational speed increases, it begins to look like it slows and then spins backward. This effect happens when the wheel’s speed approaches the frame rate of the video. Audio aliases are frequencies that are reflected below the Nyquist frequency and sound like strange non-musical harmonics.
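To make the "reflection" idea concrete, here is a minimal sketch of where a pure tone would land if it were sampled with no anti-aliasing filter at all. The helper name `alias_frequency` is hypothetical, and the model assumes an ideal sampler and a single sine tone, which is a simplification of what real converters do.

```python
def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency (Hz) at which a pure tone appears after unfiltered sampling."""
    # Tones that are exactly one sample rate apart produce identical samples,
    # so first wrap the tone into the range 0..sample_rate.
    folded = tone_hz % sample_rate_hz
    nyquist = sample_rate_hz / 2.0
    # Anything in the upper half of that range is mirrored back below Nyquist.
    if folded > nyquist:
        folded = sample_rate_hz - folded
    return folded


# A 10 kHz tone is below Nyquist at 44.1 kHz and is recorded unchanged.
print(alias_frequency(10_000, 44_100))
# A 30 kHz tone is above Nyquist and reflects down to 14.1 kHz.
print(alias_frequency(30_000, 44_100))
```

The 14.1 kHz result has no harmonic relationship to the original 30 kHz tone, which is why aliases tend to sound non-musical.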
Analog-to-digital converters can apply a low-pass filter before sampling so that no audio above the Nyquist frequency reaches the sampling stage. This low-pass filter is referred to as an anti-aliasing filter. Unfortunately, low-pass filters have some side effects. If we apply a gentle low-pass filter to eliminate everything above, say, 22 kHz, we will also slightly reduce the level of audio as far as an octave below that, down to about 11 kHz. To avoid that, we could use a very steep low-pass filter, but steep filters create audible artifacts like ringing or phase shifts. Most modern A-D converters and interfaces actually sample at a very high sample rate and then downsample to the chosen sample rate to avoid the problems created by analog low-pass filters. The important thing to remember is that your system won’t record audio frequencies above half of the sample rate.
We know that human hearing reaches from about 20 Hz to 20 kHz, so why would we need sampling rates above 44.1 kHz? One answer is that many people, including scientists, claim that humans can perceive sounds as high as 50 kHz through bone conduction. That claim may theoretically be correct, but through air humans only hear up to about 20 kHz, so in a perfect world 20 kHz would be all the frequency range needed by humans. A more practical reason for different sample rates is that interfaces, A-D converters, and even plugins may sound different at different sample rates, depending on their architecture and how they deal with aliasing.
For example, plugins like limiters can create new high harmonics (distortion) that reach above the Nyquist frequency and could cause additional audible distortions or colorations. To prevent this, many plugins internally oversample the audio, that is, they process it at a multiple of the session’s sample rate so that any new high-frequency content does not create aliasing. The plugins then filter out the unwanted high frequencies and reduce the processed audio back to the original sample rate. Oversampling comes at the expense of CPU power and latency but can produce less colored processing. Next time you put a maximizer on your mix, audition the plugin with and without oversampling turned on to see if you hear a difference. Some processors do not allow the user to choose the oversampling setting, and we just have to listen carefully to decide if we like the resulting audio. Oversampling is especially important for plugins like compressors, limiters, saturators, and exciters, which inherently create harmonic distortion.
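A rough way to see why oversampling helps is to track where the harmonics of a single tone end up. The sketch below is an idealized model, not how any particular plugin works: the function names are hypothetical, the distortion is assumed to add only whole-number harmonics, and the downsampling filter is assumed to be a perfect brick wall at the session's Nyquist frequency.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency (Hz) at which a tone appears at the given sample rate."""
    folded = tone_hz % sample_rate_hz
    if folded > sample_rate_hz / 2:
        folded = sample_rate_hz - folded  # mirror back below Nyquist
    return folded


def harmonics_after_processing(fundamental_hz, session_rate_hz,
                               oversampling=1, num_harmonics=4):
    """Return (harmonic number, final frequency in Hz or None if filtered out)."""
    internal_rate = session_rate_hz * oversampling
    session_nyquist = session_rate_hz / 2
    results = []
    for n in range(1, num_harmonics + 1):
        # Where the harmonic lands at the plugin's internal processing rate.
        landed = alias_frequency(n * fundamental_hz, internal_rate)
        # When oversampling, the (idealized) filter removes everything above
        # the session's Nyquist frequency before downsampling.
        if oversampling > 1 and landed > session_nyquist:
            results.append((n, None))
        else:
            results.append((n, landed))
    return results


# 15 kHz tone in a 44.1 kHz session, no oversampling: harmonics 2 to 4 fold
# back into the audible band at unrelated frequencies.
print(harmonics_after_processing(15_000, 44_100, oversampling=1))
# Same tone with 4x oversampling: harmonics 2 to 4 stay above the session's
# Nyquist frequency and are filtered out instead of aliasing.
print(harmonics_after_processing(15_000, 44_100, oversampling=4))
```

Even in this simplified model, very high harmonics can still exceed the oversampled Nyquist frequency and fold back, which is one reason oversampling reduces aliasing rather than guaranteeing its absence.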
Standard Sample Rates
I have surveyed many professional producers, mixers, and mastering engineers, and most commercial top-40 records are recorded, mixed, and mastered at 44.1 kHz or 48 kHz. Professional studios use high-quality converters that sound great at all sample rates, and the main reason to stick to 44.1 or 48 kHz sampling is simply to conserve CPU power when mixing and processing. Many pop songs contain hundreds of audio tracks, and high sample rates could limit the ability to use CPU-intensive plugins. Music distributed on CD is 44.1 kHz, while music embedded in video is usually 48 kHz, so both formats are popular and completely acceptable.
High Sampling Rates
For audiophile recordings and sound design projects, I recommend the 96 kHz sample rate, mainly for practical reasons. First, this sample rate eliminates audible high-frequency aliasing and filter-induced distortions from A-D conversion and plugins and avoids the user having to decide when to turn on oversampling. Second, it may be surprising to learn that 96 kHz audio files also provide lower processing latency. Plugin latency is based on a certain number of samples regardless of sample rate, so at higher sample rates a given number of samples goes by quicker than at lower sample rates. This is why digital consoles for live sound often operate at 96 kHz.
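The latency point comes down to one division. In the sketch below, the 64-sample figure is a made-up example value, not a measurement of any specific plugin or console, and `latency_ms` is a hypothetical helper name.

```python
def latency_ms(latency_samples: int, sample_rate_hz: float) -> float:
    """Convert a latency expressed in samples into milliseconds."""
    return latency_samples / sample_rate_hz * 1000.0


# The same 64-sample delay takes half as long at 96 kHz as it does at 48 kHz.
for rate in (48_000, 96_000):
    print(f"64 samples at {rate / 1000:g} kHz = {latency_ms(64, rate):.2f} ms")
```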
As an added benefit to sound designers, 96 kHz allows audio to be pitch-shifted an octave down and still retain some high-frequency content, provided your mic and recording chain can capture up to at least 40 kHz. Imagine recording a sword fight for a video game or movie and pitch-shifting the metallic sounds down to a lower frequency to exaggerate their impact. If you were limited to 22 kHz audio (44.1 kHz sample rate) and shifted the recording down by an octave, there would be no audio left above about 11 kHz, so the pitch-shifted audio may sound dull or filtered.
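The effect of a downward pitch shift on bandwidth can be estimated with a short calculation. This sketch assumes the recording contains content all the way up to the Nyquist frequency, which real microphones and preamps may not deliver, and the function name is hypothetical.

```python
def bandwidth_after_pitch_shift(sample_rate_hz: float, semitones: float) -> float:
    """Highest frequency (Hz) left after pitch-shifting a full-bandwidth recording."""
    nyquist = sample_rate_hz / 2.0
    # Each semitone scales frequency by 2^(1/12); -12 semitones is one octave down.
    ratio = 2.0 ** (semitones / 12.0)
    # Shifting up cannot create content above Nyquist, so cap the result there.
    return min(nyquist, nyquist * ratio)


# One octave down: the top of the spectrum drops to half the Nyquist frequency,
# roughly 11 kHz for a 44.1 kHz recording and 24 kHz for a 96 kHz recording.
for rate in (44_100, 96_000):
    top_khz = bandwidth_after_pitch_shift(rate, -12) / 1000
    print(f"{rate / 1000:g} kHz recording, one octave down -> content up to {top_khz:g} kHz")
```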
Best Practices for Sample Rate
If you feel the need to record at sample rates above 96 kHz, you should spend a considerable amount of time auditioning your converters, DAW, and plugins to find a workflow that suits your purpose. Sample rates at or above 96 kHz will tax your CPU, may reduce your track count, and provide fewer plugin choices. I would generally recommend against working at 176.4 kHz or 192 kHz unless you have truly studied the pros and cons of those high sampling rates. For reference, the Recording Academy (the organization behind the Grammy Awards), in its Recommendations for Hi-Resolution Music Production document, proposes a minimum sample rate of 48 kHz and a preferred sample rate of 96 kHz for hi-res audio production and delivery.
Surprisingly, some well-regarded plugins only process audio up to 48 kHz, so in a session with a higher sample rate, any audio passed through these plugins will have its frequency response reduced. If you wish to work at sample rates of 96 kHz or higher, check with your favorite plugin manufacturers to see if they report any limitations.
Sample Rate Conversion
Sometimes sample rate conversion is unavoidable, and many software utilities provide excellent sample rate conversion. From a recent survey of mastering engineers, here are some recommended sample rate conversion programs: Voxengo r8brain, Weiss Saracon, Pro Tools SRC (using its TweakHead settings), iZotope RX 10 Resample, and the command-line utility SoX. Many other programs provide excellent results, and DAWs are continually improving their SRC algorithms.
Keep these issues in mind and choosing the best sample rate for your project shouldn’t be too difficult.