7.4 The DFT, sampling, and aliasing

Real-world signals live in computers. An audio buffer is a finite array of numbers; a sampled microphone signal is a sequence of measurements at evenly-spaced times. The continuous Fourier transform of 7.2 is the right mathematical object for thinking about signals abstractly, but the computational object is the discrete Fourier transform (DFT), which acts on a finite sequence of samples.

This lesson develops the bridge from continuous to discrete. Three connected ideas: sampling theory (when do samples suffice to represent a signal?), aliasing (what goes wrong when they don’t?), and the DFT itself (how to compute the spectrum of a sampled signal).

Sampling a continuous signal

To represent a continuous signal f(t)f(t) in a computer, we sample it at evenly spaced times tn=nΔtt_n = n \Delta t:

fn    f(nΔt),n=0,1,2,,N1.f_n \;\equiv\; f(n \Delta t), \qquad n = 0, 1, 2, \ldots, N - 1.

The sample rate is fs=1/Δtf_s = 1/\Delta t (in samples per second, often denoted Hz). For CD-quality audio, fs=44100f_s = 44100 Hz; for telephony, 8000 Hz; for high-end pro audio, 96000 Hz or 192000 Hz.

The fundamental question: under what conditions can we recover the continuous signal f(t)f(t) from the discrete samples {fn}\{f_n\}? The answer is Shannon’s sampling theorem.

The Shannon sampling theorem

Theorem. If a signal f(t)f(t) contains no frequencies above ωmax\omega_{\max} (i.e. f~(ω)=0\tilde f(\omega) = 0 for ω>ωmax|\omega| > \omega_{\max}), then ff can be uniquely reconstructed from samples taken at any rate

fs    ωmaxπ  =  2fmax,f_s \;\geq\; \frac{\omega_{\max}}{\pi} \;=\; 2\, f_{\max},

where fmax=ωmax/(2π)f_{\max} = \omega_{\max} / (2\pi) is the maximum frequency in Hz.

The critical sampling rate fs=2fmaxf_s = 2 f_{\max} is the Nyquist rate. Half of fsf_s is the Nyquist frequency. Sample any signal containing only frequencies below the Nyquist frequency, and the samples carry all the information needed to reconstruct the continuous signal.

Reconstruction is by the Whittaker–Shannon interpolation formula:

f(t)  =  nfnsinc ⁣(tnΔtΔt),f(t) \;=\; \sum_{n} f_n\, \mathrm{sinc}\!\left( \frac{t - n \Delta t}{\Delta t} \right),

where sinc(x)=sin(πx)/(πx)\mathrm{sinc}(x) = \sin(\pi x) / (\pi x) is the normalised sinc function. This is the exact reconstruction when the signal is band-limited and the sampling theorem is satisfied. In practice approximate sinc-interpolation is used, since perfect sinc has infinite support.

Aliasing

What happens when the sampling rate is below the Nyquist rate — i.e. when the signal contains frequencies above fs/2f_s / 2? The high-frequency content aliases down to a lower apparent frequency that the samples can represent.

Concretely: a sinusoid at frequency f0>fs/2f_0 > f_s / 2 produces samples that are indistinguishable from samples of a sinusoid at the aliased frequency

falias  =  f0nfsf_\text{alias} \;=\; |f_0 - n f_s|

for whichever integer nn makes the result fall in [0,fs/2][0, f_s/2]. The two signals — high-frequency original, low-frequency alias — produce identical sample sequences. Without additional information, the system has no way to distinguish them, and any reconstruction from the samples will produce the alias.

time domain — signal at f₀ = 3.0 Hz, sampled at fs = 10.0 Hzt (s)Nyquist fs/2 = 5.0f₀ = 3.0frequency (Hz)

Sampling is adequate. The signal at f₀ = 3.0 Hz is below the Nyquist frequency fs/2 = 5.0 Hz. The samples uniquely determine the signal — the Shannon sampling theorem guarantees perfect reconstruction. Push f₀ above 5.0 Hz and watch the reconstructed alias diverge from the truth.

Drag the signal frequency f0f_0 past the Nyquist frequency fs/2f_s / 2. The original (light) and the aliased reconstruction (heavy) diverge; the dots (samples) sit on both curves equally. The bottom strip shows the frequency-domain picture: the spectral content at f0f_0 “folds” around the Nyquist line and reappears at faliasf_\text{alias}.

Aliasing is not recoverable by any algorithm. Information about frequencies above the Nyquist is permanently lost in the sampling. The only defence is an anti-aliasing filter: an analog low-pass filter applied before sampling that removes content above the Nyquist frequency. Every analog-to-digital converter contains one.

The discrete Fourier transform (DFT)

The DFT acts on a finite sequence of NN samples {xn}n=0N1\{x_n\}_{n=0}^{N-1} and produces NN complex numbers {Xk}k=0N1\{X_k\}_{k=0}^{N-1}:

  Xk  =  n=0N1xnei2πkn/N,k=0,1,,N1.  \boxed{\;X_k \;=\; \sum_{n=0}^{N-1} x_n\, e^{-i\, 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1.\;}

The inverse:

xn  =  1Nk=0N1Xkei2πkn/N.x_n \;=\; \frac{1}{N} \sum_{k=0}^{N-1} X_k\, e^{i\, 2\pi k n / N}.

The DFT is the Fourier transform restricted to a finite, sampled, periodic context. Notice the parallels:

That last point is important. The DFT treats the NN-sample buffer as one period of a periodic signal. If you apply the DFT to a signal that isn’t really periodic (e.g. a sound clip with non-matching endpoints), the implicit periodisation produces a jump at the seam, which produces spurious high-frequency content in the spectrum — spectral leakage.

Windowing and spectral leakage

The standard fix for spectral leakage is to multiply the time-domain signal by a window function w(t)w(t) that smoothly tapers to zero at the buffer endpoints before applying the DFT. Common windows:

Choosing a window is the central design decision of spectrogram analysis. A narrow window gives sharp time resolution but blurry frequency; a wide window gives sharp frequency but blurry time. This is the uncertainty principle from 7.2, expressed in the design choice of ww.

The FFT

A naive DFT computation costs O(N2)\mathcal{O}(N^2) operations — for N=106N = 10^6, about 101210^{12} complex multiplications. The Fast Fourier Transform (FFT) algorithm computes the same DFT in O(NlogN)\mathcal{O}(N \log N) operations, a factor-of-NN speedup. For N=106N = 10^6 that’s 2×1072 \times 10^7 operations — six orders of magnitude faster.

The FFT is the algorithm behind every spectrogram, every audio plug-in, every MRI reconstruction, every JPEG-2000 compression. We develop the Cooley–Tukey radix-2 FFT in detail in Numerical Methods 10.5; here the relevant fact is that the FFT computes the DFT exactly (modulo floating-point roundoff), just faster.

A historical aside

The FFT was rediscovered by James Cooley and John Tukey in 1965, and the Cooley–Tukey paper is widely credited as launching modern digital signal processing. But the same algorithm had been written out by Gauss in 1805, in a notebook never published in his lifetime. Gauss wanted to fit a trigonometric series to astronomical data; the manuscript languished in his archives until the 20th century. The rediscovery was independent and far more consequential — Cooley and Tukey’s paper sparked a multi-decade build-out of digital audio, telecommunications, and imaging — but historically, Gauss had the algorithm 160 years earlier.

What we use this for

Discrete-time Fourier analysis powers nearly all signal-processing on the bookshelf:

Closing the chapter

That closes Foundations 7. The Fourier transform is one mathematical idea applied at four different scales: as a series for periodic signals, as a transform for transients, as a convolution-theorem identity for linear-systems theory, and as a DFT for digital implementation. The four pictures are different views of the same underlying object — the eigenbasis of the differentiation operator, lifted to a function space — and the practical signal-processing world runs on all four simultaneously.