7.4 The DFT, sampling, and aliasing
Real-world signals live in computers. An audio buffer is a finite array of numbers; a sampled microphone signal is a sequence of measurements at evenly-spaced times. The continuous Fourier transform of 7.2 is the right mathematical object for thinking about signals abstractly, but the computational object is the discrete Fourier transform (DFT), which acts on a finite sequence of samples.
This lesson develops the bridge from continuous to discrete. Three connected ideas: sampling theory (when do samples suffice to represent a signal?), aliasing (what goes wrong when they don’t?), and the DFT itself (how to compute the spectrum of a sampled signal).
Sampling a continuous signal
To represent a continuous signal in a computer, we sample it at evenly spaced times :
The sample rate is (in samples per second, often denoted Hz). For CD-quality audio, Hz; for telephony, 8000 Hz; for high-end pro audio, 96000 Hz or 192000 Hz.
The fundamental question: under what conditions can we recover the continuous signal from the discrete samples ? The answer is Shannon’s sampling theorem.
The Shannon sampling theorem
Theorem. If a signal contains no frequencies above (i.e. for ), then can be uniquely reconstructed from samples taken at any rate
where is the maximum frequency in Hz.
The critical sampling rate is the Nyquist rate. Half of is the Nyquist frequency. Sample any signal containing only frequencies below the Nyquist frequency, and the samples carry all the information needed to reconstruct the continuous signal.
Reconstruction is by the Whittaker–Shannon interpolation formula:
where is the normalised sinc function. This is the exact reconstruction when the signal is band-limited and the sampling theorem is satisfied. In practice approximate sinc-interpolation is used, since perfect sinc has infinite support.
Aliasing
What happens when the sampling rate is below the Nyquist rate — i.e. when the signal contains frequencies above ? The high-frequency content aliases down to a lower apparent frequency that the samples can represent.
Concretely: a sinusoid at frequency produces samples that are indistinguishable from samples of a sinusoid at the aliased frequency
for whichever integer makes the result fall in . The two signals — high-frequency original, low-frequency alias — produce identical sample sequences. Without additional information, the system has no way to distinguish them, and any reconstruction from the samples will produce the alias.
Sampling is adequate. The signal at f₀ = 3.0 Hz is below the Nyquist frequency fs/2 = 5.0 Hz. The samples uniquely determine the signal — the Shannon sampling theorem guarantees perfect reconstruction. Push f₀ above 5.0 Hz and watch the reconstructed alias diverge from the truth.
Drag the signal frequency past the Nyquist frequency . The original (light) and the aliased reconstruction (heavy) diverge; the dots (samples) sit on both curves equally. The bottom strip shows the frequency-domain picture: the spectral content at “folds” around the Nyquist line and reappears at .
Aliasing is not recoverable by any algorithm. Information about frequencies above the Nyquist is permanently lost in the sampling. The only defence is an anti-aliasing filter: an analog low-pass filter applied before sampling that removes content above the Nyquist frequency. Every analog-to-digital converter contains one.
The discrete Fourier transform (DFT)
The DFT acts on a finite sequence of samples and produces complex numbers :
The inverse:
The DFT is the Fourier transform restricted to a finite, sampled, periodic context. Notice the parallels:
- The continuous transform integrates over all time; the DFT sums over samples.
- The continuous transform produces a continuous function of ; the DFT produces values at discrete frequencies .
- The continuous transform applies to signals that go to zero at infinity; the DFT implicitly assumes the signal is periodic with period .
That last point is important. The DFT treats the -sample buffer as one period of a periodic signal. If you apply the DFT to a signal that isn’t really periodic (e.g. a sound clip with non-matching endpoints), the implicit periodisation produces a jump at the seam, which produces spurious high-frequency content in the spectrum — spectral leakage.
Windowing and spectral leakage
The standard fix for spectral leakage is to multiply the time-domain signal by a window function that smoothly tapers to zero at the buffer endpoints before applying the DFT. Common windows:
- Rectangular — the unwindowed case. Maximum spectral leakage.
- Hann — smooth cosine taper. Standard for spectral analysis.
- Hamming — minor variant of Hann with slightly better far-field rejection.
- Blackman — more aggressive taper for very-low-leakage spectra.
- Tukey, Kaiser, Gaussian, Bartlett — various trade-offs of main-lobe width against side-lobe level.
Choosing a window is the central design decision of spectrogram analysis. A narrow window gives sharp time resolution but blurry frequency; a wide window gives sharp frequency but blurry time. This is the uncertainty principle from 7.2, expressed in the design choice of .
The FFT
A naive DFT computation costs operations — for , about complex multiplications. The Fast Fourier Transform (FFT) algorithm computes the same DFT in operations, a factor-of- speedup. For that’s operations — six orders of magnitude faster.
The FFT is the algorithm behind every spectrogram, every audio plug-in, every MRI reconstruction, every JPEG-2000 compression. We develop the Cooley–Tukey radix-2 FFT in detail in Numerical Methods 10.5; here the relevant fact is that the FFT computes the DFT exactly (modulo floating-point roundoff), just faster.
A historical aside
The FFT was rediscovered by James Cooley and John Tukey in 1965, and the Cooley–Tukey paper is widely credited as launching modern digital signal processing. But the same algorithm had been written out by Gauss in 1805, in a notebook never published in his lifetime. Gauss wanted to fit a trigonometric series to astronomical data; the manuscript languished in his archives until the 20th century. The rediscovery was independent and far more consequential — Cooley and Tukey’s paper sparked a multi-decade build-out of digital audio, telecommunications, and imaging — but historically, Gauss had the algorithm 160 years earlier.
What we use this for
Discrete-time Fourier analysis powers nearly all signal-processing on the bookshelf:
- Spectrograms — short-time DFTs with overlapping windows. Sound 8.2.
- Sound spectrum analysis — pitch, timbre, formants, every spectral plot in Sound 8 and Hearing 4. The display is always the magnitude of a windowed DFT.
- Cochleagrams — the auditory analogue of a spectrogram, using a gammatone filterbank instead of a sliding DFT but with the same time-frequency-uncertainty trade-off. Hearing 4.7.
- Audio compression (MP3, AAC, Opus) — uses the modified discrete cosine transform (MDCT), a close cousin of the DFT.
- Digital filter design — design filter in the frequency domain, inverse-DFT to get , convolve with input. The convolution theorem of 7.3 implemented via FFT.
Closing the chapter
That closes Foundations 7. The Fourier transform is one mathematical idea applied at four different scales: as a series for periodic signals, as a transform for transients, as a convolution-theorem identity for linear-systems theory, and as a DFT for digital implementation. The four pictures are different views of the same underlying object — the eigenbasis of the differentiation operator, lifted to a function space — and the practical signal-processing world runs on all four simultaneously.