7.9 Reverberation as superposition

Above the Schroeder frequency, the discrete-mode picture gives way to a statistical one. A short impulse of sound (a clap, a click, a balloon pop) at a source position fills the room with a dense superposition of reflections from every surface. The sound arriving at a listener at any later moment is the convolution of the source signal with the room’s impulse response.

The room impulse response

Define $h_\text{room}(\tau)$ as the pressure at the listener’s position at time $\tau$ after the source emits an impulsive $\delta(t)$ pulse at $t = 0$. The listener’s pressure for an arbitrary source signal $s(t)$ is

$$p_\text{listener}(t) \;=\; \int_{-\infty}^\infty h_\text{room}(t - t')\, s(t')\, dt'.$$

This is the convolution $p_\text{listener} = h_\text{room} * s$. It is a complete linear description: knowing $h_\text{room}$ once, for one source–listener pair, lets you predict the listener’s signal for any source signal.
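
On a sampled time grid the convolution integral becomes a Riemann sum. The following is a minimal sketch of that sum, not a production routine: the 1 ms step, the two-spike toy impulse response, and the tone-burst source are all assumptions chosen for illustration, and the helper name `convolve` is hypothetical.

```python
import math

def convolve(h, s, dt):
    """Riemann-sum approximation of p(t) = integral of h(t - t') s(t') dt' on a grid of step dt."""
    p = [0.0] * (len(h) + len(s) - 1)
    for i, h_i in enumerate(h):
        if h_i == 0.0:
            continue  # skip silent samples of the impulse response
        for j, s_j in enumerate(s):
            p[i + j] += h_i * s_j * dt
    return p

dt = 0.001  # 1 ms time step (illustrative)

# Toy impulse response: direct sound at 10 ms and a half-strength reflection at 25 ms.
# Dividing by dt gives each spike area 1 (or 0.5), like a sampled delta function.
h_room = [0.0] * 50
h_room[10] = 1.0 / dt
h_room[25] = 0.5 / dt

# An arbitrary source signal: a 100 Hz tone burst with a decaying envelope.
s = [math.exp(-n * dt / 0.01) * math.sin(2 * math.pi * 100 * n * dt) for n in range(100)]

# The listener hears the source delayed by 10 ms plus a half-amplitude copy delayed by 25 ms.
p_listener = convolve(h_room, s, dt)
```

The double loop costs a number of operations proportional to the product of the two lengths. It is written this way only to mirror the integral term by term; an FFT-based convolution would be much faster for long signals.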

The impulse response itself has three parts:

  1. Direct sound: a single sharp peak at $\tau = r/c$ (with $r$ the source–listener distance), the original pulse arriving via the straight-line path.
  2. Early reflections: a series of discrete peaks at $\tau = r_i/c$, one for each first-, second-, and few-bounce path of length $r_i$ off the walls and ceiling. These typically arrive within 20–80 ms of the direct sound, and the brain interprets them as part of the source (Haas precedence effect).
  3. Reverberant tail: a dense, exponentially decaying noise of overlapping later reflections, lasting hundreds of milliseconds. This is the late field.
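
The three parts can be put together in a toy model. The sketch below assembles a synthetic impulse response under invented assumptions: the sample rate, the reflection path lengths, the 0.8 reflection factor, the tail level, and the 50 ms tail onset are all made up for illustration, and `synthetic_impulse_response` is a hypothetical helper rather than a model of any real room.

```python
import math
import random

def synthetic_impulse_response(fs=8000, r=5.0, c=343.0, t60=0.5, duration=1.0, seed=0):
    """Build a toy impulse response: direct sound, early reflections, reverberant tail."""
    rng = random.Random(seed)
    n = int(duration * fs)
    h = [0.0] * n

    # 1. Direct sound: one spike at tau = r / c, scaled by 1/r (spherical spreading).
    direct_idx = round(r / c * fs)
    h[direct_idx] = 1.0 / r

    # 2. Early reflections: discrete spikes at tau = r_i / c for a few longer paths.
    #    Path lengths (in metres) and the 0.8 reflection factor are invented.
    for r_i in (7.5, 9.0, 12.0, 16.0):
        h[round(r_i / c * fs)] += 0.8 / r_i

    # 3. Reverberant tail: Gaussian noise whose energy falls by 60 dB in t60 seconds,
    #    starting 50 ms after the direct sound (onset chosen arbitrarily).
    tau_e = t60 / math.log(1e6)  # energy time constant
    tail_start = direct_idx + int(0.05 * fs)
    for k in range(tail_start, n):
        t = (k - tail_start) / fs
        # Pressure decays as exp(-t / (2 tau_e)) because energy goes as pressure squared.
        h[k] += 0.05 * rng.gauss(0.0, 1.0) * math.exp(-t / (2.0 * tau_e))
    return h

h_room = synthetic_impulse_response()
```

With the default arguments the direct spike lands at sample 117 (about 14.6 ms), the four reflections follow within roughly 32 ms of it, and everything after that is decaying noise.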

The history: Sabine in the Fogg Lecture Room

Modern architectural acoustics began in 1895 at Harvard. Wallace Clement Sabine — a 26-year-old assistant professor of physics — was asked to fix the Fogg Art Museum’s new lecture hall, where speech was unintelligible because reverberation lasted nearly six seconds. Sabine had no acoustic training; he taught himself by experiment.

His protocol: at night, after the building had emptied, he carried seat cushions from a neighbouring lecture theatre into the Fogg’s lecture room, played a tone on an organ pipe, and timed (with a stopwatch and a sensitive ear) how long the sound was audible after the pipe stopped. He repeated this with different numbers of cushions (that is, different amounts of absorbing surface area) and looked for a pattern. After thousands of measurements over five years, he saw the relation $T \cdot A = \text{constant} \times V$, and published the result in 1900 (Sabine 1900).

The constant 0.161 in $T_{60} = 0.161\,V/A$ (in SI units) traces back to Sabine’s stopwatch measurements at Harvard. The Fogg lecture room, once fixed, became the prototype for acoustic design of every concert hall built since. Sabine went on to consult on Boston’s Symphony Hall (opened 1900), which remains one of the finest-sounding concert halls in the world, a direct application of the formula he had derived with seat cushions and patience.

Sabine’s reverberation time

For the reverberant tail, the energy density decays exponentially:

$$E(t) \;\propto\; e^{-t/\tau_E},$$

with a time constant $\tau_E$ depending on room volume and absorption. Sabine’s formula (found in 1898, published in 1900) for the reverberation time, the time for $E$ to drop by 60 dB, is

$$T_{60} \;=\; \frac{0.161\, V}{A},$$

with $V$ in m³ and $A$ the total absorption area in m² (the sum over each surface of its area times its absorption coefficient; the unit is also called the metric sabin). The constant is not dimensionless: it equals $4 \ln(10^6) / c \approx 0.161$ s·m⁻¹ for $c \approx 343$ m/s, and it comes from energy balance in a diffuse field.
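
A short numerical check makes the constant and the formula concrete. A 60 dB drop is a factor of $10^6$ in energy, so $T_{60} = \tau_E \ln(10^6)$; combined with a diffuse-field time constant $\tau_E = 4V/(cA)$, this gives the prefactor $4\ln(10^6)/c$. The sketch below assumes $c = 343$ m/s and a hypothetical 10 m × 8 m × 3 m room whose absorption coefficients are invented for illustration. The function names are not from any library.

```python
import math

def sabine_constant(c=343.0):
    """Return 4 ln(10^6) / c in s/m (about 0.161 for c = 343 m/s)."""
    return 4.0 * math.log(1e6) / c

def absorption_area(surfaces):
    """Sum area * absorption coefficient over (area_m2, alpha) pairs; result in m^2."""
    return sum(area * alpha for area, alpha in surfaces)

def sabine_t60(volume_m3, absorption_m2, c=343.0):
    """Sabine reverberation time in seconds."""
    return sabine_constant(c) * volume_m3 / absorption_m2

# Hypothetical room; the absorption coefficients are made-up illustrative values.
surfaces = [
    (80.0, 0.30),   # floor, 10 m x 8 m
    (80.0, 0.10),   # ceiling, 10 m x 8 m
    (108.0, 0.05),  # four walls: 2 * (10 * 3) + 2 * (8 * 3) m^2
]
V = 10.0 * 8.0 * 3.0          # 240 m^3
A = absorption_area(surfaces)  # 24 + 8 + 5.4 = 37.4 m^2
T60 = sabine_t60(V, A)         # roughly 1.03 s
tau_E = T60 / math.log(1e6)    # energy time constant, equal to 4 V / (c A)
```

Doubling the absorption area halves $T_{60}$, which is the hyperbolic relation Sabine found with his cushions.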

Typical $T_{60}$ values:

Each value reflects a deliberate design choice. A speech room wants short $T_{60}$ to keep consonants intelligible. A music room wants longer $T_{60}$ to support sustained tones and add warmth. A cathedral aims for a particular sense of space.

The convolution structure

The reverberant tail of $h_\text{room}$, viewed as a function of frequency (that is, Fourier-transformed), is the transfer function of the room. It is stochastic: different at every listener position, every source position, and every frequency in the modal-overlap regime. But its statistical properties (decay time, spectrum) are uniform across the room and predictable from the volume and absorption.

For audio recording and synthesis, modern convolution reverbs record the impulse response of a real space and convolve it with arbitrary source material. Apply Carnegie Hall’s impulse response to a dry recording, and the recording sounds as if it were performed in Carnegie Hall, because the reverberant content of “performing in Carnegie Hall” is the convolution of the performance with the hall’s $h_\text{room}$.

Diffuse-field energy and the steady state

For a steady source radiating $P$ watts into a room of volume $V$ with total absorption $A$, the steady-state energy density in the reverberant field is

$$E \;=\; \frac{4 P}{c A}.$$

(The steady source supplies energy at rate $P$; absorption removes energy at rate $c A E / 4$ for a 3-D diffuse field; setting the two equal gives the result.) The corresponding pressure level is the reverberant pressure level. It depends only on the absorption, not on distance from the source, and it dominates once you are well outside the critical distance, the radius at which the direct field equals the reverberant field.
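
The critical distance follows from equating the two energy densities. The sketch below assumes an omnidirectional point source, whose direct-field energy density at distance $r$ is $P/(4\pi r^2 c)$. That assumption is not stated in the text above; under it the critical distance is $r_c = \sqrt{A/(16\pi)}$. The power and absorption values are illustrative, and the function names are hypothetical.

```python
import math

def reverberant_energy_density(power_w, absorption_m2, c=343.0):
    """Steady-state diffuse-field energy density E = 4 P / (c A), in J/m^3."""
    return 4.0 * power_w / (c * absorption_m2)

def direct_energy_density(power_w, r_m, c=343.0):
    """Direct-field energy density of an omnidirectional point source (assumed model)."""
    return power_w / (4.0 * math.pi * r_m ** 2 * c)

def critical_distance(absorption_m2):
    """Radius where direct and reverberant energy densities match: sqrt(A / (16 pi))."""
    return math.sqrt(absorption_m2 / (16.0 * math.pi))

P = 0.01   # 10 mW of acoustic power (illustrative)
A = 37.4   # m^2 of absorption area (illustrative)

E_rev = reverberant_energy_density(P, A)
r_c = critical_distance(A)               # a little under 0.9 m for this A
E_dir_at_rc = direct_energy_density(P, r_c)  # matches E_rev up to rounding
```

Note that $r_c$ does not depend on the source power: raising $P$ scales the direct and reverberant fields together, so only the absorption sets where one overtakes the other.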

What we have built

Chapter 7 in summary:

Each lesson is a different consequence of the same wave equation we derived in chapter 4. The boundaries vary; the equation does not.

Next chapter: the frequency picture, where we Fourier-decompose all of this into independent oscillating modes and re-derive most of the chapter from the other side.