4.3 The wave on the membrane

The independent-oscillator model of 4.2 was an approximation. The places are coupled by the perilymph above the membrane (in scala vestibuli) and below it (in scala tympani). When the stapes pushes at the oval window, the fluid carries a pressure disturbance to every place along the BM at once, and the BM responds along its whole length together. The result is a wave that travels along a structure whose properties change as it goes: it slows down as it approaches its preferred place, builds to a peak there, and dies out beyond.

Georg von Békésy was the first to see it directly, by carefully sectioning cadaveric cochleas and watching how the basilar membrane moved under controlled acoustic stimulation. He won the Nobel Prize in Physiology or Medicine for the work in 1961. What he saw — and what we are about to model — is the cochlear traveling wave.

The basilar-membrane impedance

To get a handle on the math, we begin by re-expressing the local oscillator equation of 4.2 in frequency-domain form. With the $e^{+i\omega t}$ time convention used throughout, the relationship between local pressure $p(x, \omega)$ across the membrane and local transverse velocity $v(x, \omega) = i\omega\,\eta$ is

p(x, \omega) = Z_{\text{BM}}(x, \omega)\, v(x, \omega), \qquad Z_{\text{BM}}(x, \omega) = b(x) + i\!\left(\omega\, m(x) - \frac{k(x)}{\omega}\right).

This $Z_{\text{BM}}$ is the specific acoustic impedance of the basilar membrane — pressure per unit velocity.

▶ Derivation: from the SHM equation to the BM impedance

Start with the time-domain equation from 4.2:

m\,\ddot\eta + b\,\dot\eta + k\,\eta = p.

Go to the frequency domain by assuming sinusoidal time dependence $\eta(t) = \tilde\eta\, e^{-i\omega t}$ (we use the $e^{-i\omega t}$ convention so that the velocity $v = \dot\eta = -i\omega \tilde\eta\, e^{-i\omega t}$ leads to clean expressions; the choice doesn’t matter for the final magnitude). Then $\dot\eta = -i\omega\eta$ and $\ddot\eta = -\omega^2\eta$ .

Substituting:

(-\omega^2 m - i\omega b + k)\,\tilde\eta = \tilde p.

Now express in terms of velocity instead of displacement, using $\tilde v = -i\omega\tilde\eta$ , so $\tilde\eta = \tilde v/(-i\omega) = i\tilde v/\omega$ :

(-\omega^2 m - i\omega b + k)\,\frac{i\tilde v}{\omega} = \tilde p.

Multiply out the $i/\omega$ :

\tilde p = \left(-i\omega m + b + \frac{ik}{\omega}\right)\,\tilde v = \left[b + i\!\left(\frac{k}{\omega} - \omega m\right)\right]\,\tilde v.

There is a sign convention to pin down here. The expression above is the impedance under the $e^{-i\omega t}$ convention. The two time conventions differ by complex conjugation throughout, so under the alternative $e^{+i\omega t}$ convention the reactance changes sign and the impedance is

Z_\text{BM}(x, \omega) = b + i\!\left(\omega m - \frac{k}{\omega}\right),

which is the form used throughout. (The two conventions are equally valid; pick one and stick with it. Acoustics texts commonly use $e^{+i\omega t}$ , the convention adopted here.)
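As a cross-check, the same steps can be run directly in the $e^{+i\omega t}$ convention, which avoids the conjugation step. With $\eta(t) = \tilde\eta\, e^{+i\omega t}$ , each time derivative brings down a factor $+i\omega$ , so

(-\omega^2 m + i\omega b + k)\,\tilde\eta = \tilde p, \qquad \tilde\eta = \frac{\tilde v}{i\omega},

and therefore, using $1/i = -i$ ,

\tilde p = \left(i\omega m + b + \frac{k}{i\omega}\right)\tilde v = \left[b + i\!\left(\omega m - \frac{k}{\omega}\right)\right]\tilde v.

The bracket is the $Z_\text{BM}$ quoted above.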

The imaginary part of $Z_\text{BM}$ is the reactance. At $\omega < \omega_0 = \sqrt{k/m}$ , the term $k/\omega$ dominates and the reactance is negative — the membrane is stiffness-controlled (it resists like a spring). At $\omega = \omega_0$ , the reactance vanishes — the membrane is resistive, with $Z = b$ . At $\omega > \omega_0$ , $\omega m$ dominates and the reactance is positive — the membrane is mass-controlled (it resists like a heavy load).

In short, $Z_\text{BM}$ is stiffness-dominated when $\omega < \omega_0(x)$ (basal of the characteristic place), purely resistive ( $Z_\text{BM} = b$ ) when $\omega = \omega_0(x)$ , and mass-dominated when $\omega > \omega_0(x)$ (apical of the characteristic place). This is the same impedance you derived from the SHM equation in 4.2, just written in pressure-and-velocity form.
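To make the three regimes concrete, here is a minimal numerical sketch that evaluates $Z_\text{BM}$ below, at, and above the local resonance. The parameter values and the helper names `z_bm` and `regime` are illustrative assumptions, not taken from the text, and the damping is set from the assumed relation $Q = \omega_0 m / b$ .

```python
import numpy as np

# Hypothetical per-unit-area values for one place on the membrane.
M = 0.5      # mass, kg/m^2
K = 2.0e7    # stiffness, N/m^3
Q = 8.0      # quality factor

OMEGA_0 = np.sqrt(K / M)   # local resonant frequency, rad/s
B = OMEGA_0 * M / Q        # damping chosen so that Q = omega_0 * m / b


def z_bm(omega):
    """Z_BM = b + i(omega*m - k/omega), e^{+i omega t} convention."""
    return B + 1j * (omega * M - K / omega)


def regime(omega):
    """Classify the membrane by the sign of the reactance Im(Z_BM)."""
    reactance = z_bm(omega).imag
    if abs(reactance) < 1e-9 * OMEGA_0 * M:
        return "resistive"
    return "stiffness-controlled" if reactance < 0 else "mass-controlled"


for ratio in (0.5, 1.0, 2.0):
    omega = ratio * OMEGA_0
    z = z_bm(omega)
    print(f"omega/omega_0 = {ratio}: Re Z = {z.real:.1f}, "
          f"Im Z = {z.imag:.1f}, |Z| = {abs(z):.1f} -> {regime(omega)}")
```

The real part is the same in all three cases; only the reactance changes sign, and $|Z_\text{BM}|$ is smallest at $\omega = \omega_0$ .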

The long-wave cochlear wave equation

Now we couple the local impedance to the fluid. Take the perilymph in scala vestibuli and scala tympani to be incompressible with density $\rho$ , and treat each scala as having a roughly constant cross-sectional area $A$ . The two scalae are connected to each other across the basilar membrane (and at the helicotrema at the apex, but we’ll handle that as a boundary condition). The cochlear long-wave equation for the pressure difference $p(x, t)$ across the membrane is

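\frac{\partial^2 p}{\partial x^2} = \frac{2\rho}{A}\, \frac{\partial v_{\text{BM}}}{\partial t},

where $v_\text{BM}$ is measured positive toward scala tympani, the direction in which a positive $p$ pushes the membrane. Substituting $v_\text{BM} = p/Z_\text{BM}$ and going to the frequency domain by writing $p(x, t) = P(x)\, e^{i\omega t}$ gives

\frac{d^2 P}{dx^2} + \kappa^2(x, \omega)\, P = 0, \qquad \kappa^2(x, \omega) = -\frac{2 i \omega\, \rho}{A\; Z_{\text{BM}}(x, \omega)}.

As a check on the sign: where the membrane is stiffness-controlled, $Z_\text{BM} \approx k/(i\omega)$ , so $\kappa^2 \approx 2\rho\,\omega^2/(A\,k)$ is real and positive and the wave propagates.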

(We use $\kappa$ for the wavenumber to avoid clashing with the stiffness $k$ .)

▶ Derivation: the cochlear long-wave equation from mass conservation

The cochlea is a tube. Two parallel chambers (scala vestibuli above, scala tympani below) run side by side along the cochlea, separated by the basilar membrane. The fluid in each is incompressible.

Let $u(x, t)$ be the longitudinal volume velocity in scala vestibuli (m³/s) — fluid flowing rightward, toward the apex. Mass conservation in an incompressible fluid says that any fluid entering an element of the chamber must leave; if the chamber’s cross-section $A$ is constant, the longitudinal flow can only change if fluid is leaving the chamber sideways — that is, if the basilar membrane is moving.

Consider an element of scala vestibuli between $x$ and $x + dx$ , of width $w(x)$ (the basilar-membrane width at this point). The rate at which fluid crosses the BM out of scala vestibuli, per unit length, is $w(x)\, v_\text{BM}(x, t)$ (BM transverse velocity times its width). Conservation of mass in the chamber gives

\frac{\partial u}{\partial x} = -w(x)\, v_\text{BM}(x, t).

By symmetry, the volume velocity in scala tympani has the opposite sign. The pressure gradient drives the volume velocity through Newton’s second law applied to fluid (the Euler equation for inviscid fluid):

\rho_0\, \frac{\partial u/A}{\partial t} = -\frac{\partial p_V}{\partial x},

where $p_V$ is the pressure in scala vestibuli and $\rho_0$ is the perilymph density. Doing the same for scala tympani (with $p_T$ ) and subtracting gives an equation for the pressure difference $p = p_V - p_T$ :

\rho_0\, \frac{2}{A}\, \frac{\partial u}{\partial t} = -\frac{\partial p}{\partial x},

where the factor of 2 comes from the two scalae both contributing.

Differentiating with respect to $x$ and substituting $\partial u/\partial x = -w v_\text{BM}$ :

\frac{\partial^2 p}{\partial x^2} = -\frac{2\rho_0}{A}\, \frac{\partial}{\partial t}\left(-w v_\text{BM}\right) = \frac{2 \rho_0\, w}{A}\, \frac{\partial v_\text{BM}}{\partial t}.

If $w$ is roughly constant or absorbed into the definition of impedance, we can drop the explicit width and write $\rho$ for $\rho_0$ , which leaves

\frac{\partial^2 p}{\partial x^2} = \frac{2\rho}{A}\, \frac{\partial v_\text{BM}}{\partial t}

(with $v_\text{BM}$ positive toward scala tympani, the direction in which fluid leaves scala vestibuli and in which a positive $p$ pushes the membrane). Substituting $v_\text{BM} = p/Z_\text{BM}$ and going to the frequency domain ( $\partial/\partial t \to i\omega$ with the $e^{+i\omega t}$ convention):

\frac{d^2 P}{dx^2} = \frac{2\rho}{A}\, \frac{i\omega P}{Z_\text{BM}}.

This is the wave equation $d^2 P/dx^2 + \kappa^2 P = 0$ with

\kappa^2(x, \omega) = -\frac{2 i \omega\, \rho}{A\; Z_\text{BM}(x, \omega)}.

This is the cochlear dispersion relation. ∎

This derivation is simplified — it elides the detailed treatment of the basilar-membrane width $w(x)$ , the geometry of the scalae, and the precise boundary conditions at the stapes and helicotrema. For a careful treatment, see Lighthill, Waves in Fluids (1978), chapters 4–5; or Steele & Taber (1979) for the original modern long-wave derivation; or chapters 6–8 of Pickles’ Introduction to the Physiology of Hearing.

The expression for $\kappa^2(x, \omega)$ is the cochlear dispersion relation, and reading it carefully tells the entire story of the traveling wave.

Look at what $\kappa^2$ does in three regions of $x$ :

For $x$ basal of the characteristic place — that is, $\omega < \omega_0(x)$ — the membrane’s impedance is dominated by its stiffness, $Z_{\text{BM}}$ is large, and $\kappa$ is small. The wavelength $2\pi/|\mathrm{Re}(\kappa)|$ is long. The wave propagates freely and quickly. This is the long-wave region.
At the characteristic place $x_{\text{CF}}(\omega)$ , where $\omega = \omega_0(x)$ , the impedance has its smallest magnitude: only the damping $b$ remains. $|\kappa|$ becomes large, and $\kappa$ acquires comparable real and imaginary parts. The wavelength collapses, the phase velocity drops to a small fraction of its basal value, and the wave’s energy density piles up at this location. The analogy that always helps me here is a long ocean swell rolling onto a shoaling beach: as the depth decreases, the wave slows, the wavelength shrinks, and the wave’s amplitude builds. The same thing is happening in your cochlea right now.
Past the characteristic place — $\omega > \omega_0(x)$ — the impedance becomes mass-dominated, and $\kappa$ rotates into the regime where its imaginary part dominates. The carrier dies. This is the evanescent region, and it is what makes the cochlea such a sharp filter: the wave cannot continue past its characteristic place.
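The three regions can be seen numerically by evaluating $|\kappa| = \sqrt{2\omega\rho/(A\,|Z_\text{BM}|)}$ along the membrane, a quantity that does not depend on any sign or time convention. The sketch below is illustrative only: the exponential stiffness profile, every parameter value, and the names `local_params`, `kappa_magnitude`, and `characteristic_place` are assumptions rather than anything specified in the text, and the scala area is replaced by an effective height `H_EFF` standing in for $A/w$ .

```python
import numpy as np

# Hypothetical passive-cochlea parameters (illustrative, not from the text).
RHO = 1000.0     # perilymph density, kg/m^3
H_EFF = 1.0e-3   # effective scala height A/w, m
M = 0.5          # membrane mass per unit area, kg/m^2
K0 = 1.0e10      # stiffness per unit area at the base, N/m^3
ALPHA = 300.0    # assumed exponential stiffness decay rate, 1/m
Q = 8.0          # local quality factor
LENGTH = 0.035   # cochlear length, m


def local_params(x):
    """Stiffness and damping at place x for the assumed exponential map."""
    k = K0 * np.exp(-ALPHA * x)
    b = np.sqrt(k * M) / Q          # equivalent to Q = omega_0 * m / b
    return k, b


def kappa_magnitude(x, omega):
    """|kappa| = sqrt(2*omega*rho / (h*|Z_BM|))."""
    k, b = local_params(x)
    z = b + 1j * (omega * M - k / omega)
    return np.sqrt(2.0 * omega * RHO / (H_EFF * np.abs(z)))


def characteristic_place(omega):
    """Place where omega_0(x) = omega for the assumed stiffness map."""
    return np.log(K0 / (M * omega**2)) / ALPHA


omega = 2.0 * np.pi * 1000.0          # 1 kHz tone
x = np.linspace(0.0, LENGTH, 3501)    # 10-micrometre grid
kap = kappa_magnitude(x, omega)
x_cf = characteristic_place(omega)
x_max = x[np.argmax(kap)]

print(f"characteristic place: {1e3 * x_cf:.1f} mm from the base")
print(f"|kappa| largest at:   {1e3 * x_max:.1f} mm")
for x_mm in (0.0, 10.0, 1e3 * x_cf):
    kk = kappa_magnitude(x_mm * 1e-3, omega)
    print(f"x = {x_mm:5.1f} mm: |kappa| = {kk:8.1f} 1/m, "
          f"2*pi/|kappa| = {2e3 * np.pi / kk:6.2f} mm")
```

The length $2\pi/|\kappa|$ is only a proxy for the wavelength $2\pi/|\mathrm{Re}(\kappa)|$ , but it shows the same trend: long near the base, collapsing toward the characteristic place.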

WKB and the traveling wave

In the WKB approximation (valid because the cochlea’s properties vary slowly on the wavelength scale, at least in the long-wave region), the solution to the wave equation looks like

P(x) \approx \frac{P_0}{\sqrt{\kappa(x, \omega)}}\, \exp\!\left( i \int_0^x \kappa(x', \omega)\, dx' \right).

▶ Derivation: the WKB approximation

The exact wave equation $d^2 P/dx^2 + \kappa^2(x) P = 0$ has no closed-form solution for a general $\kappa(x)$ . But when $\kappa$ varies slowly, meaning $|d\kappa/dx| \ll |\kappa|^2$ (the wavenumber changes little over one wavelength), we can find an approximate solution.

Guess $P(x) = A(x)\, e^{i\Phi(x)}$ where $A(x)$ is a slowly varying amplitude and $\Phi(x)$ is the phase. Take derivatives:

\frac{dP}{dx} = (A' + i A \Phi')\, e^{i\Phi},

\frac{d^2 P}{dx^2} = (A'' + 2i A' \Phi' + i A \Phi'' - A (\Phi')^2)\, e^{i\Phi}.

Substitute into the wave equation:

A'' + 2i A' \Phi' + i A \Phi'' - A (\Phi')^2 + \kappa^2 A = 0.

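Separate real and imaginary parts, treating $A$ , $\Phi$ , and $\kappa$ as real for the moment (for complex $\kappa$ the same two conditions follow from ordering the terms by how slowly $A$ and $\kappa$ vary):

Real: $A'' - A (\Phi')^2 + \kappa^2 A = 0$

Imag: $2 A' \Phi' + A \Phi'' = 0$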

For slowly-varying $A$ , drop $A''$ compared to $A\kappa^2$ . The real part gives $(\Phi')^2 = \kappa^2$ , so $\Phi' = \kappa$ (choosing the right-going branch), and $\Phi(x) = \int_0^x \kappa(x') dx'$ .

The imaginary part can be rewritten as $(A^2 \Phi')' = 0$ , so $A^2 \Phi'$ is constant. Since $\Phi' = \kappa$ , $A^2 \kappa$ is constant, so $A \propto 1/\sqrt{\kappa}$ .

Combining, the WKB approximation is

P(x) \approx \frac{P_0}{\sqrt{\kappa(x)}}\, \exp\!\left(i \int_0^x \kappa(x') dx'\right).

The phase $\int \kappa\, dx$ tracks the cumulative phase accumulation; the prefactor $1/\sqrt{\kappa}$ comes from energy conservation (where the wave slows and $\kappa$ grows, the pressure amplitude falls as $1/\sqrt{\kappa}$ , which is exactly what keeps the energy flux, proportional to $\kappa\,|P|^2$ , constant when damping is negligible). ∎
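One way to build confidence in the WKB form is to substitute it back into $d^2 P/dx^2 + \kappa^2 P = 0$ numerically and look at what is left over. The sketch below does this for a toy, purely real, slowly varying $\kappa(x)$ ; the profile and the names `kappa`, `phase`, and `wkb` are hypothetical choices for illustration, not the cochlear $\kappa$ of the text.

```python
import numpy as np

# Toy profile (hypothetical): real wavenumber rising linearly over unit length.
KAPPA_0 = 50.0
SLOPE = 0.5


def kappa(x):
    return KAPPA_0 * (1.0 + SLOPE * x)


def phase(x):
    """Analytic integral of kappa from 0 to x."""
    return KAPPA_0 * (x + 0.5 * SLOPE * x**2)


def wkb(x, p0=1.0):
    """WKB form P0 / sqrt(kappa) * exp(i * integral of kappa)."""
    return p0 / np.sqrt(kappa(x)) * np.exp(1j * phase(x))


x = np.linspace(0.0, 1.0, 100_001)
dx = x[1] - x[0]
p = wkb(x)

# Second derivative by central differences on the interior points.
d2p = (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dx**2
k2p = kappa(x[1:-1])**2 * p[1:-1]
relative_residual = np.abs(d2p + k2p) / np.abs(k2p)

slowness = (KAPPA_0 * SLOPE / kappa(x)**2).max()   # max |kappa'| / kappa^2
print(f"max |kappa'|/kappa^2:  {slowness:.2e}")
print(f"max relative residual: {relative_residual.max():.2e}")
```

The leftover term is the dropped $A''$ , whose size relative to $\kappa^2 A$ is $\tfrac{3}{4}(\kappa'/\kappa^2)^2$ for a linear profile, so the residual should be far smaller than the slow-variation parameter itself.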

The exponential carries a phase that accumulates along the wave’s journey from the stapes; the $1/\sqrt{\kappa}$ prefactor is the slowly-varying envelope. The basilar-membrane displacement is then $\eta = v_{\text{BM}}/(i\omega) = P/(i\omega\, Z_{\text{BM}})$ , and the displacement amplitude $|\eta(x)|$ peaks sharply close to where $|Z_{\text{BM}}|$ is smallest, at or slightly basal of $x_{\text{CF}}$ (damping starts to attenuate the wave just before it reaches the resonant place). The peak’s height grows with $Q$ , and its width shrinks roughly as $1/Q$ . We are back to the resonance curve of 4.2, but now it has been painted spatially along the cochlea.
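A rough numerical sketch of the passive displacement envelope can be assembled from the WKB pressure and $\eta = P/(i\omega Z_\text{BM})$ . Several assumptions are baked in and none come from the text: an exponential stiffness profile, the parameter values, an effective height `H_EFF` standing in for $A/w$ , and the helper names. It also assumes $v_\text{BM}$ is positive toward scala tympani, so that $\kappa^2 = -2i\omega\rho/(h Z_\text{BM})$ and the stiffness-controlled base carries a propagating wave under $e^{+i\omega t}$ ; with that choice the base-to-apex wave is $e^{-i\int\kappa\,dx}$ with $\mathrm{Re}\,\kappa > 0$ .

```python
import numpy as np

# Hypothetical passive-cochlea parameters (illustrative, not from the text).
RHO = 1000.0     # perilymph density, kg/m^3
H_EFF = 1.0e-3   # effective scala height A/w, m
M = 0.5          # membrane mass per unit area, kg/m^2
K0 = 1.0e10      # basal stiffness per unit area, N/m^3
ALPHA = 300.0    # assumed exponential stiffness decay rate, 1/m
LENGTH = 0.035   # cochlear length, m


def resonant_place(freq_hz):
    """Place where omega_0(x) equals the stimulus frequency."""
    omega = 2.0 * np.pi * freq_hz
    return np.log(K0 / (M * omega**2)) / ALPHA


def displacement_envelope(x, freq_hz, q):
    """|eta(x)| from the WKB pressure, normalised to its maximum."""
    omega = 2.0 * np.pi * freq_hz
    k = K0 * np.exp(-ALPHA * x)
    b = np.sqrt(k * M) / q                   # Q = omega_0 * m / b
    z = b + 1j * (omega * M - k / omega)     # e^{+i omega t} convention
    # Assumed sign (see lead-in); the principal root has Re > 0, Im < 0.
    kap = np.sqrt(-2j * omega * RHO / (H_EFF * z))
    # Cumulative trapezoidal integral of kappa from the base.
    steps = 0.5 * (kap[1:] + kap[:-1]) * np.diff(x)
    integral = np.concatenate(([0.0], np.cumsum(steps)))
    pressure = np.exp(-1j * integral) / np.sqrt(kap)
    amp = np.abs(pressure / (1j * omega * z))
    return amp / amp.max()


x = np.linspace(0.0, LENGTH, 3501)
for f in (4000.0, 1000.0, 500.0):
    amp = displacement_envelope(x, f, q=8.0)
    print(f"{f:6.0f} Hz: envelope peak at {1e3 * x[np.argmax(amp)]:5.2f} mm, "
          f"resonant place at {1e3 * resonant_place(f):5.2f} mm")
```

In this sketch the peak moves toward the apex as the frequency falls, and it sits a little basal of the resonant place because damping begins to absorb the wave before it arrives there.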

The interactive below renders the traveling wave for one input frequency at a time, using a simplified version of the cochlear long-wave model. Drag the slider. Watch the peak migrate along the basilar membrane; watch the wavelength shrink as the wave approaches the characteristic place; watch the wave die past it.

[Interactive: cochlear traveling wave for a single input frequency $f$ . Initial state: $f$ = 1 kHz, characteristic place 15.2 mm from stapes, $Q$ (fixed) = 8.]

A few things worth flagging.

First, the model in the interactive is the passive cochlear wave — what you would observe in a dead cochlea, or in any cochlea at very high stimulus levels where the active feedback I will introduce in 4.5 has saturated. Real living cochleas show a much sharper, taller, more asymmetric peak at much lower stimulus levels, because of the outer hair cells. The passive wave tells us where the peak will be and roughly what shape it has. 4.5 will tell us how that peak gets sharpened.

Second, the wave’s phase accumulates as it travels. At the characteristic place, the phase has lagged the input by several full cycles — perhaps five to ten, depending on frequency. This will matter in movement 6: when the auditory nerve fires in phase-locked synchrony to the BM motion, the spikes inherit this phase lag, and the brain can in principle use the inter-channel timing for fine localization and pitch.

Third, and most importantly, this is the engine of place coding for real sounds. Send in a complex stimulus like “Hey Dr. Miles!” and the cochlea performs a near-instantaneous Fourier analysis: each frequency component sets up its own traveling wave with its own characteristic place. The /h/ aspiration’s broadband energy lights up the basal half of the BM. The /m/ nasal resonance lights up a band around 1 kHz. The diphthong /aɪ/ in Miles has formants in the range 600 to 2200 Hz that paint stripes further along the membrane. By the time the phrase has played, the entire basilar membrane has been mapped — in space, in time, and in amplitude — into a 35-mm-long picture of what was said. We will look at that picture, in full, at the end of movement 5.

Section 4.4 will write down the precise map from frequency to characteristic place — the Greenwood function. With that, we will know exactly what the cochlea sends downstream.

⏳ The history — Békésy and the traveling wave

Georg von Békésy, a Hungarian physicist working at the Budapest telephone exchange, began his cochlear experiments in 1928 with a practical question: what limits the frequency range of telephone communication? His approach was direct and physical — he built large-scale mechanical models of the cochlea, then moved to cadaveric human cochleas observed under stroboscopic illumination. By the late 1940s he had shown that sound entering the cochlea produces a traveling wave on the basilar membrane: a displacement pattern that propagates from base to apex, grows in amplitude as it approaches the place tuned to the stimulus frequency, peaks sharply there, and dies out beyond.

The traveling wave replaced Helmholtz’s resonance theory of independent fibers with a hydrodynamic picture: the membrane and the fluid are coupled, and the wave’s behavior is set by the position-dependent impedance of the membrane. Békésy received the Nobel Prize in Physiology or Medicine in 1961, an unusual honor in that category for someone trained as a physicist. His measurements, made on cadaveric cochleas with passive mechanics, showed broad tuning; the sharp frequency selectivity of the living cochlea would require the discovery of the cochlear amplifier two decades later.