4.3 The wave on the membrane
In 4.2 I confessed that treating each place on the basilar membrane as an independent oscillator was a lie. The places are coupled: by the perilymph above the membrane (in scala vestibuli) and below it (in scala tympani). When the stapes pushes at the oval window, the fluid carries a pressure disturbance to every place along the BM at once, and the BM responds along its whole length together. The result is one of the most beautiful objects in classical mechanics: a wave that travels along a structure whose properties change as it goes, slows down as it approaches its preferred place, builds to a peak there, and dies out beyond.
Georg von Békésy was the first to see it directly, by carefully sectioning cadaveric cochleas and watching how the basilar membrane moved under controlled acoustic stimulation. He won the Nobel Prize in Physiology or Medicine for the work in 1961. What he saw — and what we are about to model — is the cochlear traveling wave.
The basilar-membrane impedance
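To get a handle on the math, we begin by re-expressing the local oscillator equation of 4.2 in frequency-domain form. The relationship between local pressure $p$ across the membrane and local transverse velocity $v$ is
$$ p = Z\,v, \qquad Z(\omega) = r + i\left(\omega m - \frac{k}{\omega}\right), $$
with $m$, $r$, and $k$ the local mass, damping, and stiffness per unit area of membrane.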
This is the specific acoustic impedance of the basilar membrane — pressure per unit velocity.
▶ Derivation: from the SHM equation to the BM impedance
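Start with the time-domain equation from 4.2:
$$ m\ddot{\xi} + r\dot{\xi} + k\xi = p(t). $$
Go to the frequency domain by assuming sinusoidal time dependence (we use the convention $e^{-i\omega t}$ so that the velocity leads to clean expressions; the choice doesn’t matter for the final magnitude). Then $\dot{\xi} = -i\omega\xi$ and $\ddot{\xi} = -\omega^2\xi$.
Substituting:
$$ \left(-\omega^2 m - i\omega r + k\right)\xi = p. $$
Now express in terms of velocity instead of displacement, using $v = \dot{\xi} = -i\omega\xi$, so $\xi = iv/\omega$:
$$ p = \left(-\omega^2 m - i\omega r + k\right)\frac{i}{\omega}\,v. $$
Multiply out the $i/\omega$:
$$ p = \left[r - i\left(\omega m - \frac{k}{\omega}\right)\right]v. $$
Wait: there’s a sign convention to pin down. With $e^{-i\omega t}$, what we usually call “the impedance” requires the imaginary part to flip relative to what I just wrote (the two time-conventions differ by complex conjugation throughout). With the alternative convention $e^{+i\omega t}$, the impedance is
$$ Z = r + i\left(\omega m - \frac{k}{\omega}\right), $$
which is the form I’ll use throughout. (The two conventions are equally valid; pick one and stick with it. Acoustics texts commonly use $e^{+i\omega t}$, so I’ll match that.)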
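The imaginary part of $Z$ is the reactance, and its sign depends on where $\omega$ sits relative to the local resonance $\omega_0 = \sqrt{k/m}$. At $\omega < \omega_0$, the $k/\omega$ term dominates and the reactance is negative: the membrane is stiffness-controlled (it resists like a spring). At $\omega = \omega_0$, the reactance vanishes: the membrane is resistive, with $Z = r$. At $\omega > \omega_0$, $\omega m$ dominates and the reactance is positive: the membrane is mass-controlled (it resists like a heavy load).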
The impedance $Z$ is stiffness-dominated when $\omega < \omega_0(x)$ (basal of the characteristic place), purely resistive when $\omega = \omega_0(x)$, and mass-dominated when $\omega > \omega_0(x)$ (apical of the characteristic place). This is the same impedance you derived from the SHM equation in 4.2, just written in pressure-and-velocity form.
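As a quick numerical check of the three regimes, here is a minimal Python sketch that evaluates $Z = r + i(\omega m - k/\omega)$ below, at, and above the local resonance. The values of `m` and `omega0`, the relation `r = m * omega0 / Q`, and the helper names are illustrative assumptions rather than values from this chapter; only $Q = 8$ echoes the fixed setting of the interactive later in this section.

```python
import numpy as np

def bm_impedance(omega, m, r, k):
    """Specific acoustic impedance Z = r + i(omega*m - k/omega), e^{+i omega t} convention."""
    return r + 1j * (omega * m - k / omega)

def regime(omega, m, r, k, tol=1e-6):
    """Classify the membrane's behaviour from the sign of the reactance Im(Z)."""
    reactance = bm_impedance(omega, m, r, k).imag
    if abs(reactance) < tol * r:
        return "resistive"
    return "stiffness-controlled" if reactance < 0 else "mass-controlled"

# Illustrative per-unit-area values (assumptions, not taken from the text).
m = 0.5                        # mass, kg/m^2
omega0 = 2 * np.pi * 1000.0    # local resonance, rad/s
k = m * omega0**2              # stiffness chosen so that sqrt(k/m) = omega0, N/m^3
Q = 8.0                        # quality factor
r = m * omega0 / Q             # damping, Pa*s/m

for ratio in (0.5, 1.0, 2.0):
    omega = ratio * omega0
    Z = bm_impedance(omega, m, r, k)
    print(f"omega/omega0 = {ratio}: |Z| = {abs(Z):.1f} Pa*s/m, {regime(omega, m, r, k)}")
```

The tolerance in `regime` is there because floating-point rounding leaves a tiny nonzero reactance exactly at resonance.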
The long-wave cochlear wave equation
Now we couple the local impedance to the fluid. Take the perilymph in scala vestibuli and scala tympani to be incompressible with density $\rho$, and treat each scala as having a roughly constant cross-sectional area $A$. The two scalae are connected to each other across the basilar membrane (and at the helicotrema at the apex, but we’ll handle that as a boundary condition). The cochlear long-wave equation for the pressure difference $p$ across the membrane is
$$ \frac{\partial^2 p}{\partial x^2} = \frac{2\rho}{A}\,\frac{\partial v}{\partial t}. $$
Substituting $v = p/Z$ and going to the frequency domain by writing $p(x,t) = P(x)\,e^{i\omega t}$ gives
$$ \frac{d^2 P}{dx^2} + \kappa^2(x)\,P = 0, \qquad \kappa^2(x) = -\frac{2i\omega\rho}{A\,Z(x,\omega)}. $$
(We use $\kappa$ for the wavenumber to avoid clashing with the stiffness $k$.)
▶ Derivation: the cochlear long-wave equation from mass conservation
The cochlea is a tube. Two parallel chambers (scala vestibuli above, scala tympani below) run side by side along the cochlea, separated by the basilar membrane. The fluid in each is incompressible.
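Let $U_v(x,t)$ be the longitudinal volume velocity in scala vestibuli (m³/s), counted positive for fluid flowing rightward, toward the apex. Mass conservation in an incompressible fluid says that any fluid entering an element of the chamber must leave; if the chamber’s cross-section is constant, the longitudinal flow can only change if fluid is leaving the chamber sideways, that is, if the basilar membrane is moving.
Consider an element of scala vestibuli between $x$ and $x + dx$, of width $b$ (the basilar-membrane width at this point). Taking the BM velocity $v$ as positive toward scala tympani, the rate at which fluid crosses the BM out of scala vestibuli, per unit length, is $b\,v$ (BM transverse velocity times its width). Conservation of mass in the chamber gives
$$ \frac{\partial U_v}{\partial x} = -b\,v. $$
By symmetry, the volume velocity in scala tympani has the opposite sign. The pressure gradient drives the volume velocity through Newton’s second law applied to fluid (the Euler equation for inviscid fluid):
$$ \frac{\rho}{A}\,\frac{\partial U_v}{\partial t} = -\frac{\partial p_v}{\partial x}, $$
where $p_v$ is the pressure in scala vestibuli and $\rho$ is the perilymph density. Doing the same for scala tympani (with $U_t = -U_v$) and subtracting gives an equation for the pressure difference $p = p_v - p_t$:
$$ \frac{\partial p}{\partial x} = -\frac{2\rho}{A}\,\frac{\partial U_v}{\partial t}, $$
where the factor of 2 comes from the two scalae both contributing.
Differentiating with respect to $x$ and substituting $\partial U_v/\partial x = -b\,v$:
$$ \frac{\partial^2 p}{\partial x^2} = \frac{2\rho b}{A}\,\frac{\partial v}{\partial t}. $$
If $b$ is roughly constant or absorbed into the definition of impedance, and we drop the explicit width, we recover
$$ \frac{\partial^2 p}{\partial x^2} = \frac{2\rho}{A}\,\frac{\partial v}{\partial t} $$
(the sign convention is set by the direction of $v$ relative to pressure increase). Substituting $v = p/Z$ and going to frequency-domain ($\partial/\partial t \to i\omega$ with the $e^{+i\omega t}$ convention):
$$ \frac{d^2 P}{dx^2} = \frac{2i\omega\rho}{A\,Z}\,P. $$
This is the wave equation $d^2P/dx^2 + \kappa^2 P = 0$ with
$$ \kappa^2 = -\frac{2i\omega\rho}{A\,Z}. $$
This is the cochlear dispersion relation. ∎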
This derivation is simplified — it elides the detailed treatment of the basilar-membrane width , the geometry of the scalae, and the precise boundary conditions at the stapes and helicotrema. For a careful treatment, see Lighthill, Waves in Fluids (1978), chapters 4–5; or Steele & Taber (1979) for the original modern long-wave derivation; or chapters 6–8 of Pickles’ Introduction to the Physiology of Hearing.
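To see what the long-wave equation $d^2P/dx^2 + \kappa^2 P = 0$ produces without any further approximation, here is a minimal finite-difference sketch in Python. Everything numerical in it is an assumption made for illustration: the exponential map of resonance frequency along the cochlea (`F_BASE`, `D`), the mass and $Q$, the boundary conditions ($P = 1$ at the stapes, $P = 0$ at the helicotrema), and the use of an effective scala height $h = A/b$ in place of $A$, which keeps the units consistent when $Z$ is a per-unit-area impedance.

```python
import numpy as np

# Illustrative parameters (all assumptions of this sketch, not values from the text).
RHO = 1000.0     # perilymph density, kg/m^3 (roughly that of water)
H_EFF = 1e-3     # effective scala height A/b, m
M = 0.5          # BM mass per unit area, kg/m^2
Q = 8.0          # quality factor, taken equal at every place
LENGTH = 0.035   # cochlear length, m
F_BASE = 20e3    # local resonance frequency at the stapes end, Hz
D = 7e-3         # e-folding distance of the resonance frequency, m

def solve_long_wave(freq_hz, n=1001):
    """Solve P'' + kappa^2(x) P = 0 on a uniform grid with central differences."""
    x = np.linspace(0.0, LENGTH, n)
    dx = x[1] - x[0]
    omega = 2 * np.pi * freq_hz
    omega0 = 2 * np.pi * F_BASE * np.exp(-x / D)             # local resonance
    Z = M * omega0 / Q + 1j * M * (omega - omega0**2 / omega)  # r + i(omega*m - k/omega)
    kappa_sq = -2j * omega * RHO / (H_EFF * Z)

    # Interior rows: P[j-1] + (kappa_sq[j]*dx^2 - 2) P[j] + P[j+1] = 0.
    system = (np.diag(kappa_sq * dx**2 - 2.0)
              + np.diag(np.ones(n - 1), 1)
              + np.diag(np.ones(n - 1), -1))
    rhs = np.zeros(n, dtype=complex)
    # Boundary rows (assumed): unit pressure at the stapes, zero at the helicotrema.
    system[0, :] = 0.0
    system[0, 0] = 1.0
    rhs[0] = 1.0
    system[-1, :] = 0.0
    system[-1, -1] = 1.0

    P = np.linalg.solve(system, rhs)
    xi = P / (1j * omega * Z)          # BM displacement from v = P/Z, xi = v/(i*omega)
    return x, P, xi

x, P, xi = solve_long_wave(1000.0)
x_peak = x[np.argmax(np.abs(xi))]
x_c = D * np.log(F_BASE / 1000.0)      # place where the local resonance equals 1 kHz
print(f"displacement peak at {x_peak * 1e3:.2f} mm, characteristic place at {x_c * 1e3:.2f} mm")
```

A dense solve is used only for brevity; the matrix is tridiagonal, so a banded solver would be the natural choice for finer grids.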
The relation $\kappa^2 = -2i\omega\rho/(AZ)$ is the cochlear dispersion relation, and reading it carefully tells the entire story of the traveling wave.
Look at what $\kappa$ does in three regions of $x$:
- For $x$ basal of the characteristic place (that is, where $\omega < \omega_0(x)$), the membrane’s impedance is dominated by its stiffness, $|Z|$ is large, and $|\kappa|$ is small. With $Z \approx -ik/\omega$, $\kappa^2 \approx 2\rho\omega^2/(Ak)$ is real and positive. The wavelength is long. The wave propagates freely and quickly. This is the long-wave region.
- At the characteristic place $x_c$, where $\omega = \omega_0(x_c)$, the impedance has its smallest magnitude: only the damping $r$ remains. $\kappa$ becomes large and acquires comparable real and imaginary parts. The wavelength collapses, the phase velocity drops steeply, and the wave’s energy density piles up at this location. The analogy that always helps me here is a long ocean swell rolling onto a shoaling beach: as the depth decreases, the wave slows, the wavelength shrinks, and the wave’s amplitude builds. The same thing is happening in your cochlea right now.
- Past the characteristic place, where $\omega > \omega_0(x)$, the impedance becomes mass-dominated, $\kappa^2 \approx -2\rho/(Am)$ turns negative, and $\kappa$ rotates into the regime where its imaginary part dominates. The carrier dies. This is the evanescent region, and it is what makes the cochlea such a sharp filter: the wave cannot continue past its characteristic place.
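The three regions can be checked numerically with a minimal Python sketch that evaluates $\kappa$ for one stimulus frequency at three places. The parameter values, the helper names, and the effective scala height $h = A/b$ used in place of $A$ (to keep units consistent with a per-unit-area impedance) are assumptions of the sketch, not values from the text.

```python
import numpy as np

RHO = 1000.0    # perilymph density, kg/m^3 (roughly that of water; assumed)
H_EFF = 1e-3    # effective scala height A/b, m (assumed)
M = 0.5         # BM mass per unit area, kg/m^2 (assumed)
Q = 8.0         # quality factor (assumed)

def wavenumber(omega, m, r, k):
    """Complex wavenumber from kappa^2 = -2 i omega rho / (h Z)."""
    Z = r + 1j * (omega * m - k / omega)
    kappa_sq = -2j * omega * RHO / (H_EFF * Z)
    # np.sqrt returns the principal root (Re >= 0). Because r > 0 puts kappa^2
    # in the lower half-plane, that root has Im < 0, so exp(-i*kappa*x) decays.
    return np.sqrt(kappa_sq)

def kappa_for_place(resonance_ratio, freq_hz=1000.0):
    """kappa at a place whose local resonance is resonance_ratio times the stimulus."""
    omega = 2 * np.pi * freq_hz
    omega0 = resonance_ratio * omega
    return wavenumber(omega, M, M * omega0 / Q, M * omega0**2)

for label, ratio in (("basal", 3.0), ("characteristic place", 1.0), ("apical", 1.0 / 3.0)):
    kappa = kappa_for_place(ratio)
    print(f"{label:>20}: Re kappa = {kappa.real:8.1f} rad/m, Im kappa = {kappa.imag:8.1f} 1/m")
```

Basally the real part should dominate (a propagating wave), at the characteristic place the two parts should be equal in size, and apically the imaginary part should dominate (an evanescent wave).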
WKB and the traveling wave
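In the WKB approximation (valid because the cochlea’s properties vary slowly on the wavelength scale, at least in the long-wave region), the solution to the wave equation looks like
$$ P(x) \approx \frac{C}{\sqrt{\kappa(x)}}\,\exp\!\left(-i\int_0^x \kappa(x')\,dx'\right), $$
where the constant $C$ is set by the drive at the stapes.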
▶ Derivation: the WKB approximation
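The exact wave equation $d^2P/dx^2 + \kappa^2(x)\,P = 0$ has no closed-form solution when $\kappa$ varies with $x$. But when $\kappa$ varies slowly, meaning $|d\kappa/dx| \ll |\kappa|^2$, we can find an approximate solution.
Guess $P(x) = a(x)\,e^{i\phi(x)}$ where $a(x)$ is a slowly varying amplitude and $\phi(x)$ is the phase. Take derivatives:
$$ P' = \left(a' + i a\phi'\right)e^{i\phi}, \qquad P'' = \left(a'' + 2i a'\phi' + i a\phi'' - a\phi'^2\right)e^{i\phi}. $$
Substitute into the wave equation:
$$ a'' + 2i a'\phi' + i a\phi'' - a\phi'^2 + \kappa^2 a = 0. $$
Separate real and imaginary parts (treating $\kappa$ as real, as it nearly is in the lightly damped long-wave region):
Real: $a'' - a\phi'^2 + \kappa^2 a = 0$. Imag: $2a'\phi' + a\phi'' = 0$.
For slowly-varying $a$, drop $a''$ compared to $\kappa^2 a$. The real part gives $\phi'^2 = \kappa^2$, so $\phi' = -\kappa$ (choosing the right-going branch under the $e^{+i\omega t}$ convention), and $\phi(x) = -\int_0^x \kappa(x')\,dx'$.
The imaginary part can be rewritten as $(a^2\phi')' = 0$, so $a^2\phi'$ is constant. Since $\phi' = -\kappa$, $a^2\kappa$ is constant, so $a \propto \kappa^{-1/2}$.
Combining, the WKB approximation is
$$ P(x) \approx \frac{C}{\sqrt{\kappa(x)}}\,\exp\!\left(-i\int_0^x \kappa(x')\,dx'\right). $$
The phase $-\int_0^x \kappa\,dx'$ tracks the cumulative phase accumulation; the prefactor $\kappa^{-1/2}$ comes from energy conservation (where the wave slows and $\kappa$ grows, the pressure amplitude falls in just the right way to keep the energy flux, proportional to $|P|^2\kappa$, constant, while the membrane velocity $P/Z$ grows). ∎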
The exponential carries a phase that accumulates along the wave’s journey from the stapes; the prefactor $\kappa^{-1/2}$ is the slowly-varying envelope. The basilar-membrane displacement is then $\xi = P/(i\omega Z)$, and the displacement amplitude peaks sharply where $|Z|$ is smallest, at $x = x_c$. The peak’s height scales with $1/r$, that is, with the quality factor $Q = \omega_0 m/r$. Its width scales with $1/Q$. We are back to the resonance curve of 4.2, but now it has been painted spatially along the cochlea.
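The WKB formula is simple enough to evaluate directly. The Python sketch below does so for a hypothetical cochlea whose local resonance falls exponentially from base to apex; the map (`F_BASE`, `D`), the mass, the value of $Q$, and the effective scala height $h = A/b$ standing in for $A$ are all assumptions chosen for illustration, and the overall constant is set to 1.

```python
import numpy as np

# Illustrative parameters (all assumptions of this sketch, not values from the text).
RHO = 1000.0     # perilymph density, kg/m^3
H_EFF = 1e-3     # effective scala height A/b, m
M = 0.5          # BM mass per unit area, kg/m^2
Q = 8.0          # quality factor, taken equal at every place
LENGTH = 0.035   # cochlear length, m
F_BASE = 20e3    # local resonance frequency at the stapes end, Hz
D = 7e-3         # e-folding distance of the resonance frequency, m

def wkb_wave(freq_hz, n=7001):
    """Evaluate P = kappa^(-1/2) * exp(-i * integral of kappa) and the BM displacement."""
    x = np.linspace(0.0, LENGTH, n)
    omega = 2 * np.pi * freq_hz
    omega0 = 2 * np.pi * F_BASE * np.exp(-x / D)              # local resonance
    Z = M * omega0 / Q + 1j * M * (omega - omega0**2 / omega)  # r + i(omega*m - k/omega)
    kappa = np.sqrt(-2j * omega * RHO / (H_EFF * Z))          # root with Re > 0, Im < 0

    # Cumulative trapezoidal integral of kappa from the stapes to each x.
    steps = 0.5 * (kappa[1:] + kappa[:-1]) * np.diff(x)
    phase = np.concatenate(([0.0], np.cumsum(steps)))

    P = kappa**-0.5 * np.exp(-1j * phase)
    xi = P / (1j * omega * Z)                                 # displacement
    return x, P, xi

def peak_place(freq_hz):
    """Place (m from the stapes) where the displacement envelope is largest."""
    x, _, xi = wkb_wave(freq_hz)
    return x[np.argmax(np.abs(xi))]

for f in (500.0, 1000.0, 4000.0):
    x_c = D * np.log(F_BASE / f)   # place where the local resonance equals f
    print(f"{f:6.0f} Hz: peak at {peak_place(f) * 1e3:5.2f} mm, "
          f"characteristic place at {x_c * 1e3:5.2f} mm")
```

With these assumed parameters the envelope maximum tends to fall a little basal of the characteristic place, because damping starts to attenuate the wave before it arrives there, and higher frequencies peak closer to the stapes.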
The interactive below renders the traveling wave for one input frequency at a time, using a simplified version of the cochlear long-wave model. Drag the slider. Watch the peak migrate along the basilar membrane; watch the wavelength shrink as the wave approaches the characteristic place; watch the wave die past it.
[Interactive: traveling wave for a single input frequency. Readouts: characteristic place, 15.2 mm from stapes; Q (fixed), 8.]
A few things worth flagging.
First, the model in the interactive is the passive cochlear wave — what you would observe in a dead cochlea, or in any cochlea at very high stimulus levels where the active feedback I will introduce in 4.5 has saturated. Real living cochleas show a much sharper, taller, more asymmetric peak at much lower stimulus levels, because of the outer hair cells. The passive wave tells us where the peak will be and roughly what shape it has. 4.5 will tell us how that peak gets sharpened.
Second, the wave’s phase accumulates as it travels. At the characteristic place, the phase has lagged the input by several full cycles — perhaps five to ten, depending on frequency. This will matter in movement 6: when the auditory nerve fires in phase-locked synchrony to the BM motion, the spikes inherit this phase lag, and the brain can in principle use the inter-channel timing for fine localization and pitch.
Third, and most importantly, this is the engine of place coding for real sounds. Send in a complex stimulus like “Hey Dr. Miles!” and the cochlea performs a near-instantaneous Fourier analysis: each frequency component sets up its own traveling wave with its own characteristic place. The /h/ aspiration’s broadband energy lights up the basal half of the BM. The /m/ nasal resonance lights up a band around 1 kHz. The diphthong /aɪ/ in Miles has formants in the range 600 to 2200 Hz that paint stripes further along the membrane. By the time the phrase has played, the entire basilar membrane has been mapped — in space, in time, and in amplitude — into a 35-mm-long picture of what was said. We will look at that picture, in full, at the end of movement 5.
Section 4.4 will write down the precise map from frequency to characteristic place — the Greenwood function. With that, we will know exactly what the cochlea sends downstream.