4.2 The stiffness gradient

The basilar membrane is, geometrically, a strip of tissue that runs the length of the cochlea, about 35 mm when uncoiled. Its width and stiffness change along that length, and the change is not subtle.

At the base, near the stapes, the membrane is narrow and stiff — roughly 100 μm across. At the apex, near the helicotrema, the membrane is wide and floppy — roughly 500 μm across. The width grows monotonically by about a factor of five. The stiffness changes monotonically too, but by orders of magnitude: roughly a factor of one hundred between base and apex.

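[Interactive figure: the basilar membrane drawn coiled (top-down view) and unrolled (the same membrane, straight), with a readout for the selected position. Example readout: position 12.3 mm from base; local width 240 μm; local stiffness 20.0 (relative); characteristic frequency 1.8 kHz.]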

This is the single most important fact about cochlear mechanics, and it has no counterpart in everyday intuition about membranes. A drumhead is everywhere the same: stretch a single material to a single tension, and the whole drumhead shares one fundamental resonance. The basilar membrane is the opposite. Every place along its length is mechanically different from every other place. You can think of it, very roughly, as a continuum of tiny drumheads stacked side by side, each with its own preferred frequency.

The local oscillator

Consider a small element of the membrane at coordinate $x$, where $x$ runs from $0$ at the base to $L = 35\,\text{mm}$ at the apex. Let $\eta(x, t)$ be the transverse displacement of the membrane at that point, measured perpendicular to its surface. To a first approximation, the transverse motion of that element behaves like a damped, driven simple harmonic oscillator with local effective mass per unit area $m(x)$, local effective stiffness per unit area $k(x)$, and local damping per unit area $b(x)$:

$$m(x)\, \ddot{\eta}(x, t) + b(x)\, \dot{\eta}(x, t) + k(x)\, \eta(x, t) = P(x, t),$$

where $P(x, t)$ is the local pressure difference across the membrane — the force per unit area that drives it. (We will write down where $P(x, t)$ comes from in 4.3; for now, treat it as the input.)

▶ Derivation: where the damped, driven oscillator equation comes from

We start from Newton’s second law applied to a small element of basilar membrane at position $x$.

The element has mass per unit area $m(x)$ (units: kg/m²). At the instant in question, the element’s transverse displacement is $\eta$, its velocity is $\dot\eta$, and its acceleration is $\ddot\eta$.

Three forces per unit area act on the element:

Restoring force from the membrane’s own elasticity. For small displacements, this is linear in $\eta$, opposing the displacement: $F_\text{spring} = -k(x)\,\eta$. The constant $k(x)$ is the local stiffness per unit area (units: N/m³).
Damping force from internal viscoelastic losses and from the surrounding fluid’s viscosity. For small motions, this is linear in velocity, opposing motion: $F_\text{damp} = -b(x)\,\dot\eta$. The constant $b(x)$ is the local damping coefficient per unit area (units: N·s/m³).
Driving pressure from the fluid above and below the membrane: $F_\text{drive} = P(x, t)$ (units: N/m² = Pa).

Newton’s second law per unit area says: $m(x)\,\ddot\eta = \sum F = -k(x)\,\eta - b(x)\,\dot\eta + P(x, t)$.

Rearranging:

$$m(x)\,\ddot\eta + b(x)\,\dot\eta + k(x)\,\eta = P(x, t).$$

This is the equation in the main text. It is the standard linear damped driven oscillator, written per unit area of the membrane.

The local natural angular frequency of this element, in the absence of damping and driving, is

$$\omega_0(x) = \sqrt{\frac{k(x)}{m(x)}}.$$

▶ Derivation: the natural frequency of a spring-mass system

Consider the equation with damping and driving turned off:

$$m\,\ddot\eta + k\,\eta = 0,$$

or equivalently $\ddot\eta = -(k/m)\,\eta$. This says that acceleration is proportional to displacement, with a negative proportionality constant. Functions whose second derivative equals minus a positive constant times themselves are sines and cosines.

Try the ansatz $\eta(t) = A\cos(\omega t + \phi)$. Then $\ddot\eta = -A\omega^2 \cos(\omega t + \phi) = -\omega^2 \eta$. Substituting into the equation gives $-m\omega^2 \eta + k\eta = 0$, so $\omega^2 = k/m$, and therefore

$$\omega = \sqrt{k/m}.$$

With no damping to remove energy and no driving force, the oscillator keeps oscillating at this frequency indefinitely. This is the natural frequency $\omega_0$.

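The mass per unit area varies only modestly along $x$, while the stiffness per unit area falls by about two orders of magnitude, so $\omega_0(x)$ drops steadily from base to apex. Because $\omega_0 = \sqrt{k/m}$, a hundredfold drop in stiffness at nearly constant mass accounts for only about a tenfold drop in $\omega_0$. The range of human hearing, from roughly $2\pi \cdot 20\,\text{kHz}$ at the basal end to $2\pi \cdot 20\,\text{Hz}$ at the apex, spans three orders of magnitude, so in this local-oscillator picture the effective ratio $k(x)/m(x)$ has to change by about a factor of $10^6$. Either way, the cochlea has produced a one-dimensional frequency axis, embedded in its geometry, spanning the entire range of human hearing.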
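As a rough numerical sketch of this frequency axis, the Python below assumes that $\log f_0$ falls linearly with distance between the two endpoint values quoted above (20 kHz at the base, 20 Hz at the apex, over 35 mm). That interpolation is an assumption made purely for illustration; it is not the Greenwood function that 4.4 derives. The function names `f0_at`, `place_of`, and `k_over_m` are hypothetical.

```python
import math

L_MM = 35.0            # uncoiled membrane length quoted in the text
F_BASE_HZ = 20_000.0   # natural frequency at the base (x = 0)
F_APEX_HZ = 20.0       # natural frequency at the apex (x = L)


def f0_at(x_mm):
    """ASSUMED map: log(f0) falls linearly with x. Not the Greenwood function."""
    frac = x_mm / L_MM
    return F_BASE_HZ * (F_APEX_HZ / F_BASE_HZ) ** frac


def place_of(f_hz):
    """Inverse of f0_at: distance from the base (mm) whose f0 equals f_hz."""
    return L_MM * math.log(f_hz / F_BASE_HZ) / math.log(F_APEX_HZ / F_BASE_HZ)


def k_over_m(x_mm):
    """Ratio k/m (in s^-2) implied by omega_0 = sqrt(k/m) at position x."""
    return (2.0 * math.pi * f0_at(x_mm)) ** 2


for x in (0.0, 8.75, 17.5, 26.25, 35.0):
    print(f"x = {x:5.2f} mm   f0 = {f0_at(x):9.1f} Hz")
```

Under this assumed map the midpoint of the membrane (17.5 mm) sits near 632 Hz, and the ratio $k/m$ implied by $\omega_0 = \sqrt{k/m}$ changes by a factor of $10^6$ between base and apex.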

Two things about the model just written down need flagging, because both are simplifications that later sections revisit and repair.

First, this oscillator equation treats each place as independent. It pretends the membrane is a row of disconnected springs, each tuned to its own frequency. It is not: the perilymph mechanically couples adjacent places, and a disturbance at one $x$ drives motion at nearby $x$. That coupling is exactly what produces a traveling wave rather than a set of local resonances, and it is the subject of 4.3.

Second, I have just put a damping term in the equation without saying anything about how big it should be. The size of $b(x)$ controls the sharpness of each local resonance — the Q factor — and it turns out that the passive damping of the basilar membrane is large enough to produce only modestly sharp tuning. The exquisite frequency selectivity that human hearing actually has comes from somewhere else: from an active element, the outer hair cell, which injects energy back into the membrane on each cycle and effectively reduces the damping. Section 4.5 develops this mechanism.

The driven oscillator at steady state

That oscillator equation deserves more than a passing acquaintance. It is the central equation of damped, driven harmonic motion, and its steady-state behavior is a story worth telling slowly. Drop the position dependence for a moment and ask what one place along the cochlea is doing when a sinusoidal pressure $P(t) = P_0 \cos\omega t$ drives it. The equation has a closed-form steady-state solution: a sinusoid at the same frequency, scaled in amplitude and shifted in phase relative to the input. The amplitude and phase are

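$$|H(\omega)| = \frac{1}{m\sqrt{(\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2}}, \qquad \phi(\omega) = -\arctan\!\left(\frac{\gamma\, \omega}{\omega_0^2 - \omega^2}\right),$$

where $\gamma = b/m$, and $\eta(t) = |H(\omega)|\, P_0 \cos(\omega t + \phi)$. The arctangent is taken on the branch between $0$ and $\pi$, so that $\phi$ runs continuously from $0$ at low frequency, through $-\pi/2$ at $\omega = \omega_0$, to $-\pi$ at high frequency.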
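One way to build confidence in the closed-form amplitude is to integrate the equation of motion numerically and measure the response once the start-up transient has died away. The sketch below does this with a hand-written fourth-order Runge-Kutta step. All parameter values ($m = 1$, $f_0 = 1$ kHz, $Q = 10$, a 700 Hz drive) are illustrative assumptions in arbitrary units, not measured cochlear values, and the function names are hypothetical.

```python
import math


def steady_state_amplitude(m, b, k, p0, omega, cycles=60, steps_per_cycle=400):
    """Integrate m*eta'' + b*eta' + k*eta = p0*cos(omega*t) from rest with RK4
    and return the largest |eta| seen during the final drive cycle."""
    def accel(t, eta, vel):
        return (p0 * math.cos(omega * t) - b * vel - k * eta) / m

    dt = 2.0 * math.pi / omega / steps_per_cycle
    total = cycles * steps_per_cycle
    t, eta, vel, peak = 0.0, 0.0, 0.0, 0.0
    for step in range(total):
        # Classic RK4 stages for the first-order pair (eta, vel).
        k1e, k1v = vel, accel(t, eta, vel)
        k2e = vel + 0.5 * dt * k1v
        k2v = accel(t + 0.5 * dt, eta + 0.5 * dt * k1e, vel + 0.5 * dt * k1v)
        k3e = vel + 0.5 * dt * k2v
        k3v = accel(t + 0.5 * dt, eta + 0.5 * dt * k2e, vel + 0.5 * dt * k2v)
        k4e = vel + dt * k3v
        k4v = accel(t + dt, eta + dt * k3e, vel + dt * k3v)
        eta += dt * (k1e + 2.0 * k2e + 2.0 * k3e + k4e) / 6.0
        vel += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        if step >= total - steps_per_cycle:  # only sample the last cycle
            peak = max(peak, abs(eta))
    return peak


def closed_form_amplitude(m, b, k, p0, omega):
    """p0 * |H(omega)| from the formula in the text."""
    omega0_sq = k / m
    gamma = b / m
    return p0 / (m * math.sqrt((omega0_sq - omega**2) ** 2 + (gamma * omega) ** 2))


# Illustrative (assumed) parameters: f0 = 1 kHz, Q = 10, drive at 700 Hz.
M = 1.0
OMEGA0 = 2.0 * math.pi * 1000.0
Q = 10.0
B = M * OMEGA0 / Q       # from Q = omega_0 / gamma with gamma = b / m
K = M * OMEGA0**2        # from omega_0^2 = k / m
OMEGA = 2.0 * math.pi * 700.0

simulated = steady_state_amplitude(M, B, K, 1.0, OMEGA)
predicted = closed_form_amplitude(M, B, K, 1.0, OMEGA)
print(f"simulated / predicted = {simulated / predicted:.5f}")
```

Sixty drive cycles is long enough here because the transient decays as $e^{-\gamma t/2}$; a sharper resonance (larger $Q$) would need a proportionally longer run before the last cycle is representative.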

▶ Derivation: amplitude and phase of the driven oscillator

We want the steady-state solution of

$$m\,\ddot\eta + b\,\dot\eta + k\,\eta = P_0 \cos\omega t.$$

The trick is to work in complex numbers. Write the driving pressure as the real part of $P_0\, e^{i\omega t}$, look for a complex solution $\eta(t) = \tilde H(\omega)\, P_0\, e^{i\omega t}$, and take its real part at the end.

Substituting into the equation:

$$(-m\omega^2 + ib\omega + k)\,\tilde H\, P_0\, e^{i\omega t} = P_0\, e^{i\omega t}.$$

Dividing through by $P_0 e^{i\omega t}$ (since $P_0 e^{i\omega t} \neq 0$):

$$\tilde H(\omega) = \frac{1}{k - m\omega^2 + ib\omega}.$$

This is a complex number. Its magnitude gives the amplitude scaling, and its argument gives the phase shift.

For the magnitude, use $|z|^2 = z \bar z = \mathrm{Re}(z)^2 + \mathrm{Im}(z)^2$ on the denominator:

$$|k - m\omega^2 + ib\omega|^2 = (k - m\omega^2)^2 + (b\omega)^2,$$

so

$$|\tilde H(\omega)| = \frac{1}{\sqrt{(k - m\omega^2)^2 + (b\omega)^2}}.$$

Factor $m^2$ out from under the square root (it emerges as $m$), using $\omega_0^2 = k/m$ and $\gamma = b/m$:

$$|\tilde H(\omega)| = \frac{1}{m\sqrt{(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2}}.$$

This is the magnitude in the main text.

For the phase, use $\arg(1/z) = -\arg(z)$:

$$\phi(\omega) = -\arg(k - m\omega^2 + ib\omega) = -\arctan\!\left(\frac{b\omega}{k - m\omega^2}\right) = -\arctan\!\left(\frac{\gamma\omega}{\omega_0^2 - \omega^2}\right),$$

where, because the imaginary part $b\omega$ is positive, the arctangent is taken on the branch between $0$ and $\pi$.

So the real-part solution is $\eta(t) = |\tilde H(\omega)|\, P_0 \cos(\omega t + \phi)$. ∎

Normalize the amplitude to its zero-frequency value $|H(0)| = 1/(m\,\omega_0^2)$, and define the quality factor $Q = \omega_0/\gamma$. The ratio takes the cleanest possible form:

$$\frac{|H(\omega)|}{|H(0)|} = \frac{\omega_0^2}{\sqrt{(\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2}}.$$

At $\omega = \omega_0$ this ratio is exactly $Q$. That is the entire content of resonance: at its natural frequency, an oscillator responds $Q$ times more strongly than it does at zero frequency. Off resonance the response falls away, and for $Q$ well above 1 the bandwidth (full width at half-power) is approximately $\omega_0/Q$.

▶ Derivation: peak height and bandwidth in terms of Q

At $\omega = \omega_0$, the denominator becomes $\sqrt{0 + \gamma^2 \omega_0^2} = \gamma\omega_0$. The amplitude ratio is then $\omega_0^2 / (\gamma\omega_0) = \omega_0/\gamma = Q$.

For the bandwidth, find the frequencies where $|H(\omega)|/|H(0)| = Q/\sqrt{2}$ (the half-power points). Setting the squared ratio to $Q^2/2$:

$$\frac{\omega_0^4}{(\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2} = \frac{Q^2}{2} = \frac{\omega_0^2}{2\gamma^2}.$$

Cross-multiplying and dividing through by $\omega_0^2$:

$$2\gamma^2 \omega_0^2 = (\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2.$$

For high $Q$ ($\gamma \ll \omega_0$), the resonance is narrow and we can work near $\omega \approx \omega_0$. Write $\omega = \omega_0 + \Delta\omega$ where $|\Delta\omega| \ll \omega_0$. Then $\omega^2 - \omega_0^2 \approx 2\omega_0\,\Delta\omega$, and $\omega^2 \approx \omega_0^2$. Substituting:

$$2\gamma^2 \omega_0^2 \approx 4\omega_0^2 (\Delta\omega)^2 + \gamma^2 \omega_0^2,$$

so $\gamma^2 \omega_0^2 \approx 4\omega_0^2 (\Delta\omega)^2$, giving $\Delta\omega = \pm \gamma/2$. The full width between half-power points is $\gamma = \omega_0/Q$. ∎
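The high-$Q$ approximation can be checked against the exact half-power points. The sketch below locates them by bisection on the normalized response, working in the dimensionless variable $r = \omega/\omega_0$. The helper names and the bracketing interval $[0, 3]$ are choices made for this illustration, and the code assumes $Q > \sqrt{2}$ so that the half-power level lies above the zero-frequency response.

```python
import math


def gain(r, q):
    """|H|/|H(0)| as a function of r = omega/omega_0 and the quality factor q."""
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (r / q) ** 2)


def bisect(func, lo, hi, iters=80):
    """Root of func in [lo, hi], assuming func changes sign across the interval."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


def half_power_width(q):
    """Exact fractional full width (delta omega / omega_0) at gain = q / sqrt(2)."""
    target = q / math.sqrt(2.0)
    lower = bisect(lambda r: gain(r, q) - target, 0.0, 1.0)
    upper = bisect(lambda r: gain(r, q) - target, 1.0, 3.0)
    return upper - lower


for q in (4.0, 10.0, 50.0):
    print(f"Q = {q:5.1f}   exact width = {half_power_width(q):.5f}   1/Q = {1.0 / q:.5f}")
```

For $Q = 10$ the exact fractional width comes out within about half a percent of $1/Q$; at $Q = 4$ the approximation is off by a few percent, which is the price of the $\Delta\omega \ll \omega_0$ step.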

The interactive below is one of these oscillators, with $f_0$ fixed at 1 kHz. The top plot is $|H(f)|/|H(0)|$ on log-log axes — the response curve of one place along the cochlea. The spring-mass-damper diagram and the phasor diagram show the physical and complex-plane views of the same dynamics; the time-domain trace shows the input and steady-state response over four cycles. Drag the sliders and watch what happens at and around resonance.

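[Interactive figure: a single driven oscillator with $f_0 = 1$ kHz. Controls: driving frequency $f$ and quality factor $Q$. Example readout at $f = 700$ Hz and $Q = 10.0$: $f/f_0 = 0.700$; $|H(f)|/|H(0)| = 1.94$; phase $-7.8°$.]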
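The readout values can be reproduced directly from the complex response. In normalized form, with $r = f/f_0$, the ratio $H(f)/H(0)$ is $1/(1 - r^2 + i\,r/Q)$. The sketch below evaluates it with Python's built-in complex numbers, and `cmath.phase` picks the correct branch of the phase automatically. The function name `normalized_response` is hypothetical.

```python
import cmath
import math


def normalized_response(f, f0, q):
    """Complex H(f)/H(0) for one oscillator: 1 / (1 - r^2 + i*r/Q), r = f/f0."""
    r = f / f0
    return 1.0 / complex(1.0 - r * r, r / q)


h = normalized_response(700.0, 1000.0, 10.0)
gain = abs(h)                                 # amplitude relative to zero frequency
phase_deg = math.degrees(cmath.phase(h))      # negative values mean the response lags
print(f"gain = {gain:.2f}, phase = {phase_deg:.1f} deg")
```

For $f = 700$ Hz, $f_0 = 1$ kHz, and $Q = 10$ this gives a gain of 1.94 and a phase of about $-7.8°$. Evaluating the same function at $f = f_0$ returns a gain of $Q$ and a phase of $-90°$, and well above $f_0$ the phase approaches $-180°$.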

A few things worth noticing as you move the sliders.

Well below $f_0$, the response amplitude is roughly constant and in phase with the input. The spring dominates: the push is slow enough that the oscillator simply follows it, with a displacement set by the stiffness alone. Well above $f_0$, the amplitude falls off as $(f_0/f)^2$ and the response is 180° out of phase with the input. The mass dominates: the oscillator is too sluggish to keep up. The transition between these two regimes happens at $f_0$, where the phase passes through exactly $-90°$ and the normalized amplitude equals $Q$ (the true maximum sits a hair below $f_0$, a difference that is negligible once $Q$ exceeds a few). The Bode-style curves you are looking at are the same curves that appear in every electrical-engineering textbook on filters, because a passive second-order (RLC) filter is mathematically the same object as a damped, driven mechanical oscillator.

$Q$ controls only the sharpness and height of the peak. The low- and high-frequency asymptotes are unaffected. At $Q \approx 3$ to $5$ (the kind of $Q$ a single, isolated, passive basilar-membrane segment would show) the resonance is broad and unimpressive. The remarkable selectivity of the real cochlea, where effective $Q$ values can reach $30$, $50$, even past $100$ at low signal levels, is the surprise that 4.5 will explain.

This is also where the place-coding seed becomes concrete. Every location along the basilar membrane is one of these oscillators. The local $\omega_0(x)$ is its natural frequency; the local damping sets its $Q$. A pure tone of frequency $f$ drives a response that peaks at the location whose $\omega_0(x) = 2\pi f$, and decays away from that place. Every frequency in “Hey Dr. Miles!” (the /h/ aspiration, the /m/ nasal, the formants of /aɪ/ in Miles) will end up driving membrane motion at a particular location, and the brain will read which locations moved much the same way it reads which key was pressed on a piano. The exact map between frequency and place, the curve that gives $f_0(x)$ for the human cochlea, has a name (the Greenwood function), and we will derive it in 4.4. But first we have to honor the other confession from a few paragraphs ago: the basilar membrane is not actually a set of independent oscillators. The fluid couples them, and the result of that coupling is the wave we are about to meet.

⏳ The history — Helmholtz and the resonance theory of hearing

Hermann von Helmholtz proposed in his 1863 Die Lehre von den Tonempfindungen (On the Sensations of Tone) that the cochlea performs frequency analysis by resonance: that structures of graded stiffness along the basilar membrane act as a bank of tuned resonators, each responding selectively to its matched frequency. The idea was motivated by his own experiments with tuning forks and acoustic resonators, and by the anatomical observation (then recent) that the basilar membrane is narrow and stiff at the base, wide and compliant at the apex. Helmholtz imagined the transverse fibers of the basilar membrane as independent strings, each tuned to a different pitch — a “piano inside the ear.”

The resonance theory was qualitatively correct: the cochlea is indeed a frequency analyser, and basilar-membrane stiffness does grade from base to apex. But the mechanism is not independent resonators. Békésy showed in the 1940s that the membrane supports a traveling wave whose envelope peaks at the frequency-matched place. The modern picture — active, nonlinear, and wave-based — descends from Helmholtz’s insight but replaces his piano strings with coupled fluid-membrane dynamics.

Read the original: Hermann von Helmholtz, On the Sensations of Tone (1863; tr. Alexander Ellis, 1895)