4.2 The stiffness gradient

The basilar membrane is, dimensionally, a strip of tissue that runs the length of the cochlea — about 35 mm uncoiled. Its width and stiffness change along that length, and the change is not subtle.

At the base, near the stapes, the membrane is narrow and stiff — roughly 100 μm across. At the apex, near the helicotrema, the membrane is wide and floppy — roughly 500 μm across. The width grows monotonically by about a factor of five. The stiffness changes monotonically too, but by orders of magnitude: roughly a factor of one hundred between base and apex.

baseapex
coiled (top-down)
base0 mmapex35 mmnarrow, stiffwide, floppy240 μm
unrolled (the same membrane, straight)
position
12.3 mm from base
local width
240 μm
local stiffness
20.0 (relative)
characteristic frequency
1.8 kHz

This is the single most important fact about cochlear mechanics, and it is a fact that does not exist in your everyday intuition for membranes. A drumhead is everywhere the same: stretch a single material to a single tension, and the whole drumhead has one resonance. The basilar membrane is the opposite. Every place along its length is mechanically different from every other place. You can think of it, very roughly, as a continuum of tiny drumheads stacked side by side, each with its own preferred frequency.

The local oscillator

Consider a small element of the membrane at coordinate xx, where xx runs from 00 at the base to L=35mmL = 35\,\text{mm} at the apex. Let η(x,t)\eta(x, t) be the transverse displacement of the membrane at that point — perpendicular to its surface. To a first approximation, the cross-sectional motion of that element behaves like a damped, driven simple harmonic oscillator with local effective mass per unit length m(x)m(x), local effective stiffness per unit length k(x)k(x), and local damping b(x)b(x):

m(x)η¨(x,t)+b(x)η˙(x,t)+k(x)η(x,t)=P(x,t),m(x)\, \ddot{\eta}(x, t) + b(x)\, \dot{\eta}(x, t) + k(x)\, \eta(x, t) = P(x, t),

where P(x,t)P(x, t) is the local pressure difference across the membrane — the force per unit area that drives it. (We will write down where P(x,t)P(x, t) comes from in 4.3; for now, treat it as the input.)

Derivation: where the damped, driven oscillator equation comes from

We start from Newton’s second law applied to a small element of basilar membrane at position xx.

The element has mass per unit area m(x)m(x) (units: kg/m²). At the instant in question, the element’s transverse displacement is η\eta, its velocity is η˙\dot\eta, and its acceleration is η¨\ddot\eta.

Three forces per unit area act on the element:

  1. Restoring force from the membrane’s own elasticity. For small displacements, this is linear in η\eta, opposing the displacement: Fspring=k(x)ηF_\text{spring} = -k(x)\,\eta. The constant k(x)k(x) is the local stiffness per unit area (units: N/m³).

  2. Damping force from internal viscoelastic losses and from the surrounding fluid’s viscosity. For small motions, this is linear in velocity, opposing motion: Fdamp=b(x)η˙F_\text{damp} = -b(x)\,\dot\eta. The constant b(x)b(x) is the local damping coefficient per unit area (units: N·s/m³).

  3. Driving pressure from the fluid above and below the membrane: Fdrive=P(x,t)F_\text{drive} = P(x, t) (units: N/m² = Pa).

Newton’s second law per unit area says: m(x)η¨=F=k(x)ηb(x)η˙+P(x,t)m(x)\,\ddot\eta = \sum F = -k(x)\,\eta - b(x)\,\dot\eta + P(x, t).

Rearranging:

m(x)η¨+b(x)η˙+k(x)η=P(x,t).m(x)\,\ddot\eta + b(x)\,\dot\eta + k(x)\,\eta = P(x, t).

This is the equation in the main text. It is the standard linear damped driven oscillator, written per unit area of the membrane.

The local natural angular frequency of this element, in the absence of damping and driving, is

ω0(x)=k(x)m(x).\omega_0(x) = \sqrt{\frac{k(x)}{m(x)}}.
Derivation: the natural frequency of a spring-mass system

Consider the equation with damping and driving turned off:

mη¨+kη=0,m\,\ddot\eta + k\,\eta = 0,

or equivalently η¨=(k/m)η\ddot\eta = -(k/m)\,\eta. This says: acceleration is proportional to displacement, with the proportionality constant negative. Functions whose second derivative equals minus a constant times themselves are sines and cosines.

Try the ansatz η(t)=Acos(ωt+ϕ)\eta(t) = A\cos(\omega t + \phi). Then η¨=Aω2cos(ωt+ϕ)=ω2η\ddot\eta = -A\omega^2 \cos(\omega t + \phi) = -\omega^2 \eta. Substituting into the equation: mω2η+kη=0-m\omega^2 \eta + k\eta = 0, so ω2=k/m\omega^2 = k/m, so

ω=k/m.\omega = \sqrt{k/m}.

The undriven, undamped oscillator oscillates at this frequency forever (in the absence of energy loss). This is the natural frequency ω0\omega_0.

The mass per unit length varies modestly along xx. The stiffness per unit length varies by about two orders of magnitude. The result is that ω0(x)\omega_0(x) drops by about three orders of magnitude between base and apex, sweeping from roughly 2π20kHz2\pi \cdot 20\,\text{kHz} at the basal end to 2π20Hz2\pi \cdot 20\,\text{Hz} at the apex. In one elegant move, the cochlea has produced a one-dimensional frequency axis, embedded in its geometry, spanning the entire range of human hearing.

I want to flag two things about the model I just wrote down, because I will sin briefly against both and then earn them back.

First, this oscillator equation treats each place as independent. It pretends the membrane is a row of disconnected springs, each tuned to its own frequency. It is not: the perilymph mechanically couples adjacent places, and a disturbance at one xx drives motion at nearby xx. That coupling is exactly what produces a traveling wave rather than a set of local resonances, and it is the subject of 4.3.

Second, I have just put a damping term in the equation without saying anything about how big it should be. The size of b(x)b(x) controls the sharpness of each local resonance — the Q factor — and it turns out that the passive damping of the basilar membrane is large enough to produce only modestly sharp tuning. The exquisite frequency selectivity that human hearing actually has comes from somewhere else: from an active element, the outer hair cell, which injects energy back into the membrane on each cycle and effectively reduces the damping. We will spend 4.5 on this, and it will reframe everything in this section.

The driven oscillator at steady state

That oscillator equation deserves more than a passing acquaintance. It is the central equation of damped, driven harmonic motion, and its steady-state behavior is a story worth telling slowly. Drop the position dependence for a moment and ask what one place along the cochlea is doing when a sinusoidal pressure P(t)=P0cosωtP(t) = P_0 \cos\omega t drives it. The equation has a closed-form steady-state solution: a sinusoid at the same frequency, scaled in amplitude and shifted in phase relative to the input. The amplitude and phase are

H(ω)=1m(ω02ω2)2+γ2ω2,ϕ(ω)=arctan ⁣(γωω02ω2),|H(\omega)| = \frac{1}{m\sqrt{(\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2}}, \qquad \phi(\omega) = -\arctan\!\left(\frac{\gamma\, \omega}{\omega_0^2 - \omega^2}\right),

where γ=b/m\gamma = b/m, and η(t)=H(ω)P0cos(ωt+ϕ)\eta(t) = |H(\omega)|\, P_0 \cos(\omega t + \phi).

Derivation: amplitude and phase of the driven oscillator

We want the steady-state solution of

mη¨+bη˙+kη=P0cosωt.m\,\ddot\eta + b\,\dot\eta + k\,\eta = P_0 \cos\omega t.

The trick is to work in complex numbers. Write the driving force as the real part of P0eiωtP_0\, e^{i\omega t}, and look for a complex solution η(t)=H~(ω)P0eiωt\eta(t) = \tilde H(\omega)\, P_0\, e^{i\omega t}, then take its real part at the end.

Substituting into the equation:

(mω2+ibω+k)H~P0eiωt=P0eiωt.(-m\omega^2 + ib\omega + k)\,\tilde H\, P_0\, e^{i\omega t} = P_0\, e^{i\omega t}.

Dividing through by P0eiωtP_0 e^{i\omega t} (since P0eiωt0P_0 e^{i\omega t} \neq 0):

H~(ω)=1kmω2+ibω.\tilde H(\omega) = \frac{1}{k - m\omega^2 + ib\omega}.

This is a complex number. Its magnitude gives the amplitude scaling, and its argument gives the phase shift.

For the magnitude, use z2=zzˉ=Re(z)2+Im(z)2|z|^2 = z \bar z = \mathrm{Re}(z)^2 + \mathrm{Im}(z)^2 on the denominator:

kmω2+ibω2=(kmω2)2+(bω)2.|k - m\omega^2 + ib\omega|^2 = (k - m\omega^2)^2 + (b\omega)^2.

So

H~(ω)=1(kmω2)2+(bω)2.|\tilde H(\omega)| = \frac{1}{\sqrt{(k - m\omega^2)^2 + (b\omega)^2}}.

Factor out an mm from inside the square root, using ω02=k/m\omega_0^2 = k/m and γ=b/m\gamma = b/m:

H~(ω)=1m(ω02ω2)2+γ2ω2.|\tilde H(\omega)| = \frac{1}{m\sqrt{(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2}}.

This is the magnitude in the main text.

For the phase, use arg(1/z)=arg(z)\arg(1/z) = -\arg(z):

ϕ(ω)=arg(kmω2+ibω)=arctan ⁣(bωkmω2)=arctan ⁣(γωω02ω2).\phi(\omega) = -\arg(k - m\omega^2 + ib\omega) = -\arctan\!\left(\frac{b\omega}{k - m\omega^2}\right) = -\arctan\!\left(\frac{\gamma\omega}{\omega_0^2 - \omega^2}\right).

So the real-part solution is η(t)=H~(ω)P0cos(ωt+ϕ)\eta(t) = |\tilde H(\omega)|\, P_0 \cos(\omega t + \phi). ∎

Normalize the amplitude to its zero-frequency value H(0)=1/(mω02)|H(0)| = 1/(m\,\omega_0^2), and define the quality factor Q=ω0/γQ = \omega_0/\gamma. The ratio takes the cleanest possible form:

H(ω)H(0)=ω02(ω02ω2)2+γ2ω2.\frac{|H(\omega)|}{|H(0)|} = \frac{\omega_0^2}{\sqrt{(\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2}}.

At ω=ω0\omega = \omega_0 this ratio is exactly QQ. That is the entire content of resonance: at its natural frequency, an oscillator responds QQ times more strongly than it does at zero frequency. Off resonance, the response falls off, with a bandwidth (full width at half-power) of approximately ω0/Q\omega_0/Q.

Derivation: peak height and bandwidth in terms of Q

At ω=ω0\omega = \omega_0, the denominator becomes 0+γ2ω02=γω0\sqrt{0 + \gamma^2 \omega_0^2} = \gamma\omega_0. The amplitude ratio is then ω02/(γω0)=ω0/γ=Q\omega_0^2 / (\gamma\omega_0) = \omega_0/\gamma = Q.

For the bandwidth, find the frequencies where H(ω)/H(0)=Q/2|H(\omega)|/|H(0)| = Q/\sqrt{2} (the half-power points). Setting the squared ratio to Q2/2Q^2/2:

ω04(ω02ω2)2+γ2ω2=Q22=ω022γ2.\frac{\omega_0^4}{(\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2} = \frac{Q^2}{2} = \frac{\omega_0^2}{2\gamma^2}.

Cross-multiplying:

2γ2ω02=(ω02ω2)2+γ2ω2.2\gamma^2 \omega_0^2 = (\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2.

For high QQ (γω0\gamma \ll \omega_0), the resonance is narrow and we can work near ωω0\omega \approx \omega_0. Write ω=ω0+Δω\omega = \omega_0 + \Delta\omega where Δωω0\Delta\omega \ll \omega_0. Then ω2ω022ω0Δω\omega^2 - \omega_0^2 \approx 2\omega_0\,\Delta\omega, and ω2ω02\omega^2 \approx \omega_0^2. Substituting:

2γ2ω024ω02(Δω)2+γ2ω02,2\gamma^2 \omega_0^2 \approx 4\omega_0^2 (\Delta\omega)^2 + \gamma^2 \omega_0^2,

so γ2ω024ω02(Δω)2\gamma^2 \omega_0^2 \approx 4\omega_0^2 (\Delta\omega)^2, giving Δω=±γ/2\Delta\omega = \pm \gamma/2. The full width between half-power points is γ=ω0/Q\gamma = \omega_0/Q. ∎

The interactive below is one of these oscillators, with f0f_0 fixed at 1 kHz. The top plot is H(f)/H(0)|H(f)|/|H(0)| on log-log axes — the response curve of one place along the cochlea. The spring-mass-damper diagram and the phasor diagram show the physical and complex-plane views of the same dynamics; the time-domain trace shows the input and steady-state response over four cycles. Drag the sliders and watch what happens at and around resonance.

100 Hz1 kHz10 kHz0.010.1110100f₀ = 1 kHzdriving frequency f|H(f)| / |H(0)|
kbmP(t) = cos(ω t)
ReIm|H|=1inputresponse
input: cos(ω t)response: η(t)time → (4 cycles of the driving frequency)
700 Hz
10.0
f / f₀
0.700
|H(f)| / |H(0)|
1.94
phase lag
-7.8°

A few things worth noticing as you move the sliders.

Well below f0f_0, the response amplitude is roughly constant and in phase with the input. The spring dominates: the oscillator is too stiff to care about a slow push. Well above f0f_0, the amplitude is even smaller, and the response is 180° out of phase with the input — the mass dominates; the oscillator is too sluggish to keep up. The transition between these two regimes happens at f0f_0, where the phase passes through exactly 90°-90° and the amplitude peaks at QQ. The Bode-style curves you are looking at are the same curves that appear in every electrical-engineering textbook on filters, because a passive electrical filter is mathematically the same object as a damped, driven mechanical oscillator.

QQ controls only the sharpness and height of the peak. The low- and high-frequency asymptotes are unaffected. At Q3Q \approx 3 to 55 — the kind of QQ a single, isolated, passive basilar-membrane segment would show — the resonance is broad and unimpressive. The remarkable selectivity of the real cochlea, where effective QQ values can reach 3030, 5050, even past 100100 at low signal levels, is the surprise that 4.5 will explain.

This is also where the place-coding seed becomes concrete. Every location along the basilar membrane is one of these oscillators. The local ω0(x)\omega_0(x) is its natural frequency; the local damping sets its QQ. A pure tone of frequency ff drives a response that peaks at the location whose ω0(x)=2πf\omega_0(x) = 2\pi f, and decays away from that place. Every frequency in “Hey Dr. Miles!” — the /h/ aspiration, the /m/ nasal, the formants of /aɪ/ in Miles — will end up driving membrane motion at a particular location, and the brain will read which locations moved much the same way it reads which key was pressed on a piano. The exact map between frequency and place — the curve that gives f0(x)f_0(x) for the human cochlea — has a name (the Greenwood function) and we will derive it in 4.4. But first we have to honor the other confession from a few paragraphs ago: the basilar membrane is not actually a set of independent oscillators. The fluid couples them, and the result of that coupling is the wave we are about to meet.