4.7 What leaves the cochlea

Chapter 4 has been long and dense, so let me close it with a single picture worth committing to memory.

Inside the cochlea, a sound has become a pattern of activity along the basilar membrane in space and time. Place codes frequency (Greenwood, 4.4). Amplitude codes intensity (and is sharpened by the cochlear amplifier, 4.5). Timing is preserved with sub-millisecond fidelity (the transduction described in 4.6 is fast). What leaves through the ~30,000 spiral-ganglion fibers of the auditory nerve is, in effect, a multi-channel cochleagram: a representation of the input sound in which one axis is time and the other is cochlear place. The cochleagram below is a stylized rendering of “Hey Dr. Miles!” in exactly this representation: time runs left-to-right, place runs top-to-bottom (base at top, apex at bottom), and intensity is shown by colour. The phoneme labels mark where each sound falls in time.

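[Figure: stylized cochleagram of “Hey Dr. Miles!”. Horizontal axis: time →, with phoneme labels /h/, /eɪ/, /d/, /r/, /m/, /aɪ/, /l/, /z/. Vertical axis: place along the basilar membrane, base (top) to apex (bottom), with ticks at 0, 7, 14, 21, 28 and 35 mm and frequency labels 8k, 2k, 500 and 125 Hz.]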

Reading this picture from left to right is reading the sentence. The broadband /h/ aspiration lights up a wide stripe in the basal half. The /eɪ/ vowel of “Hey” shows its first and second formants as two stripes: a low band around the middle of the cochlea (F1 ≈ 500 Hz) and a higher band closer to the base (F2 ≈ 2200 Hz). After a brief pause, the /d/ stop and the /r/ glide of “Doctor” show similar formant structure. The /m/ nasal of “Miles” is the lowest, narrowest band, a single dominant resonance around 250 Hz. The diphthong /aɪ/ that follows has a moving F2 that you would see as a slow drift if this picture were rendered at higher temporal resolution. The final /z/ fricative is concentrated in the basal third above 4 kHz, exactly where the high-frequency consonants live, and exactly the region that goes first in noise-induced hearing loss.
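To make the place axis concrete, here is a minimal sketch that converts a few of the frequencies named above into distances along the basilar membrane. It assumes the commonly quoted human Greenwood parameters (A = 165.4 Hz, a = 2.1, k = 0.88) and a 35 mm membrane to match the figure's axis; check these against the values given in 4.4. The function names are illustrative only.

```python
import math

# Assumed human Greenwood parameters; verify against section 4.4.
A_HZ = 165.4       # scaling constant, in Hz
SLOPE = 2.1        # exponent per unit of normalized length
K = 0.88           # low-frequency correction term
LENGTH_MM = 35.0   # assumed membrane length, matching the figure's 0-35 mm axis


def place_to_freq(mm_from_base):
    """Characteristic frequency (Hz) at a given distance from the base."""
    x = 1.0 - mm_from_base / LENGTH_MM   # proportion of length measured from the apex
    return A_HZ * (10 ** (SLOPE * x) - K)


def freq_to_place(freq_hz):
    """Distance from the base (mm) whose characteristic frequency is freq_hz."""
    x = math.log10(freq_hz / A_HZ + K) / SLOPE
    return LENGTH_MM * (1.0 - x)


# Landmark frequencies taken from the paragraph above.
landmarks = {
    "F1 of /eɪ/": 500.0,
    "F2 of /eɪ/": 2200.0,
    "lower edge of /z/": 4000.0,
}

for label, freq in landmarks.items():
    print(f"{label}: {freq:.0f} Hz -> {freq_to_place(freq):.1f} mm from base")
```

Under these assumed parameters, 4 kHz falls about a third of the way along the membrane from the base, which is consistent with the "basal third" remark. The figure is stylized, so its bands need not sit at exactly these positions.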

This is what the cochlea sends to the brain. Not a sound. Not even a spectrogram. A picture in cochlear coordinates, sampled by about 3,500 inner hair cells, encoded in the spike trains of about 30,000 auditory-nerve fibers, and transmitted with sub-millisecond timing fidelity. Everything that happens in movements 6 through 9 is the brain’s interpretation of this picture.
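As a data structure, a picture in cochlear coordinates is just a place-by-time array. The toy sketch below builds one for a two-tone test signal. It is not a cochlear model: the six octave-spaced channels, the 20 ms frames, and the single-sinusoid projection used as a "filter" are all simplifying assumptions chosen for readability, and every name in it is hypothetical.

```python
import cmath
import math

SAMPLE_RATE = 16000   # Hz, assumed
FRAME_LEN = 320       # samples per frame (20 ms), assumed

# Illustrative channel centers, ordered base (high) to apex (low).
# A real cochlea has thousands of channels with Greenwood spacing.
CENTER_FREQS_HZ = [4000.0, 2000.0, 1000.0, 500.0, 250.0, 125.0]


def channel_energy(frame, freq_hz):
    """Normalized magnitude of the frame projected onto one complex sinusoid."""
    acc = sum(
        sample * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
        for n, sample in enumerate(frame)
    )
    return abs(acc) / len(frame)


def toy_cochleagram(signal):
    """Return a list of rows: one row per place (base first), one column per frame."""
    frames = [
        signal[start:start + FRAME_LEN]
        for start in range(0, len(signal) - FRAME_LEN + 1, FRAME_LEN)
    ]
    return [[channel_energy(frame, f) for frame in frames] for f in CENTER_FREQS_HZ]


def dominant_channel(gram, frame_index):
    """Index of the row (place) with the most energy in one time frame."""
    column = [row[frame_index] for row in gram]
    return column.index(max(column))


def tone(freq_hz, n_samples):
    """Unit-amplitude sine wave at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(n_samples)]


# Test signal: 0.1 s at 500 Hz, then 0.1 s at 4000 Hz.
signal = tone(500.0, 1600) + tone(4000.0, 1600)
gram = toy_cochleagram(signal)

for t in range(len(gram[0])):
    print(f"frame {t}: peak at {CENTER_FREQS_HZ[dominant_channel(gram, t)]:.0f} Hz channel")
```

Reading the printed frames in order mirrors reading the figure left to right: the energy sits in an apical (low-frequency) row while the 500 Hz tone plays, then jumps to the basal row when the 4000 Hz tone starts.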

We turn to the spike trains themselves in movement 6.