8.2 Bayes’ theorem for perception

The most natural formalism for inference under uncertainty is Bayes’ theorem (refresher: Foundations 11.5). Let $M$ be a hypothesis about what is happening in the world (a particular meaning, percept, or sound source) and $S$ be the sensory input. The brain wants to compute $P(M | S)$ — the probability of each hypothesis given the data. Bayes’ theorem gives this as

$$P(M | S) = \frac{P(S | M)\, P(M)}{P(S)}.$$

Derivation: Bayes' theorem from the definition of conditional probability

The conditional probability of $A$ given $B$, and symmetrically of $B$ given $A$, is defined (assuming $P(A) > 0$ and $P(B) > 0$) as

$$P(A | B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B | A) = \frac{P(A \cap B)}{P(A)}.$$

Solving each for the joint probability $P(A \cap B)$:

$$P(A \cap B) = P(A|B)\, P(B) = P(B|A)\, P(A).$$

Equating the two expressions and dividing both sides by $P(B)$ to solve for $P(A|B)$:

$$P(A | B) = \frac{P(B | A)\, P(A)}{P(B)}.$$

In the perception context, $A = M$ (the hypothesis about the world’s state) and $B = S$ (the sensory input):

$$P(M | S) = \frac{P(S | M)\, P(M)}{P(S)}.$$

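The denominator $P(S)$ is the marginal probability of the data. For a discrete set of hypotheses it can be computed as $\sum_M P(S|M) P(M)$, summing over all hypotheses. Because $P(S)$ does not depend on which hypothesis $M$ is being evaluated, it rescales every hypothesis by the same factor and acts purely as a normalizing constant. ∎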
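To make the role of the denominator concrete, here is a minimal Python sketch of Bayes' theorem over a finite set of hypotheses. The function name `posterior`, the two word hypotheses, and all of the numbers are illustrative assumptions, not values taken from any experiment.

```python
def posterior(likelihoods, priors):
    """Return P(M | S) for each hypothesis M in a finite set.

    likelihoods: dict mapping hypothesis -> P(S | M) for the observed S
    priors:      dict mapping hypothesis -> P(M), assumed to sum to 1
    """
    # Numerator of Bayes' theorem for each hypothesis: P(S | M) * P(M)
    joint = {m: likelihoods[m] * priors[m] for m in priors}
    # Denominator P(S): the sum of the numerators over all hypotheses
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the input has zero probability under every hypothesis")
    # Dividing by P(S) rescales the numerators so that they sum to 1
    return {m: joint[m] / evidence for m in joint}


# Hypothetical example: an ambiguous sound that could be "bat" or "pat"
priors = {"bat": 0.7, "pat": 0.3}        # context makes "bat" more expected
likelihoods = {"bat": 0.4, "pat": 0.6}   # the sound itself slightly favors "pat"
post = posterior(likelihoods, priors)
print(post)
```

With these assumed numbers the numerators are 0.28 and 0.18, so $P(S) = 0.46$ and the posterior for "bat" is $0.28 / 0.46 \approx 0.61$. Dividing by $P(S)$ changes the scale of the numerators but not their ranking.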

The denominator $P(S)$ is just a normalizing constant; the interesting content is in the numerator. The brain combines:

$P(S | M)$ — the likelihood. How probable is the observed sensory input, given that hypothesis $M$ is true? This depends on the generative model the brain has of how the world produces sensations.
$P(M)$ — the prior. How probable is hypothesis $M$ before any sensory input? This depends on context, expectations, experience, and recent history.

On this account, the posterior $P(M | S)$ determines what the brain perceives. When the likelihood is sharp (the sensory data are unambiguous), the posterior follows the data. When the likelihood is vague (the data are noisy or incomplete), the posterior follows the prior. This is why the perception of ambiguous stimuli depends strongly on context, while the perception of clear stimuli depends on it very little.
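The trade-off between data and prior can be checked numerically. The sketch below assumes just two exhaustive hypotheses, A and B, with a fixed prior favoring A and sensory input that favors B. The function name `posterior_a` and every number are hypothetical, chosen only to contrast a sharp likelihood with a vague one.

```python
def posterior_a(prior_a, lik_a, lik_b):
    """P(A | S) when A and B are the only two hypotheses.

    prior_a: P(A); the prior for B is 1 - prior_a
    lik_a:   P(S | A) for the observed input S
    lik_b:   P(S | B) for the observed input S
    """
    num_a = lik_a * prior_a            # numerator for A
    num_b = lik_b * (1.0 - prior_a)    # numerator for B
    return num_a / (num_a + num_b)     # normalize by P(S)


prior_a = 0.8  # assumed: context strongly favors A

# Sharp likelihood: the input is almost impossible under A
sharp = posterior_a(prior_a, lik_a=0.01, lik_b=0.99)

# Vague likelihood: the input only weakly favors B
vague = posterior_a(prior_a, lik_a=0.45, lik_b=0.55)

print(f"sharp likelihood: P(A | S) = {sharp:.3f}")
print(f"vague likelihood: P(A | S) = {vague:.3f}")
```

In the sharp case the posterior for A falls to about 0.039 despite the 0.8 prior, so the data win. In the vague case it stays near 0.766, close to the prior. When the two likelihoods are equal the input carries no information and the posterior equals the prior exactly.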

[Interactive figure: controls for the prior $P(A)$, the stimulus bias, and the likelihood sharpness.]

Sharp likelihood + clear stimulus: the posterior follows the data, the prior matters little. Vague likelihood + uninformative stimulus: the posterior follows the prior. This is what "fills in" missing phonemes or completes interrupted speech.

This is more than a clean formalism. It is a specific prediction: the brain’s percept of an ambiguous stimulus should depend on prior expectations in a way described by Bayes’ theorem. Many decades of psychophysics — across vision, audition, and touch — have shown that this prediction holds, at least approximately, for a wide range of perceptual tasks. The brain is, behaviorally, an approximate Bayesian inference engine.