11.4 Poisson processes

A Poisson process is the simplest model of random arrivals: events happen at a constant average rate, independently of each other and of past history. Photon arrivals from a faint source, radioactive decays in a sample, customer arrivals at a bank, action potentials in an auditory-nerve fibre — all are well-modelled as Poisson processes over short enough timescales. The Poisson process has a single parameter (the rate $\lambda$ ) and three statistical signatures: exponentially-distributed waiting times, Poisson-distributed counts, and complete temporal independence.

This lesson develops all three.

Setup: events at a constant rate

A Poisson process is defined by these properties:

Independence. Events in disjoint time intervals are independent.
Constant rate. Over a very short interval $dt$, the probability of exactly one event is $\lambda\, dt + o(dt)$ and the probability of two or more is $o(dt)$ (vanishingly small compared with $dt$).
Initial condition. The process starts at $t = 0$ with zero events recorded.

From these axioms, every statistical property of the process follows.
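The axioms can be turned almost directly into a simulation. The following is a minimal sketch, not a definitive implementation: it chops time into bins of width `dt`, places an event in each bin with probability `rate * dt` independently of all other bins, and ignores the $o(dt)$ chance of two events in one bin. The function name `simulate_by_bins`, the seed, and the parameter values are illustrative assumptions.

```python
import random

def simulate_by_bins(rate, duration, dt, rng):
    """Approximate a Poisson process with one Bernoulli trial per time bin.

    Each bin of width dt holds an event with probability rate * dt,
    independently of every other bin. Two events in one bin are ignored,
    which is only reasonable when rate * dt is much smaller than 1.
    """
    n_bins = int(round(duration / dt))
    p = rate * dt
    return [i * dt for i in range(n_bins) if rng.random() < p]

rng = random.Random(0)                  # fixed seed so the run is repeatable
rate, duration, dt = 5.0, 200.0, 1e-3   # illustrative values, not from the text
events = simulate_by_bins(rate, duration, dt, rng)
print(len(events), "events; expected about", rate * duration)
```

The event count should land near `rate * duration`, with run-to-run scatter of roughly its square root.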

The Poisson distribution: count statistics

Let $N(T)$ be the number of events occurring in the interval $[0, T]$ . By the axioms above, $N(T)$ is a discrete random variable. Its distribution turns out to be Poisson with mean $\lambda T$ :

\boxed{\;\mathrm{Pr}(N(T) = k) \;=\; \frac{(\lambda T)^k\, e^{-\lambda T}}{k!}, \qquad k = 0, 1, 2, \ldots\;}

Derivation: Poisson as the limit of the binomial

Divide the interval $[0, T]$ into $n$ subintervals of length $\Delta t = T/n$ . In each subinterval, the probability of an event is approximately $p = \lambda \Delta t = \lambda T / n$ (treating $\Delta t$ as small enough that at most one event occurs). The events across subintervals are independent.

The total count $N(T)$ is the number of “successes” in $n$ Bernoulli trials each with probability $p$ — a binomial distribution:

\mathrm{Pr}(N(T) = k) \;=\; \binom{n}{k}\, p^k\, (1 - p)^{n - k}.

Now take the limit $n \to \infty$ with $\lambda T = np$ held fixed (i.e. $p = \lambda T / n \to 0$ ):

\binom{n}{k}\, p^k\, (1 - p)^{n - k} \;=\; \underbrace{\frac{n!}{k!\, (n - k)!}}_{\approx\, n^k / k!} \cdot \underbrace{\left( \frac{\lambda T}{n} \right)^k}_{= (\lambda T)^k / n^k} \cdot \underbrace{\left( 1 - \frac{\lambda T}{n} \right)^{n}}_{\to\, e^{-\lambda T}} \cdot \underbrace{\left( 1 - \frac{\lambda T}{n} \right)^{-k}}_{\to\, 1}.

Putting the limits together:

\mathrm{Pr}(N(T) = k) \;\to\; \frac{(\lambda T)^k\, e^{-\lambda T}}{k!}.

This is the Poisson distribution. It is the natural limit of the binomial as the trials become many and individually unlikely. The limit depends on $n$ and $p$ only through their product: the expected count $np = \lambda T$ is the only thing that matters.
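The convergence can be checked numerically. The sketch below compares the binomial probability of $k$ events with the Poisson probability for a fixed expected count $\lambda T$ as the number of subintervals $n$ grows. The helper names and the values of `mean` and `k` are arbitrary illustrative choices.

```python
import math

def binomial_pmf(k, n, p):
    # Probability of exactly k successes in n independent trials.
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, mean):
    # Probability of exactly k events when the expected count is `mean`.
    return mean**k * math.exp(-mean) / math.factorial(k)

mean = 3.0   # lambda * T, held fixed while n grows (illustrative value)
k = 2
errors = []
for n in (10, 100, 1000, 10000):
    b = binomial_pmf(k, n, mean / n)
    errors.append(abs(b - poisson_pmf(k, mean)))
    print(n, b, errors[-1])
```

The gap between the two probabilities should shrink roughly in proportion to $1/n$.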

Mean and variance of $\mathrm{Poisson}(\lambda T)$ :

\mathbb{E}[N(T)] \;=\; \lambda T, \qquad \mathrm{Var}[N(T)] \;=\; \lambda T.

The mean equals the variance. This is the signature of Poisson counts: the standard deviation of the count grows as $\sqrt{\lambda T}$, so the fractional spread $\sigma / \mu = 1/\sqrt{\lambda T}$ shrinks as the count grows. Doubling the observation time reduces the fractional uncertainty only by a factor of $\sqrt{2}$; halving it takes four times the observation time.
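As a quick numerical check of these two statements, the sketch below sums the Poisson PMF directly to recover the mean and variance, then prints the fractional spread for several observation times. The rate of 100 events per second and the truncation point `k_max` are assumptions chosen for illustration.

```python
import math

def poisson_pmf(k, mean):
    # Evaluate in log space so that large means do not overflow.
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def mean_and_variance(mean, k_max=2000):
    # Truncated sums over k; k_max must sit far above the mean.
    m1 = sum(k * poisson_pmf(k, mean) for k in range(k_max))
    m2 = sum(k * k * poisson_pmf(k, mean) for k in range(k_max))
    return m1, m2 - m1 * m1

def fractional_spread(rate, T):
    mu, var = mean_and_variance(rate * T)
    return math.sqrt(var) / mu

rate = 100.0  # events per second (illustrative value)
for T in (1.0, 2.0, 4.0):
    print(T, mean_and_variance(rate * T), fractional_spread(rate, T))
```

Going from `T = 1.0` to `T = 2.0` should shrink the spread by a factor of $\sqrt{2}$, and going to `T = 4.0` should halve it.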

Exponential inter-arrival times

A Poisson process generates events at random times. The waiting time between consecutive events — the inter-arrival time — has a remarkable property: it is exponentially distributed with mean $1/\lambda$ .

Derivation: inter-arrival times are exponential

Let $T_1$ be the time until the first event. The probability that no event has occurred by time $t$ is the same as $\mathrm{Pr}(N(t) = 0)$ , which by the Poisson formula above is

\mathrm{Pr}(T_1 > t) \;=\; \mathrm{Pr}(N(t) = 0) \;=\; \frac{(\lambda t)^0\, e^{-\lambda t}}{0!} \;=\; e^{-\lambda t}.

The CDF of $T_1$ is

F_{T_1}(t) \;=\; \mathrm{Pr}(T_1 \leq t) \;=\; 1 - e^{-\lambda t}.

Differentiating gives the PDF:

f_{T_1}(t) \;=\; \frac{d F_{T_1}}{dt} \;=\; \lambda\, e^{-\lambda t}, \qquad t \geq 0.

This is the exponential distribution with rate $\lambda$ . By the independence axiom, the same argument applies to every subsequent inter-arrival time: $T_2, T_3, T_4, \ldots$ are all i.i.d. exponential with rate $\lambda$ .

The exponential distribution has the memoryless property: $\mathrm{Pr}(T > s + t \mid T > s) = \mathrm{Pr}(T > t)$ . Given that you’ve already waited $s$ seconds for the next event, the additional waiting time is distributed the same as if you’d just started — the process has no “memory” of how long you’ve already waited. This is the only continuous distribution with this property.

For an auditory-nerve fibre firing at $\lambda = 100$ spikes per second, the mean inter-spike interval is $1/\lambda = 10$ ms. But individual intervals can be much longer or shorter — the distribution is exponential, so a fraction $e^{-1} \approx 37\%$ of intervals exceed the mean, and a fraction $e^{-5} \approx 0.7\%$ exceed five times the mean.
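These numbers are easy to reproduce by sampling. The sketch below draws exponential intervals at $\lambda = 100$ per second using `random.expovariate` from the Python standard library, then estimates the mean interval, the fractions exceeding one and five times the mean, and the mean extra wait for intervals that have already lasted $s = 10$ ms (the memoryless property). The sample size and seed are arbitrary choices.

```python
import random

rng = random.Random(1)      # fixed seed so the run is repeatable
rate = 100.0                # spikes per second
isis = [rng.expovariate(rate) for _ in range(200_000)]

mean_isi = sum(isis) / len(isis)
frac_over_mean = sum(t > 1 / rate for t in isis) / len(isis)
frac_over_5x = sum(t > 5 / rate for t in isis) / len(isis)

# Memorylessness: among intervals that have already lasted s seconds,
# the extra wait beyond s should again be exponential with the same rate.
s = 0.010
excess = [t - s for t in isis if t > s]
mean_excess = sum(excess) / len(excess)

print(mean_isi, frac_over_mean, frac_over_5x, mean_excess)
```

Up to sampling noise, `mean_isi` and `mean_excess` should both come out close to 10 ms, which is the memoryless property in numerical form.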

A Poisson process, made visible

[Interactive figure: simulated Poisson process with adjustable rate, shown at λ = 5 events/s]

A Poisson process at rate λ produces events whose times, given the number that fall in a window, are independent and uniformly distributed over that window. No event "remembers" when the previous one happened (the memoryless property). Three statistical consequences are all visible above. The inter-spike intervals are exponentially distributed with mean 1/λ (the red curve over the left histogram). The number of events in any time window T is Poisson-distributed with mean λT (the red dots over the right histogram). The events themselves cluster and gap unpredictably; the apparent rhythm of a Poisson raster is an artefact of the human visual system, not a property of the process. Poisson processes are used to model radioactive decay, photon arrivals, customer-queue arrivals, and (most relevant for this bookshelf) the spike trains of auditory-nerve fibres in [Hearing Ch 5](/hearing/auditory-nerve).

The top panel is a single 5-second realisation, drawn as a spike raster — each vertical line is an event. The bottom-left histogram is the distribution of inter-spike intervals across many trials, overlaid with the theoretical exponential. The bottom-right histogram is the distribution of spike counts in a 1-second window, overlaid with the theoretical Poisson PMF.

Three things to take from playing with $\lambda$ :

The raster looks bunched. Even at constant rate, Poisson events cluster and gap unpredictably. The brain’s pattern-finding instinct insists there must be a rhythm; there is not. The apparent rhythm is an artefact of human perception, not a property of the process.
Inter-arrival times are exponential. Lots of short intervals, fewer long ones, in the precise shape $\lambda e^{-\lambda t}$ .
Counts per second are Poisson, mean = variance. The histogram is wider for higher $\lambda$, with standard deviation $\sqrt{\lambda}$ in a 1-second window, so the relative spread $1/\sqrt{\lambda}$ shrinks as $\lambda$ grows.
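For readers without the interactive figure, its two histograms can be approximated offline. The sketch below is one possible reconstruction, not the code behind the figure: it builds event times by accumulating exponential gaps, then counts events in consecutive 1-second windows and compares the mean and variance of the counts. The function name `poisson_event_times`, the duration, and the seed are assumptions.

```python
import random
from collections import Counter

def poisson_event_times(rate, duration, rng):
    """Event times in [0, duration), built by summing exponential gaps."""
    times = []
    t = rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times

rng = random.Random(2)
rate, duration = 5.0, 20_000.0          # 5 events/s, as in the figure
times = poisson_event_times(rate, duration, rng)

# Inter-spike intervals (left histogram) and counts per 1 s window (right).
isis = [b - a for a, b in zip(times, times[1:])]
per_window = Counter(int(t) for t in times)
counts = [per_window.get(w, 0) for w in range(int(duration))]

mean_isi = sum(isis) / len(isis)
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)
print(mean_isi, mean_count, var_count)
```

The mean interval should sit near $1/\lambda = 0.2$ s, and the count mean and variance should both sit near $\lambda = 5$.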

Adding Poisson processes

A useful property: the superposition of two independent Poisson processes with rates $\lambda_1$ and $\lambda_2$ is a Poisson process with rate $\lambda_1 + \lambda_2$ . Conversely, thinning a Poisson process by independently retaining each event with probability $p$ gives a Poisson process with rate $p\lambda$ .

These properties are why Poisson processes are so easy to combine and decompose. If $n$ independent, identical auditory-nerve fibres each fire as a Poisson process at rate $\lambda$, the pooled spike train is a Poisson process at rate $n\lambda$. If such a population collectively fires at rate $\Lambda$ and we subsample some fraction $p$ of the fibres, the subsample's pooled spike train is a Poisson process at rate $p\Lambda$. Both consequences are central to modelling neural populations and to the population-coding analyses of Hearing Ch 5.
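Superposition and thinning can both be checked by simulation. The sketch below merges two independent simulated processes and then keeps each merged event with probability $p$. It estimates the resulting rates and, as a rough Poisson check, the coefficient of variation (CV) of the thinned inter-event intervals, which should be near 1 for exponential intervals. The rates 3 and 7, $p = 0.25$, and the helper names are illustrative assumptions.

```python
import math
import random

def poisson_event_times(rate, duration, rng):
    """Event times in [0, duration), built by summing exponential gaps."""
    times = []
    t = rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def interval_cv(times):
    # Coefficient of variation (std / mean) of the gaps between events.
    gaps = [b - a for a, b in zip(times, times[1:])]
    m = sum(gaps) / len(gaps)
    v = sum((g - m) ** 2 for g in gaps) / (len(gaps) - 1)
    return math.sqrt(v) / m

rng = random.Random(3)
duration = 2000.0
a = poisson_event_times(3.0, duration, rng)
b = poisson_event_times(7.0, duration, rng)

merged = sorted(a + b)                              # superposition
thinned = [t for t in merged if rng.random() < 0.25]  # independent thinning

merged_rate = len(merged) / duration     # expect about 3 + 7 = 10
thinned_rate = len(thinned) / duration   # expect about 0.25 * 10 = 2.5
print(merged_rate, thinned_rate, interval_cv(merged), interval_cv(thinned))
```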

When the Poisson model breaks

The Poisson process is the simplest possible model of stochastic events. Real neural firing is not perfectly Poisson — most real spike trains show:

Refractory periods. After firing, a neuron cannot fire again for several milliseconds. This suppresses short inter-spike intervals below their Poisson prediction.
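Rate variation in time. If the underlying firing rate $\lambda(t)$ varies deterministically in time (a non-homogeneous Poisson process), counts in a window are still Poisson, with mean $\int \lambda(t)\, dt$ over the window, but the inter-spike intervals are no longer i.i.d. exponential. If the rate itself fluctuates randomly, for example from trial to trial (a doubly stochastic, or Cox, process), the count statistics show extra variance: $\mathrm{Var}[N] > \mathbb{E}[N]$, called over-dispersion.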
Adaptation. Many neurons fire faster transiently after a stimulus onset and then settle to a lower steady rate. Adaptation produces non-Poisson temporal structure.

The Fano factor $F = \mathrm{Var}[N] / \mathbb{E}[N]$ measures departure from Poisson: $F = 1$ exactly for Poisson, $F < 1$ for refractory-period-dominated regularity, $F > 1$ for over-dispersion. Auditory-nerve fibres typically have $F \approx 1$ at moderate firing rates, dropping below 1 at high rates where the refractory period dominates.
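The three Fano-factor regimes can be illustrated with toy simulations. The sketch below is a deliberately simplified model, not a description of real auditory-nerve fibres. It assumes a fixed 5 ms dead time after each spike for the refractory case, and a rate drawn at random per trial from {50, 150} spikes per second for the over-dispersed case. All names and parameter values are assumptions.

```python
import random

def count_events(rate, window, rng, dead_time=0.0):
    """Count events in [0, window) with exponential gaps plus a dead time."""
    n = 0
    t = rng.expovariate(rate)
    while t < window:
        n += 1
        t += dead_time + rng.expovariate(rate)
    return n

def fano(counts):
    # Sample variance of the counts divided by their sample mean.
    m = sum(counts) / len(counts)
    v = sum((c - m) ** 2 for c in counts) / (len(counts) - 1)
    return v / m

rng = random.Random(4)
trials, window = 2000, 1.0

plain = [count_events(100.0, window, rng) for _ in range(trials)]
refractory = [count_events(100.0, window, rng, dead_time=0.005)
              for _ in range(trials)]
# Doubly stochastic toy model: the rate itself is random on each trial.
mixed = [count_events(rng.choice([50.0, 150.0]), window, rng)
         for _ in range(trials)]

fano_plain = fano(plain)            # expect close to 1
fano_refractory = fano(refractory)  # expect below 1
fano_mixed = fano(mixed)            # expect well above 1
print(fano_plain, fano_refractory, fano_mixed)
```

In this toy model the dead time makes the spike train more regular than Poisson, while the trial-to-trial rate changes add variance on top of the Poisson variance.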

What we use this for

Poisson processes (and the Poisson distribution as their count statistic) are everywhere in noise and event statistics:

Auditory-nerve spike trains (Hearing Ch 5) — Poisson is the baseline; departures from Poisson encode acoustic information.
Photon counting at low light intensities — single-photon-counting detectors record Poisson-distributed counts; the $\sqrt{N}$ shot-noise scaling sets the fundamental noise floor.
Radioactive decay: a classic Poisson application. The number of decays in a time interval is very nearly Poisson-distributed when the sample is large and the interval is short compared with the half-life.
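Network queueing: packet arrivals on a network link, customer arrivals at a server. Poisson arrivals are the classical starting assumption of Erlang-style queueing theory, although measured packet traffic is often burstier than a Poisson model predicts.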
Random photon emission in spontaneous fluorescence, and photon shot noise in imaging.
The “Poisson approximation” to the binomial — used whenever you have many independent rare events, e.g. radioactive decay counts, mutation counts, defect counts on a chip.

The next lesson, 11.5, develops Bayesian inference and signal detection theory — the inferential machinery that combines a likelihood (often based on Gaussian or Poisson noise from this and previous lessons) with a prior to produce a posterior belief.