4.4 Eigenvalues and eigenvectors

This is the lesson the rest of the bookshelf has been quietly assuming. The “eigenvalues of A” you have already met in the ODE phase plane (Foundations 5.4), in modal analysis of PDEs (Foundations 6.5), in Helmholtz cavity modes (Foundations 6.7), and in the time-independent Schrödinger equation (Foundations 6.8) — all of those are instances of the same idea: a matrix or operator has special directions it merely stretches or contracts rather than rotates. Find those directions, and the linear map’s behaviour becomes transparent.

The defining equation

A non-zero vector $\mathbf{v}$ is an eigenvector of the matrix $A$ if it satisfies

$$A \mathbf{v} \;=\; \lambda \mathbf{v}$$

for some scalar $\lambda$. The number $\lambda$ is the corresponding eigenvalue. The equation says: when $A$ acts on $\mathbf{v}$, the result is $\mathbf{v}$ itself scaled by $\lambda$, with no rotation and no twisting, just a stretch by factor $\lambda$ (or a contraction if $|\lambda| < 1$, or a flip if $\lambda < 0$).

Most vectors are not eigenvectors: most directions get rotated when $A$ acts. The eigenvectors are the special directions left invariant up to scaling. Their existence and structure are what make many matrix problems tractable.
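As a minimal sketch of the defining equation, the following checks numerically whether a given vector is an eigenvector of a small matrix. The helper names `mat_vec` and `eigenvalue_if_eigenvector`, the tolerance, and the diagonal example matrix are illustrative assumptions, not part of the lesson.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def eigenvalue_if_eigenvector(A, v, tol=1e-12):
    """Return lambda if A v = lambda v (within tol), otherwise None."""
    if all(abs(x) <= tol for x in v):
        return None  # the zero vector is excluded by definition
    w = mat_vec(A, v)
    # Estimate lambda from the largest component of v, then test every component.
    k = max(range(len(v)), key=lambda i: abs(v[i]))
    lam = w[k] / v[k]
    if all(abs(wi - lam * vi) <= tol for wi, vi in zip(w, v)):
        return lam
    return None


# Hypothetical example: stretch the x-axis by 2, shrink the y-axis by half.
A = [[2.0, 0.0], [0.0, 0.5]]
print(eigenvalue_if_eigenvector(A, [1.0, 0.0]))  # x-axis direction
print(eigenvalue_if_eigenvector(A, [0.0, 1.0]))  # y-axis direction
print(eigenvalue_if_eigenvector(A, [1.0, 1.0]))  # a generic direction
```

The two axis directions come back with their scale factors, while the diagonal direction returns `None` because the matrix turns it as well as scaling it.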

Why eigenvectors matter

If $A$ is an $n \times n$ matrix with $n$ linearly independent eigenvectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ with eigenvalues $\lambda_1, \ldots, \lambda_n$, then any vector $\mathbf{x}$ can be written as a linear combination

$$\mathbf{x} \;=\; c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_n \mathbf{v}_n.$$

Applying $A$ to both sides and using linearity plus the eigenvalue equation:

$$A \mathbf{x} \;=\; c_1 \lambda_1 \mathbf{v}_1 + c_2 \lambda_2 \mathbf{v}_2 + \cdots + c_n \lambda_n \mathbf{v}_n.$$

The action of $A$ on $\mathbf{x}$ has become componentwise multiplication in the eigenvector basis: each component $c_k$ just gets multiplied by its eigenvalue $\lambda_k$. This is the diagonalisation trick: in the right basis, $A$ looks like a diagonal matrix, and diagonal matrices are trivial to power, exponentiate, or invert.

This is the deep mathematical reason mode expansions work in PDEs (Foundations 6.5). The modes are the eigenfunctions of the spatial differential operator; the operator acts on a sum-over-modes by multiplying each mode by its eigenvalue. The same algebraic move that makes a finite-dimensional matrix easy to handle makes an infinite-dimensional linear differential operator solvable.
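The diagonalisation trick can be sketched in a few lines: to apply a matrix many times, expand the vector in the eigenvector basis and multiply each coefficient by a power of its eigenvalue. The matrix below and its eigenpairs (eigenvalue 3 along (1, 1), eigenvalue 1 along (1, -1)) are an assumed illustration, and the function names are hypothetical.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Assumed example matrix with known eigenpairs (lambda, eigenvector).
A = [[2, 1], [1, 2]]
eigenpairs = [(3, [1, 1]), (1, [1, -1])]


def coefficients(x):
    """Solve x = c1*(1, 1) + c2*(1, -1) by hand for this particular basis."""
    return [(x[0] + x[1]) / 2, (x[0] - x[1]) / 2]


def power_via_eigenbasis(x, k):
    """Compute A^k x by scaling each eigen-component by lambda^k."""
    result = [0.0, 0.0]
    for c, (lam, v) in zip(coefficients(x), eigenpairs):
        scale = c * lam ** k
        result = [r + scale * vi for r, vi in zip(result, v)]
    return result


def power_direct(x, k):
    """Compute A^k x by k successive matrix-vector products."""
    for _ in range(k):
        x = mat_vec(A, x)
    return x


x = [5, 1]
print(power_via_eigenbasis(x, 10))
print(power_direct(x, 10))
```

Both routes give the same vector, but the eigenbasis route needs only two scalar powers regardless of how large the exponent is.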

Finding eigenvalues

To find the eigenvalues of a given matrix $A$, rearrange the defining equation:

$$A \mathbf{v} = \lambda \mathbf{v} \quad\Longleftrightarrow\quad (A - \lambda I) \mathbf{v} = \mathbf{0}.$$

We want this to have a non-zero solution $\mathbf{v}$ (eigenvectors aren’t allowed to be the zero vector; that would be trivial). A square linear system has a non-zero solution exactly when its matrix is singular, i.e. has determinant zero. So the eigenvalues are the roots of

$$\boxed{\;\det(A - \lambda I) \;=\; 0,\;}$$

which is a polynomial equation in $\lambda$ of degree $n$; its left-hand side is the characteristic polynomial of $A$. An $n \times n$ matrix therefore has exactly $n$ eigenvalues over the complex numbers when counted with multiplicity, and at most $n$ distinct ones. Once each eigenvalue is known, the corresponding eigenvector is found by solving the homogeneous linear system $(A - \lambda I) \mathbf{v} = \mathbf{0}$: Gaussian elimination from 4.3, with a singular matrix on purpose.

For a $2 \times 2$ matrix the characteristic polynomial is a quadratic:

$$\det(A - \lambda I) \;=\; \lambda^2 - (\mathrm{tr}\, A)\, \lambda + \det A,$$

where $\mathrm{tr}\, A = a_{11} + a_{22}$ is the trace (sum of diagonal entries). The quadratic formula gives the two eigenvalues directly:

$$\lambda_{\pm} \;=\; \frac{\mathrm{tr}\,A}{2} \pm \sqrt{\left(\frac{\mathrm{tr}\,A}{2}\right)^2 - \det A}.$$

This is exactly the formula that appears in the ODE phase-plane analysis with $A$ as the linearised flow matrix; the sign of the discriminant determines whether the eigenvalues are real distinct (overdamped), real repeated (critical), or complex conjugate (underdamped, spirals).
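The trace-determinant formula translates directly into code. The sketch below uses `cmath.sqrt` from the Python standard library so that a negative discriminant yields a complex-conjugate pair instead of an error; the function names and the two test matrices are illustrative choices.

```python
import cmath


def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = A
    half_trace = (a + d) / 2
    det = a * d - b * c
    disc = half_trace ** 2 - det   # the discriminant under the square root
    root = cmath.sqrt(disc)        # complex square root handles disc < 0
    return half_trace + root, half_trace - root


def classify(A):
    """Name the case selected by the sign of the discriminant."""
    (a, b), (c, d) = A
    disc = ((a + d) / 2) ** 2 - (a * d - b * c)
    if disc > 0:
        return "real distinct"
    if disc == 0:
        return "real repeated"
    return "complex conjugate"


symmetric = [[3, 1], [1, 3]]
rotation = [[0, -1], [1, 0]]
print(eigenvalues_2x2(symmetric), classify(symmetric))
print(eigenvalues_2x2(rotation), classify(rotation))
```

The exact comparison `disc == 0` is only reliable for small integer entries like these; with floating-point data a tolerance would be the safer choice.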

Worked example: a 2x2 matrix by hand

Worked example: every step, no shortcuts

The problem. Find the eigenvalues and eigenvectors of

$$A \;=\; \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}.$$

Step 1: Set up the characteristic equation. $A - \lambda I$ is

$$\begin{pmatrix} 3 - \lambda & 1 \\ 1 & 3 - \lambda \end{pmatrix},$$

and the determinant is

$$\det(A - \lambda I) \;=\; (3 - \lambda)^2 - 1 \cdot 1 \;=\; (3 - \lambda)^2 - 1.$$

Step 2: Solve. Expand $(3 - \lambda)^2 - 1 = \lambda^2 - 6\lambda + 9 - 1 = \lambda^2 - 6\lambda + 8 = 0$. Quadratic formula or factoring:

$$\lambda^2 - 6\lambda + 8 = (\lambda - 4)(\lambda - 2) = 0.$$

So $\lambda_1 = 4$ and $\lambda_2 = 2$. Two distinct real eigenvalues.

(Cross-check: trace = $3 + 3 = 6 = \lambda_1 + \lambda_2$. Determinant = $9 - 1 = 8 = \lambda_1 \lambda_2$. ✓ The trace always equals the sum of eigenvalues; the determinant always equals their product.)

Step 3: Find the eigenvector for $\lambda_1 = 4$. Solve $(A - 4 I) \mathbf{v} = \mathbf{0}$:

$$A - 4 I \;=\; \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}.$$

The two rows are negatives of each other (as expected, since the matrix is singular). The single non-trivial equation is $-v_1 + v_2 = 0$, i.e. $v_2 = v_1$. Any non-zero multiple of $(1, 1)$ works. Conventionally we normalise to unit length:

$$\mathbf{v}_1 \;=\; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$

Step 4: Find the eigenvector for $\lambda_2 = 2$. Solve $(A - 2 I) \mathbf{v} = \mathbf{0}$:

$$A - 2 I \;=\; \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.$$

The non-trivial equation is $v_1 + v_2 = 0$, i.e. $v_2 = -v_1$. Take $\mathbf{v} = (1, -1)$ and normalise:

$$\mathbf{v}_2 \;=\; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

Step 5: Verify. Check that $A \mathbf{v}_1 = \lambda_1 \mathbf{v}_1$:

$$\begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 4 \\ 4 \end{pmatrix} = 4 \cdot \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \lambda_1 \mathbf{v}_1. \;\checkmark$$

And $A \mathbf{v}_2 = \lambda_2 \mathbf{v}_2$:

$$\begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 2 \\ -2 \end{pmatrix} = 2 \cdot \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \lambda_2 \mathbf{v}_2. \;\checkmark$$

Step 6: Notice. The two eigenvectors are orthogonal: $\mathbf{v}_1 \cdot \mathbf{v}_2 = \tfrac12 (1 \cdot 1 + 1 \cdot (-1)) = 0$. This is not a coincidence: the matrix $A$ here is symmetric ($A^T = A$), and eigenvectors of a real symmetric matrix that belong to distinct eigenvalues are always orthogonal. That fact is part of the spectral theorem for symmetric matrices, which we will meet in full generality in 4.6. It is what makes self-adjoint operators in PDEs so well-behaved.
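Steps 3 to 6 can be mirrored in code. For a singular $2 \times 2$ matrix with a non-zero first row $(m_{11}, m_{12})$, the vector $(m_{12}, -m_{11})$ solves the homogeneous system, which avoids a general elimination routine. This is a sketch that assumes `lam` really is a real eigenvalue of `A`; the function name is hypothetical.

```python
import math


def unit_eigenvector_2x2(A, lam, tol=1e-12):
    """Unit eigenvector of a 2x2 matrix A for a known real eigenvalue lam."""
    (a, b), (c, d) = A
    m11, m12, m21, m22 = a - lam, b, c, d - lam   # entries of A - lam*I
    if abs(m11) > tol or abs(m12) > tol:
        v = (m12, -m11)       # orthogonal to the first row
    elif abs(m21) > tol or abs(m22) > tol:
        v = (m22, -m21)       # first row is zero, so use the second
    else:
        v = (1.0, 0.0)        # A - lam*I is zero: every direction works
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)


A = [[3, 1], [1, 3]]
v1 = unit_eigenvector_2x2(A, 4)
v2 = unit_eigenvector_2x2(A, 2)
print(v1, v2)
print(v1[0] * v2[0] + v1[1] * v2[1])   # dot product of the two eigenvectors
```

An eigenvector is only defined up to a non-zero scalar, so a different method may return either vector with the opposite sign.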

When eigenvalues are complex

Not every matrix has real eigenvalues. The matrix

$$R \;=\; \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

is a $90^\circ$ rotation. Its characteristic polynomial is $\lambda^2 + 1 = 0$, with roots $\lambda = \pm i$. No real direction is preserved under a $90^\circ$ rotation: every arrow turns ninety degrees. The eigenvalues are complex, and the corresponding eigenvectors live in $\mathbb{C}^2$ rather than $\mathbb{R}^2$. The complex eigenvectors are useful: they let you express the rotation as componentwise complex multiplication in the right basis, recovering the connection to complex exponentials from Foundations 3.

In acoustics and elsewhere, complex eigenvalues typically signal oscillation — they appear in the underdamped regime of the damped oscillator, in the spiral fixed points of Foundations 5.4, and in any time-evolution problem with rotational character.
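Python’s built-in complex numbers make the rotation example easy to check. The eigenvectors $(1, -i)$ for $\lambda = i$ and $(1, i)$ for $\lambda = -i$ used below are not stated in the lesson; they come from solving $(R - \lambda I)\mathbf{v} = \mathbf{0}$ by hand, and the sketch only verifies them.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


R = [[0, -1], [1, 0]]   # the 90-degree rotation matrix

# Candidate eigenpairs (lambda, v), worked out by hand.
pairs = [(1j, [1, -1j]), (-1j, [1, 1j])]

for lam, v in pairs:
    Rv = mat_vec(R, v)
    lam_v = [lam * x for x in v]
    # R v and lambda v should agree component by component.
    residual = max(abs(p - q) for p, q in zip(Rv, lam_v))
    print(lam, Rv, lam_v, residual)
```

Multiplying a component by $i$ is itself a quarter turn in the complex plane, which is how the rotation reappears in the eigenvector basis.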

Power iteration: finding eigenvectors numerically

For large matrices the characteristic polynomial is impractical to solve directly. The simplest numerical alternative is power iteration: pick an arbitrary unit vector $\mathbf{v}_0$ and repeatedly apply $A$, renormalising at each step:

$$\mathbf{v}_{k+1} \;=\; \frac{A \mathbf{v}_k}{\|A \mathbf{v}_k\|}.$$

Generically, $\mathbf{v}_k$ converges to the eigenvector associated with the eigenvalue of largest absolute value (the dominant eigenvalue). The intuition: writing $\mathbf{v}_0 = \sum c_i \mathbf{v}_i$ in the eigenvector basis, applying $A$ many times gives $A^k \mathbf{v}_0 = \sum c_i \lambda_i^k \mathbf{v}_i$. Whichever $\lambda_i$ is largest in absolute value, its term grows fastest, and after enough iterations the sum is dominated by that one mode, provided $\mathbf{v}_0$ has a non-zero component along it.

Once the dominant eigenvector is known, the dominant eigenvalue follows from the Rayleigh quotient $\mathbf{v}^T A \mathbf{v} / \mathbf{v}^T \mathbf{v}$, which approaches the dominant eigenvalue as $\mathbf{v}_k$ approaches its eigenvector.
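A minimal sketch of power iteration with the Rayleigh quotient, applied to the symmetric matrix from the worked example. The fixed step count and the starting vector are arbitrary assumptions; a practical version would stop once the vector no longer changes.

```python
import math


def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def power_iteration(A, v0, steps=50):
    """Return (Rayleigh quotient, unit vector) after `steps` iterations."""
    norm = math.sqrt(dot(v0, v0))
    v = [x / norm for x in v0]
    for _ in range(steps):
        w = mat_vec(A, v)
        norm = math.sqrt(dot(w, w))   # fails if A maps v to the zero vector
        v = [x / norm for x in w]
    rayleigh = dot(v, mat_vec(A, v)) / dot(v, v)
    return rayleigh, v


A = [[3.0, 1.0], [1.0, 3.0]]
lam, v = power_iteration(A, [1.0, 0.0])
print(lam, v)
```

The starting vector $(1, 0)$ has equal components along both eigenvectors, and each step shrinks the unwanted component by the ratio $\lambda_2 / \lambda_1 = 1/2$, so fifty steps are far more than this small example needs.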

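[Interactive figure: power iteration on a 2×2 matrix chosen from a preset list. Initial state shown: iteration k = 0; current vector v_0 = (0.643, 0.766); Rayleigh quotient vᵀA v / vᵀv = 1.906; true eigenvalues λ₁ = 2.207, λ₂ = 0.793; dominant eigenvector ≈ (-0.924, -0.383).]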

Each click of iterate applies A to the current unit vector and renormalises. Generically the result converges to the dominant eigenvector — drawn as the dashed gold line for reference when the eigenvalues are real. The Rayleigh quotient vᵀA v / vᵀv approaches the dominant eigenvalue at the same time. The complex eigenvalues preset shows what happens with no real eigenvector to attract toward: the vector spins around indefinitely. The shear (degenerate) preset has a single repeated eigenvalue and one eigenvector — convergence still happens but is slower.

Pick a matrix preset, set the starting direction $\theta_0$, and step the iteration. Watch the vector swing toward the dominant eigenvector (the dashed gold line). The Rayleigh quotient on the side converges to the dominant eigenvalue in parallel. The complex eigenvalues preset shows what happens when no real eigenvector exists: the iteration rotates rather than converging.

Power iteration is the simplest of a family of iterative eigenvalue algorithms that scale to large matrices. The PageRank algorithm with which Google launched in 1998 is, at its core, power iteration on a web-scale link matrix with one row and column per page; the eigenvector of the largest eigenvalue ranks web pages by importance. Modal analysis in audio, principal component analysis (PCA) in image compression, and many machine-learning algorithms all reduce, at bottom, to “find the dominant eigenvectors of this matrix.”

The history — From Cayley to Hilbert: a century building the spectral theorem

Matrix algebra as we know it was assembled by Arthur Cayley and James Joseph Sylvester in the 1850s in England. Cayley’s 1858 Memoir on the Theory of Matrices defined matrix addition, multiplication, and the characteristic polynomial (the equation $\det(A - \lambda I) = 0$ from this lesson). Sylvester coined the word “matrix” in 1850 and introduced “discriminant” and “minor” along with much of the modern vocabulary. The two were friends and lifelong correspondents; the era is sometimes called the Cayley–Sylvester period of algebra.

The eigenvalue–eigenvector machinery was fully understood for finite matrices by the 1880s. The leap to infinite dimensions (operators on function spaces, the natural home of PDEs and quantum mechanics) was made by David Hilbert in the early 1900s, in his work on integral equations. Hilbert’s six papers from 1904–1910 established what we now call Hilbert space, and the proof that compact self-adjoint operators on a Hilbert space have a complete orthonormal eigenbasis is the spectral theorem in its simplest infinite-dimensional form, the deepest result in the chain. The full machinery was reformulated and extended in the late 1920s and early 1930s by John von Neumann, who had worked with Hilbert in Göttingen, providing the mathematical foundation that Werner Heisenberg’s matrix mechanics and Erwin Schrödinger’s wave mechanics needed to be the same theory. Eigenvalues, in other words, ran the central arc of mathematical physics from 1850 to 1930.

What we use this for

A short tour of where eigenvalues appear in the bookshelf:

Whenever you see “in the right basis the problem becomes diagonal,” that is eigenvalue analysis happening. The next lesson, 4.5, develops the inner product structure that makes the “right basis” of orthogonal eigenvectors usable for projection — the formal underpinning of Fourier expansion and modal decomposition.