4.3 Linear systems: solving Ax = b
The everyday task of linear algebra: given an matrix and an -vector , find the vector such that
Concretely this is a system of linear equations in unknowns. For example,
written as the single matrix equation
There are infinitely many ways to phrase this problem and one canonical algorithm that solves it: Gaussian elimination. The algorithm is simple, deterministic, and is exactly the row-reduction procedure that high-school algebra teaches without ever using the matrix word.
Three things that can happen
Before the algorithm, the structure. A square linear system has one of exactly three outcomes:
- A unique solution. This is the generic case — it happens precisely when is invertible, equivalently when , equivalently when the columns of are linearly independent.
- No solution. The system is inconsistent: the equations contradict each other. Geometrically (for two equations in two unknowns): two parallel lines that never meet. Algebraically: Gaussian elimination produces a row of the form ” nonzero”, which can’t be satisfied.
- Infinitely many solutions. The equations are consistent but redundant: some are linear combinations of others. Geometrically (for three planes in 3-D): they intersect in a line or a plane rather than a point. Algebraically: elimination produces a row of all zeros, and at least one variable is free — it can take any value, with the other variables determined by it.
Cases 2 and 3 both correspond to — a singular matrix. The distinction between them is whether happens to lie in the span of ‘s columns (case 3) or doesn’t (case 2). Gaussian elimination tells you which case you’re in without you having to think about it: it just runs and you read off the answer from the final form.
The algorithm
Gaussian elimination works on the augmented matrix — the matrix with the right-hand-side vector tacked on as an extra column:
Three elementary row operations are allowed, each of which preserves the solution set of the underlying linear system:
- Swap two rows.
- Multiply a row by a non-zero scalar.
- Add a scalar multiple of one row to another.
The procedure is to use these operations to bring the matrix into upper-triangular form (zeros below the diagonal) — also called row echelon form. Once you have that, you read the answer from the bottom up by back substitution.
The standard pattern is: pick the top-left entry as the first pivot, use operation 3 to zero out everything below it in column 1, move to position (2, 2) as the next pivot, zero out below it, and so on. If the pivot position is already zero, swap rows to bring a non-zero entry there first.
| 2 | 1 | -1 | | | 8 |
| -3 | -1 | 2 | | | -11 |
| -2 | 1 | 2 | | | -3 |
Each step performs one elementary row operation — a row swap or a "subtract a multiple of one row from another". The pivot column for the current step is highlighted. Once the matrix is in upper-triangular form (echelon), the solution is read off by back-substitution from the last row up. The inconsistent and dependent-rows presets show the two ways a 3×3 system can fail to have a unique solution: an "0 = nonzero" contradiction (no solution) or a "0 = 0" tautology (a whole line or plane of solutions).
Step through the algorithm on four representative systems: a standard with a unique solution, one that needs a row swap to find a pivot, an inconsistent system that produces a "" contradiction, and a dependent system whose third equation is a linear combination of the first two.
Worked example
▶ Worked example: every step, no shortcuts
The problem. Solve
Step 1 — Write the augmented matrix.
Step 2 — Eliminate the first column below the pivot . We want zeros in positions and .
For row 2: .
For row 3: .
Matrix now:
Step 3 — Eliminate the second column below the pivot . We want a zero in position . Multiplier: .
:
Matrix now in upper-triangular form:
Step 4 — Back-substitute. Read from the bottom up.
Row 3 says , so .
Row 2 says . Substitute : , so .
Row 1 says . Substitute and : , so and .
Step 5 — Final answer.
Step 6 — Sanity-check by substitution.
- ✓
- ✓
- ✓
What if A isn’t square?
If the system has more equations than unknowns (overdetermined) or fewer equations than unknowns (underdetermined), the same elimination machinery still applies, but the answer is no longer a single point.
- Overdetermined systems (more equations than unknowns) typically have no solution, because the equations over-constrain the unknowns. The best we can do is find the that minimises — the least-squares solution, computed using the normal equations . This is the algorithm behind every regression model.
- Underdetermined systems (fewer equations than unknowns) typically have infinitely many solutions, forming an affine subspace. Elimination produces free variables that parameterise the solution space.
Both extensions belong to “applied linear algebra” rather than the working subset this chapter covers. Knowing they exist is enough; the algorithms reduce, under the hood, to the same elimination procedure used here.
Connection to invertibility
Gaussian elimination also tells you whether is invertible without computing the inverse: is invertible if and only if reducing produces without running into a row of zeros on the side. This is the same algorithm as solving , run with replaced by the columns of one at a time.
Computing inverses directly is rarely what one wants in practice — solving by elimination is faster and more numerically stable. The conceptual point (” is invertible has a unique solution for every ”) matters; the practical action — “compute the inverse” — usually does not.
What we use this for
Linear systems are everywhere. A short, representative list:
- Discretised differential equations. Every numerical method for a PDE — finite differences, finite elements, spectral collocation — turns the continuous problem into a large linear system . The methods of Foundations 10 all reduce to “solve this linear system efficiently.”
- Mode amplitudes. In separation of variables (Foundations 6.3), the Fourier-projection step that pins down the coefficients is, when the basis is non-orthogonal, a linear system in the coefficients.
- Circuit analysis, mechanics of trusses, network flow problems, balance laws of every kind. Anything where conserved quantities are accumulated at nodes produces a linear system on those node values.
- Least-squares regression. The “best fit line through these points” is the least-squares solution of an overdetermined linear system.
The next lesson, 4.4 — Eigenvalues and eigenvectors, takes us in a different direction: rather than solving for arbitrary , we ask which special inputs the matrix sends to scalar multiples of themselves. The answer turns out to be the unifying concept across linear algebra.