10.6 Root-finding: bisection and Newton’s method
Many numerical tasks reduce to finding a root of a function: given , find such that . The eigenvalues of a matrix are the roots of its characteristic polynomial. The natural frequencies of a damped oscillator are the roots of . The intersection of two curves is the root of their difference. The implicit equations that arise in numerical PDE-solvers (every step of every implicit-time-stepping scheme) need a root-finder.
This lesson covers the two basic algorithms — bisection (robust, slow) and Newton’s method (fast, fragile) — and points at the suite of safeguarded methods that combine both in production code.
Bisection: robust and slow
Bisection requires a function that is continuous on an interval with opposite signs at the endpoints: . The intermediate value theorem guarantees a root somewhere in . The algorithm:
- Evaluate at the midpoint .
- If has the same sign as , the root is in — set .
- Otherwise the root is in — set .
- Stop when .
Each step halves the interval, so after steps the root is bracketed to within . To reduce the bracket by a factor of takes iterations — about 50 to reach machine precision starting from any reasonable bracket.
Bisection is robust: it always converges, and the convergence is monotone — every step shrinks the bracket. The downside is that each step gains only one bit of precision. For high-precision work or for functions where evaluations are expensive, bisection alone is too slow.
Newton’s method: fast and fragile
Newton’s method uses the derivative to take much bigger steps. Start with a guess near the root. Linearise around — replace by its tangent line — and solve for where that tangent crosses zero. Use the result as the next guess. Repeat.
The tangent line at is . Setting and solving for :
This is Newton’s iteration. Each step costs one evaluation and one evaluation. If is hard to compute analytically there’s a closely-related secant method that uses a finite-difference estimate of .
Click anywhere along the curve to set the starting point x₀. The red lines are the tangents at each iterate; Newton's method linearises f at x and jumps to where that tangent meets the x-axis. Where the iterates land on a true root they turn green. For x² − 2 the method converges quadratically from almost any start. For the cubic, a poorly-chosen start can oscillate or land on the wrong root. The logistic example has three roots and exhibits extreme sensitivity to x₀: tiny shifts in starting position send the iterates to different roots — the boundary between basins of attraction is fractal.
Click anywhere on the curve to set the starting point. The red lines are the tangents at each iterate; Newton’s method linearises at and jumps to where that tangent meets the -axis. With a good starting point, Newton finds the root in a handful of iterations — but with a bad starting point, the iterates can wander to a different root, oscillate, or diverge. The logistic preset shows the most dramatic version: three roots close together, and tiny shifts in send the iterates to completely different roots.
Quadratic convergence
The reason Newton is fast: each step squares the error. If , then
Taylor-expand around : , and . Plugging into the iteration and simplifying:
The error at step is proportional to the square of the error at step . This is quadratic convergence. The number of correct digits doubles with each iteration. Starting from an error of , you get errors — machine precision after four iterations.
By contrast, bisection’s convergence is linear: each step multiplies the error by , so errors go — needing iterations to reach .
Worked example
▶ Worked example: Newton's method on f(x) = x² − 2
The problem. Find by solving .
Newton’s iteration: , so
Start with .
| | | | error | |---|---|---|---| | 0 | 1.000000000000 | | 0.4142136 | | 1 | 1.500000000000 | | 0.0857864 | | 2 | 1.416666666667 | | 0.0024531 | | 3 | 1.414215686275 | | | | 4 | 1.414213562374 | | | | 5 | 1.414213562373 | | 0 |
Four iterations to 12-digit precision, five to machine precision. The error at each step is roughly the square of the previous error — classic quadratic convergence.
This iteration is the Babylonian method for square roots, known for ~4000 years. Newton’s method generalises it.
When Newton fails
Newton’s quadratic convergence is real but only kicks in near the root, with a smooth that has . Failure modes:
- Bad starting point. Newton finds a root, but maybe not the one you want. Or the iterates jump out of the domain. Or they oscillate around an inflection point.
- near zero. The denominator in the update vanishes. The iterate flies off to infinity. Happens near local minima/maxima of .
- Multiple roots (). Convergence degrades from quadratic to linear, and accuracy suffers.
- Chaotic basins of attraction. When has multiple roots, the set of starting points that converges to each root can be fractal (Newton fractals). The logistic preset in the interactive demonstrates this.
The standard production-grade root-finder is safeguarded Newton: take a Newton step if it lands inside a bracket where the function changes sign, fall back to bisection if it doesn’t. Brent’s method is the standard implementation; it’s what scipy.optimize.brentq runs. It has Newton’s quadratic convergence when conditions are good and bisection’s guaranteed convergence when they aren’t.
Higher-dimensional Newton
For systems of nonlinear equations with , Newton generalises:
where is the Jacobian matrix of partial derivatives. Each Newton step now requires (1) computing the Jacobian and (2) solving a linear system — which uses Gaussian elimination from Foundations 4.3. For large systems the linear-solve cost dominates, and quasi-Newton methods (BFGS and friends) approximate the Jacobian to avoid recomputing it from scratch each step.
This is the algorithm behind every implicit time-stepping scheme for stiff ODEs and parabolic PDEs: at each time step the recurrence is implicit in , so a Newton solve happens inside the time loop. It’s also the algorithm behind much of nonlinear optimisation, because finding a local minimum of is equivalent to solving .
What we use this for
The bookshelf’s interactives invoke root-finding more often than is obvious:
- CharacteristicTrickCarousel (MATH-28) — finds eigenvalues of the characteristic polynomial. For 2×2 this is the quadratic formula; for larger it would be a numerical polynomial-root algorithm like Jenkins–Traub or QR iteration on the companion matrix.
- FixedPointGallery (MATH-29) — eigenvalues of 2×2 matrices, again by the quadratic formula.
- Implicit time-stepping for stiff systems (not currently used in any interactive but the standard tool for real simulations).
Closing the chapter
This lesson closes Foundations 10 and the Math Foundations book in its current form. The chapter started with the question “how do simulations actually work?” and ended with the algorithms that produce numerical answers from continuous equations. The arc, in one paragraph: every continuous quantity becomes a finite array of numbers (10.1); time-marching schemes integrate ODEs and PDEs (10.2, 10.3); iterative averaging solves elliptic problems (10.4); the FFT computes frequency-domain transforms in (10.5); and Newton’s method finds roots and ties everything together when nonlinear systems show up.
Every interactive on this site is one of these algorithms running inside the browser. The point of the chapter is that none of them are black boxes — they are all techniques you can re-implement yourself in fifty lines of code.