Preface #
In practical problems, obtaining exact or analytical solutions to systems of algebraic, differential, or integral equations is often challenging, especially when nonlinearities are involved. As a result, iterative methods are commonly used to generate approximate solutions. For example, solving the equation $F(x) = 0$
can be reformulated as solving an equivalent equation of the form:
$$x=x-\alpha F(x)=f(x). \tag{1} \label{eq1}$$
The solution $\tilde{x}$
to Equation $\eqref{eq1}$
is defined as a fixed point
of $f$
, indicating that the point is mapped to itself by $f$
, i.e., $f(\tilde{x})=\tilde{x}$
. Therefore, solving $F(x) = 0$
is equivalent to using the fixed-point iteration $x_{k+1}=f(x_k)$
to find a fixed point of a suitably defined $f$
.
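As a quick illustration, the following Python sketch runs the fixed-point iteration $x_{k+1}=f(x_k)=x_k-\alpha F(x_k)$ until successive iterates stop changing; the particular choices of $F$ and $\alpha$ are assumptions made only for demonstration.

```python
def fixed_point_iteration(f, x0, tol=1e-10, max_iter=100):
    """Iterate x_{k+1} = f(x_k) until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Illustrative choice (an assumption for demonstration): F(x) = x^3 + x - 1
# with alpha = 0.3, so that f(x) = x - 0.3 * F(x) is a contraction on [0, 1].
F = lambda x: x**3 + x - 1
alpha = 0.3
root = fixed_point_iteration(lambda x: x - alpha * F(x), x0=0.5)
print(root, F(root))  # F(root) should be close to zero
```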
Contractions and the Contraction Theorem #
Definition 1 (Contraction). Let $\left(X,d\right)$ be a metric space. A mapping $T: X\mapsto X$ is called a contraction if there exists $\alpha\in (0,1)$ such that
$$d\left(T(x),T(y)\right)\leq\alpha d\left(x,y\right),\ \forall x,y\in X. \tag{2} \label{eq2}$$
Geometrically, this means that the images $T(x)$ and $T(y)$ are closer together under the metric $d$ than the points $x$ and $y$ are.
For any given $\varepsilon>0$
, there exists $\delta(\varepsilon)=\varepsilon/\alpha>0$
such that $\forall x,y\in X$
, if $d\left(x,y\right)<\delta$
, then $d\left(T(x),T(y)\right)\leq\alpha d\left(x,y\right)=\frac{\varepsilon}{\delta}d\left(x,y\right)<\varepsilon$
. Since $\delta=\varepsilon/\alpha$
is independent of $x$
, the contraction $T$ is uniformly continuous
.
Theorem 1 (Banach Fixed-Point Theorem or Contraction Theorem). Let $\left(X,d\right)$ be a complete metric space with a contraction mapping $T: X\mapsto X$. Then $T$ admits a unique fixed point $\tilde{x}\in X$, and for any $x\in X$, the sequence
$$x,T(x),T^2(x),\cdots,T^k(x),\cdots$$
converges to the fixed point $\tilde{x}$; that is,
$$\lim_{k\to\infty}T^k(x)=\tilde{x}. \tag{3} \label{eq3}$$
Proof. First, we show the uniqueness of the fixed point. Suppose that there exist two distinct fixed points $x,y\in X$
such that $x=T(x),\ y=T(y),\ \text{and }x\neq y$
. Then, since $T$
is a contraction ($\alpha\in (0,1)$
), we have
$$d\left(T(x),T(y)\right)\leq\alpha d\left(x,y\right)<d\left(x,y\right). \tag{4} \label{eq4}$$
However, for these two fixed points, $d\left(T(x),T(y)\right)=d\left(x,y\right)$
, which contradicts Equation $\eqref{eq4}$
. Therefore, the fixed point must be unique.
Next, we proceed to prove the existence of the unique fixed point. Let $x_0\in X$
be an arbitrary point, and in general, $T(x_0)\neq x_0$
. We construct a sequence of points $\{x_k\}\subset X$ recursively by
$$x_{k+1}=T(x_k)\text{ or }x_{k}=T^k(x_0),\quad\text{for }k=0,1,2,\,\cdots. \tag{5} \label{eq5}$$
In this way, our goal reduces to showing that $\{x_k\}$
is a Cauchy sequence
in $X$
Suppose $m_1$ and $m_2$ are integers with $0<m_1<m_2$; then it follows that
\begin{align} d\left(x_{m_1},x_{m_2}\right)&=d\left(T(x_{m_1-1}),T(x_{m_2-1})\right)\\ &\leq\alpha d\left(x_{m_1-1},x_{m_2-1}\right)\\ &\leq\alpha^2 d\left(x_{m_1-2},x_{m_2-2}\right)\\ &\leq\cdots\leq\alpha^{r} d\left(x_{m_1-r},x_{m_2-r}\right),\quad 1\leq r\leq m_1. \tag{6} \label{eq6} \end{align}
Choosing $r=m_1-1$
, Equation $\eqref{eq6}$
becomes
\begin{align} d\left(x_{m_1},x_{m_2}\right)&\leq\alpha^{m_1-1} d\left(x_1,x_{m_2-(m_1-1)}\right)\\ &=\alpha^{m_1-1} d\left(T(x_0),T(x_{m_2-m_1})\right)\\ &\leq\alpha^{m_1} d\left(x_0,x_{m_2-m_1}\right). \tag{7} \label{eq7} \end{align}
By repeatedly applying the triangle inequality
, we get
\begin{align} d\left(x_0,x_{m_2-m_1}\right)&\leq d\left(x_0,x_1\right)+d\left(x_1,x_{m_2-m_1}\right)\\ &\leq d\left(x_0,x_1\right)+\cdots+d\left(x_{m_2-m_1-1},x_{m_2-m_1}\right)\\ &\leq d\left(x_0,x_1\right)+\alpha d\left(x_0,x_1\right)+\cdots+\alpha^{m_2-m_1-1}d\left(x_0,x_1\right). \tag{8} \label{eq8} \end{align}
Note that the last inequality in Equation $\eqref{eq8}$ uses the bound $d\left(x_k,x_{k+1}\right)\leq\alpha^k d\left(x_0,x_1\right)$, which is Equation $\eqref{eq7}$ with $m_2=m_1+1$. Combining Equations $\eqref{eq7}$ and $\eqref{eq8}$, for any $0\leq m_1<m_2$ we obtain
\begin{align} d\left(x_{m_1},x_{m_2}\right)&\leq\alpha^{m_1}\left[d\left(x_0,x_1\right)+\alpha d\left(x_0,x_1\right)+\cdots+\alpha^{m_2-m_1-1}d\left(x_0,x_1\right)\right]\\ &=\alpha^{m_1}d\left(x_0,x_1\right)\frac{1-\alpha^{m_2-m_1}}{1-\alpha}\leq\alpha^{m_1}d\left(x_0,x_1\right)\frac{1-\alpha^{m_2}}{1-\alpha}. \tag{9} \label{eq9} \end{align}
Since the sequence $\{\alpha^k\}$
converges to zero (with $\alpha\in (0,1)$
), for any $\varepsilon>0$
, there exists $N\in\mathbb{N}$
such that for all $k>N$
, $\alpha^k<\varepsilon (1-\alpha)/d\left(x_0,x_1\right)$ (assuming $x_1\neq x_0$; otherwise $x_0$ itself is already the fixed point)
. Then, for $m>n>N$
, we have
$$d\left(x_n,x_m\right)<\frac{\alpha^n}{1-\alpha}d\left(x_0,x_1\right)<\varepsilon. \tag{10} \label{eq10}$$
Thus, $\{x_k\}\subset X$
is a Cauchy sequence with a limit $\tilde{x}\in X$
, by the completeness of the metric space $(X,d)$
. It remains to show that this limit $\tilde{x}$ is indeed the fixed point of the contraction $T$. Since $T$ is continuous,
$$T(\tilde{x})=T\left(\lim_{k\to\infty}x_k\right)=\lim_{k\to\infty}T(x_k)=\lim_{k\to\infty}x_{k+1}=\tilde{x}, \tag{11} \label{eq11}$$
so the limit $\tilde{x}$ satisfies $\tilde{x}=T(\tilde{x})$ and hence is the unique fixed point of the mapping $T$. This also establishes Equation $\eqref{eq3}$, completing the proof.
From the above proof, letting $m\to\infty$ in Equation $\eqref{eq9}$ (and using the continuity of the metric) yields the a priori error bound associated with the iterative process given by the Contraction Theorem. That is,
$$e_n=\lim_{m\to\infty}d\left(x_n,x_m\right)=d\left(x_n,\tilde{x}\right)\leq\frac{\alpha^n}{1-\alpha}d\left(x_0,T(x_0)\right). \tag{12} \label{eq12}$$
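The bound in Equation $\eqref{eq12}$ can be checked numerically. The minimal sketch below does so for a simple affine contraction; the map $T(x)=\tfrac{1}{2}x+1$ (with $\alpha=\tfrac{1}{2}$ and fixed point $\tilde{x}=2$) is an illustrative assumption, not part of the proof above.

```python
# Numeric check of the a priori bound (12) for an illustrative affine
# contraction T(x) = 0.5 * x + 1 on R, with alpha = 0.5 and fixed point 2.
T = lambda x: 0.5 * x + 1
alpha, x_fixed = 0.5, 2.0

x0 = 10.0
d0 = abs(x0 - T(x0))  # d(x_0, T(x_0))
x = x0
for n in range(8):
    true_error = abs(x - x_fixed)
    bound = alpha**n / (1 - alpha) * d0
    print(f"n={n}: error={true_error:.3e}, a priori bound={bound:.3e}")
    x = T(x)
```

For this affine map the bound happens to be attained with equality at every step; for general contractions it is only an upper estimate.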
Applications of the Contraction Theorem #
The Banach Fixed-Point Theorem above gives sufficient conditions for the existence and uniqueness of a fixed point, and thus provides the foundation for justifying the iterative solution of a broad class of equations. Below, we discuss in detail specific applications of this theorem to finding the roots of (nonlinear) algebraic and differential equations.
Existence and Uniqueness of Solutions to Nonlinear Algebraic Equations #
Let $f: [a,b]\mapsto [a,b]$
be a mapping, differentiable on the open interval $(a,b)$
, and suppose that for all $x\in (a,b)$
, $|f^\prime(x)|\leq\alpha<1$
. By the Mean Value Theorem
, for any $x,y\in [a,b]$
, there exists some $\xi$
between $x$
and $y$
such that
$$f(x)-f(y)=f^\prime(\xi)(x-y). \tag{13} \label{eq13}$$
We equip the interval $[a,b]$ with the metric $d\left(x,y\right)=|x-y|$, under which it is a complete metric space (a closed subset of $\mathbb{R}$). Then, the inequality
$$|f(x)-f(y)|=|f^\prime(\xi)||x-y|\leq\alpha d\left(x,y\right) \tag{14} \label{eq14}$$
shows that $f$
is a contraction in $[a,b]$
. Based on the Contraction Theorem (Theorem 1), there exists a unique fixed point $\tilde{x}\in [a,b]$
, i.e., $f(\tilde{x})=\tilde{x}$
.
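As a concrete illustration, take $f(x)=\cos x$ on $[0,1]$: $\cos$ maps $[0,1]$ into $[\cos 1,1]\subset[0,1]$, and $|f^\prime(x)|=|\sin x|\leq\sin 1\approx 0.84<1$ there, so $f$ is a contraction. Hence the iteration $x_{k+1}=\cos x_k$ converges, for any starting point in $[0,1]$, to the unique solution of $\cos x=x$ (approximately $0.739$).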
Let $F: [a,b]\mapsto\mathbb{R}$
be twice continuously differentiable, and suppose $F(\tilde{x})=0$
for some $\tilde{x}\in (a,b)$
with $F^\prime(\tilde{x})\neq 0$
. For a nonlinear equation $F(x)=0$
, the Newton-Raphson method
provides a means of linearization to iteratively approximate its roots. The Taylor expansion of $F(x)$
about a point $x_0$
is given by
$$F(x)=F(x_0)+F^\prime(x_0)(x-x_0)+\frac{F^{\prime\prime}(x_0)}{2!}(x-x_0)^2+\mathcal{O}\left((x-x_0)^3\right). \tag{15} \label{eq15}$$
Keeping only the constant and linear terms and setting the resulting approximation to zero, we derive the iterative formula of Newton’s method, i.e.,
$$x_{k+1}=x_k-\frac{F(x_k)}{F^\prime(x_k)}:=f(x_k),\quad k=0,1,\cdots. \tag{16} \label{eq16}$$
To verify the convergence of the Newton–Raphson method, two conditions must be checked: invariance of interval and contraction. Both necessitate detailed analysis of the function $f(x)$
, whose derivative is given by
$$f^\prime(x)=\frac{\mathrm{d}}{\mathrm{d}x}\left(x-\frac{F(x)}{F^\prime(x)}\right)=\frac{F(x)F^{\prime\prime}(x)}{\left[F^\prime(x)\right]^2}. \tag{17} \label{eq17}$$
In a small neighborhood $I=[\tilde{x}-\delta,\tilde{x}+\delta]\subset [a,b]$ around the solution $\tilde{x}$, the continuity of $F^\prime$ and $F^{\prime\prime}$, together with $F^\prime(\tilde{x})\neq 0$, guarantees constants $m>0$ and $M>0$ such that $|F^\prime(x)|\geq m$ and $|F^{\prime\prime}(x)|\leq M$ for all $x\in I$
. Again, with the Mean Value Theorem, for any $x\in I$
, there exists some $\xi$
such that
$$|F(x)-F(\tilde{x})|=|F^\prime(\xi)|\cdot|x-\tilde{x}|\leq L|x-\tilde{x}|\leq L\delta,\quad L=\max_{x\in I}|F^\prime(x)|. \tag{18} \label{eq18}$$
In this way, since $F(\tilde{x})=0$ implies $|F(x)|\leq L\delta$ on $I$, the derivative $|f^\prime(x)|$ can be bounded based on Equation $\eqref{eq17}$:
$$|f^\prime(x)|\leq\frac{ML\delta}{m^2}. \tag{19} \label{eq19}$$
For sufficiently small $\delta$, specifically $\delta<m^2/(ML)$, we have $|f^\prime(x)|<1$. Therefore, $f$ is a contraction on $I$.
We also need $f$
to map $I$
to itself. For any $x\in I$ (i.e., $|x-\tilde{x}|\leq\delta$), expanding $F(x)$ about $\tilde{x}$ and using $F(\tilde{x})=0$ gives $F(x)=F^\prime(\tilde{x})(x-\tilde{x})+\frac{1}{2}F^{\prime\prime}(\xi)(x-\tilde{x})^2$ for some $\xi$ between $x$ and $\tilde{x}$, so that
\begin{align} \left|f(x)-\tilde{x}\right|&=\left|x-\tilde{x}-\frac{F(x)}{F^\prime(x)}\right|\\ &=\left|x-\tilde{x}-\frac{F^\prime(\tilde{x})(x-\tilde{x})+\frac{1}{2}F^{\prime\prime}(\xi)(x-\tilde{x})^2}{F^\prime(x)}\right|\\ &\leq |x-\tilde{x}|\cdot \left|1-\frac{F^\prime(\tilde{x})}{F^\prime(x)}\right|+\frac{1}{2}\cdot\left|\frac{F^{\prime\prime}(\xi)}{F^\prime(x)}\right|\left(x-\tilde{x}\right)^2\\ &\leq \epsilon\delta+\frac{M\delta^2}{2m}, \tag{20} \label{eq20} \end{align}
where $\epsilon:=\max_{x\in I}\left|1-F^\prime(\tilde{x})/F^\prime(x)\right|$, which satisfies $\epsilon<1$ for sufficiently small $\delta$ by the continuity of $F^\prime$. If we choose $\delta\leq 2m(1-\epsilon)/M$, then
$$\left|f(x)-\tilde{x}\right|\leq\delta\Longrightarrow f(I)\subset I, \tag{21} \label{eq21}$$
meaning that $f$
satisfies the invariance of interval condition on $I=[\tilde{x}-\delta,\tilde{x}+\delta]$
. In conclusion, if
$$\delta<\min\left\{\frac{m^2}{ML},\frac{2m(1-\epsilon)}{M}\right\},$$
the invariance of interval and contraction conditions (Equations $\eqref{eq19}$ and $\eqref{eq21}$) are satisfied simultaneously, and Newton’s method converges to the fixed point of $f$, i.e., the root $\tilde{x}$ of the original nonlinear equation.
It is worth noting that our proof of the convergence of Newton’s method was established within a small neighborhood $[\tilde{x}-\delta,\tilde{x}+\delta]$ around the fixed point, rather than over the entire space. This highlights the local nature of the Newton-Raphson method: the initial point $x_0$ must be sufficiently close to the solution $\tilde{x}$; otherwise, the iteration may diverge.
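A minimal Python sketch of the iteration in Equation $\eqref{eq16}$ is given below; the test function $F(x)=x^2-2$, its derivative, and the starting point are illustrative assumptions chosen to exhibit the fast local convergence toward $\sqrt{2}$.

```python
def newton(F, dF, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration x_{k+1} = x_k - F(x_k) / F'(x_k)."""
    x = x0
    for k in range(max_iter):
        step = F(x) / dF(x)
        x -= step
        if abs(step) < tol:
            return x, k + 1
    return x, max_iter

# Illustrative problem (an assumption for demonstration): F(x) = x^2 - 2,
# whose positive root is sqrt(2) ~ 1.4142135623...
root, iters = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)
print(root, iters)

# Starting far from the root, or near a point where F'(x) vanishes, may make
# the iteration stall or diverge, reflecting the local nature discussed above.
```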
Existence and Uniqueness of Solutions to Nonlinear ODEs #
Consider a nonlinear ordinary differential equation (ODE) of the form:
$$\frac{\mathrm{d}q(t)}{\mathrm{d}t}=F\left[t,q(t)\right],\quad 0<t\leq T, \tag{22} \label{eq22}$$
where $q(t)\in\mathcal{C}^1\left[0,T\right]$
and $q(0)=q_0$
. The function $F(t,q)$
is continuous in $t$
and $q$
, and is Lipschitz continuous in the variable $q$
. That is, there exists a constant $M>0$
such that
$$\left|F(t,q_1)-F(t,q_2)\right|\leq M\left|q_1-q_2\right|,\quad\forall t\in [0,T]. \tag{23} \label{eq23}$$
Moreover, since $F(t,q)$ is continuous, it is bounded on any closed, bounded set containing $(0,q_0)$; that is, there exists a constant $\tau>0$ such that $\left|F(t,q)\right|\leq\tau$ on such a set.
Choose a time interval $[0,T]$ such that $TM<1$, and consider the domain defined by $t\in[0,T]$ and $|q-q_0|\leq\tau T$ (a small neighborhood around $q_0$
). Let $\tilde{\mathcal{C}}$
denote the space of continuous functions on $[0,T]$
satisfying $|q(t)-q_0|\leq\tau T$
, equipped with the supremum distance: $d\left(\phi_1,\phi_2\right)=\max_{t\in [0,T]}\left|\phi_1(t)-\phi_2(t)\right|$
. Note that the space $\tilde{\mathcal{C}}\subset\mathcal{C}[0,T]$
is a complete metric space.
Distinguishing between the Spaces $\mathcal{C}$ and $\tilde{\mathcal{C}}$.
$\mathcal{C}[0,T]$: The complete metric space of all continuous functions on the interval $[0,T]$;
$\tilde{\mathcal{C}}$: A closed subspace of $\mathcal{C}[0,T]$, consisting of continuous functions satisfying the uniform bound $|q(t)-q_0|\leq\tau T$.
Integrating both sides of the original ODE from $0$ to $t$ and using the initial condition $q(0)=q_0$, we obtain the equivalent integral equation, which takes the form of a fixed-point problem:
$$q(t)=q_0+\int_0^t F\left[\xi,q(\xi)\right]\,\mathrm{d}\xi. \tag{24} \label{eq24}$$
Define an iterative sequence:
$$q_0(t)=q_0,\quad q_{k+1}(t)=q_0+\int_0^t F\left[\xi,q_k(\xi)\right]\,\mathrm{d}\xi,\quad k=0,1,2,\cdots. \tag{25} \label{eq25}$$
This iterative process for solving such nonlinear differential equations is known as Picard’s method
. If the mapping is a contraction, the sequence will converge to the unique solution of the original differential equation.
Now we show that the Picard mapping $q\mapsto\phi$ defined below is indeed a contraction that maps $\tilde{\mathcal{C}}$ into itself. Define, for $q\in\tilde{\mathcal{C}}$ and $t\in[0,T]$
,
$$\phi(t)=q_0+\int_0^t F\left[\xi,q(\xi)\right]\,\mathrm{d}\xi, \tag{26} \label{eq26}$$
then,
$$\left|\phi(t)-q_0\right|=\left|\int_0^t F\left[\xi,q(\xi)\right]\,\mathrm{d}\xi\right|\leq\int_0^t\left|F\left[\xi,q(\xi)\right]\right|\,\mathrm{d}\xi\leq\tau T, \tag{27} \label{eq27}$$
so that $\phi(\tilde{\mathcal{C}})\subset\tilde{\mathcal{C}}$
. Furthermore, for any $q_1,q_2\in\tilde{\mathcal{C}}$ with images $\phi_1,\phi_2$ under this mapping, since $F(t,q)$ satisfies the Lipschitz condition in $q$, we have
\begin{align} \left|\phi_1(t)-\phi_2(t)\right|&\leq\int_0^t\left|F\left[\xi,q_1(\xi)\right]-F\left[\xi,q_2(\xi)\right]\right|\,\mathrm{d}\xi\\ &\leq\int_0^t M\left|q_1(\xi)-q_2(\xi)\right|\,\mathrm{d}\xi. \tag{28} \label{eq28} \end{align}
Taking the supremum over $t\in [0,T]$
, we obtain
$$d\left(\phi_1,\phi_2\right)\leq MT\cdot d\left(q_1,q_2\right). \tag{29} \label{eq29}$$
Since $MT<1$
, the mapping $\phi$
is a contraction on the complete metric space $\tilde{\mathcal{C}}$
. By the Banach Fixed-Point Theorem, the integral equation (and hence the original ODE) has a unique solution in $\tilde{\mathcal{C}}$
.
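To close, the sketch below carries out the Picard iteration of Equation $\eqref{eq25}$ symbolically with SymPy; the test problem $\mathrm{d}q/\mathrm{d}t=q$, $q(0)=1$ (exact solution $e^t$) is an assumption chosen purely for illustration.

```python
import sympy as sp

t, xi = sp.symbols('t xi')

def picard_iterates(F, q_init, n_iter=4):
    """Picard iteration: q_{k+1}(t) = q(0) + integral_0^t F(xi, q_k(xi)) dxi."""
    q = sp.sympify(q_init)          # q_0(t) is identically the initial value
    iterates = [q]
    for _ in range(n_iter):
        q = q_init + sp.integrate(F(xi, q.subs(t, xi)), (xi, 0, t))
        iterates.append(sp.expand(q))
    return iterates

# Illustrative problem (an assumption): dq/dt = q, q(0) = 1, exact solution exp(t).
for k, qk in enumerate(picard_iterates(lambda s, q: q, 1)):
    print(f"q_{k}(t) =", qk)
# The iterates 1, 1 + t, 1 + t + t**2/2, ... are the Taylor partial sums of exp(t).
```

For this example, each additional iterate contributes the next Taylor term of the exact solution, mirroring the convergence guaranteed by the contraction argument when $MT<1$.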
Last modified on 2025-07-28