Kantorovich theorem

{{Short description|About the convergence of Newton's method}}

The Kantorovich theorem, or Newton–Kantorovich theorem, is a mathematical statement on the semi-local convergence of Newton's method. It was first stated by Leonid Kantorovich in 1948.{{cite book |first=P. |last=Deuflhard |title=Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms |publisher=Springer |location=Berlin |year=2004 |isbn=3-540-21099-7 |series=Springer Series in Computational Mathematics |volume=35 }}{{cite book |last=Zeidler |first=E. |year=1985 |title=Nonlinear Functional Analysis and its Applications: Part 1: Fixed-Point Theorems |location=New York |publisher=Springer |isbn=0-387-96499-1 }} It is similar to the form of the Banach fixed-point theorem, although it states existence and uniqueness of a zero rather than a fixed point.{{cite book |first1=John E. |last1=Dennis |author-link=John E. Dennis |first2=Robert B. |last2=Schnabel |author-link2=Robert B. Schnabel |chapter=The Kantorovich and Contractive Mapping Theorems |title=Numerical Methods for Unconstrained Optimization and Nonlinear Equations |location=Englewood Cliffs |publisher=Prentice-Hall |year=1983 |pages=92–94 |isbn=0-13-627216-9 |chapter-url=https://books.google.com/books?id=ksvJTtJCx9cC&pg=PA92 }}

Newton's method constructs a sequence of points that under certain conditions will converge to a solution x of an equation f(x)=0 or a vector solution of a system of equation F(x)=0. The Kantorovich theorem gives conditions on the initial point of this sequence. If those conditions are satisfied then a solution exists close to the initial point and the sequence converges to that point.

Assumptions

Let X\subset\R^n be an open subset and F:X \subset \R^n \to\R^n a differentiable function with a Jacobian F^{\prime}(\mathbf x) that is locally Lipschitz continuous (for instance if F is twice differentiable). That is, it is assumed that for any x \in X there is an open subset U\subset X such that x \in U and there exists a constant L>0 such that for any \mathbf x,\mathbf y\in U

:\|F'(\mathbf x)-F'(\mathbf y)\|\le L\;\|\mathbf x-\mathbf y\|

holds. The norm on the left is the operator norm. In other words, for any vector \mathbf v\in\R^n the inequality

:\|F'(\mathbf x)(\mathbf v)-F'(\mathbf y)(\mathbf v)\|\le L\;\|\mathbf x-\mathbf y\|\,\|\mathbf v\|

must hold.

Now choose any initial point \mathbf x_0\in X. Assume that F'(\mathbf x_0) is invertible and construct the Newton step \mathbf h_0=-F'(\mathbf x_0)^{-1}F(\mathbf x_0).

The next assumption is that not only the next point \mathbf x_1=\mathbf x_0+\mathbf h_0 but the entire ball B(\mathbf x_1,\|\mathbf h_0\|) is contained inside the set X. Let M be the Lipschitz constant for the Jacobian over this ball (assuming it exists).

As a last preparation, construct recursively, as long as it is possible, the sequences (\mathbf x_k)_k, (\mathbf h_k)_k, (\alpha_k)_k according to

:\begin{alignat}{2}

\mathbf h_k&=-F'(\mathbf x_k)^{-1}F(\mathbf x_k)\\[0.4em]

\alpha_k&=M\,\|F'(\mathbf x_k)^{-1}\|\,\|\mathbf h_k\|\\[0.4em]

\mathbf x_{k+1}&=\mathbf x_k+\mathbf h_k.

\end{alignat}

Statement

Now if \alpha_0\le\tfrac12 then

  1. a solution \mathbf x^* of F(\mathbf x^*)=0 exists inside the closed ball \bar B(\mathbf x_1,\|\mathbf h_0\|) and
  2. the Newton iteration starting in \mathbf x_0 converges to \mathbf x^* with at least linear order of convergence.

A statement that is more precise but slightly more difficult to prove uses the roots t^\ast\le t^{**} of the quadratic polynomial

:

p(t)

=\left(\tfrac12L\|F'(\mathbf x_0)^{-1}\|^{-1}\right)t^2

-t+\|\mathbf h_0\|

,

:t^{\ast/**}=\frac{2\|\mathbf h_0\|}{1\pm\sqrt{1-2\alpha_0}}

and their ratio

:

\theta

=\frac{t^*}{t^{**}}

=\frac{1-\sqrt{1-2\alpha_0}}{1+\sqrt{1-2\alpha_0}}.

Then

  1. a solution \mathbf x^* exists inside the closed ball \bar B(\mathbf x_1,\theta\|\mathbf h_0\|)\subset\bar B(\mathbf x_0,t^*)
  2. it is unique inside the bigger ball B(\mathbf x_0,t^{*\ast})
  3. and the convergence to the solution of F is dominated by the convergence of the Newton iteration of the quadratic polynomial p(t) towards its smallest root t^\ast,{{cite journal |first=J. M. |last=Ortega |title=The Newton-Kantorovich Theorem |journal=Amer. Math. Monthly |volume=75 |year=1968 |issue=6 |pages=658–660 |jstor=2313800 |doi=10.2307/2313800}} if t_0=0,\,t_{k+1}=t_k-\tfrac{p(t_k)}{p'(t_k)}, then
  4. :\|\mathbf x_{k+p}-\mathbf x_k\|\le t_{k+p}-t_k.
  5. The quadratic convergence is obtained from the error estimate{{cite journal |first1=W. B. |last1=Gragg |first2=R. A. |last2=Tapia |year=1974 |title=Optimal Error Bounds for the Newton-Kantorovich Theorem |journal=SIAM Journal on Numerical Analysis |volume=11 |issue=1 |pages=10–13 |jstor=2156425 |doi=10.1137/0711002|bibcode=1974SJNA...11...10G }}
  6. :

\|\mathbf x_{n+1}-\mathbf x^*\|

\le \theta^{2^n}\|\mathbf x_{n+1}-\mathbf x_n\|

\le\frac{\theta^{2^n}}{2^n}\|\mathbf h_0\|.

Corollary

In 1986, Yamamoto proved that the error evaluations of the Newton method such as Doring (1969), Ostrowski (1971, 1973),{{cite journal |first=A. M. |last=Ostrowski |title=La method de Newton dans les espaces de Banach |journal=C. R. Acad. Sci. Paris |volume=27 |issue=A |year=1971 |pages=1251–1253 }}{{cite book |first=A. M. |last=Ostrowski |title=Solution of Equations in Euclidean and Banach Spaces |publisher=Academic Press |location=New York |year=1973 |isbn=0-12-530260-6 }} Gragg-Tapia (1974), Potra-Ptak (1980),{{cite journal |first1=F. A. |last1=Potra |first2=V. |last2=Ptak |title=Sharp error bounds for Newton's process |journal=Numer. Math. |volume=34 |year=1980 |pages=63–72 |doi=10.1007/BF01463998 }} Miel (1981),{{cite journal |first=G. J. |last=Miel |title=An updated version of the Kantorovich theorem for Newton's method |journal=Computing |volume=27 |issue=3 |year=1981 |pages=237–244 |doi=10.1007/BF02237981 }} Potra (1984),{{cite journal |first=F. A. |last=Potra |title=On the a posteriori error estimates for Newton's method |journal=Beiträge zur Numerische Mathematik |volume=12 |year=1984 |pages=125–138 }} can be derived from the Kantorovich theorem.{{cite journal |last=Yamamoto |first=T. |year=1986 |title=A method for finding sharp error bounds for Newton's method under the Kantorovich assumptions |journal=Numerische Mathematik |volume=49 |issue=2–3 |pages=203–220 |doi=10.1007/BF01389624 }}

Generalizations

There is a q-analog for the Kantorovich theorem.{{cite journal |last1=Rajkovic |first1=P. M. |last2=Stankovic |first2=M. S. |last3=Marinkovic |first3=S. D. |year=2003 |title=On q-iterative methods for solving equations and systems |journal=Novi Sad J. Math |volume=33 |issue=2 |pages=127–137 }}{{cite journal |last1=Rajković |first1=P. M. |last2=Marinković |first2=S. D. |last3=Stanković |first3=M. S. |year=2005 |title=On q-Newton–Kantorovich method for solving systems of equations |journal=Applied Mathematics and Computation |volume=168 |issue=2 |pages=1432–1448 |doi=10.1016/j.amc.2004.10.035 }} For other generalizations/variations, see Ortega & Rheinboldt (1970).{{cite book |last1=Ortega |first1=J. M. |last2=Rheinboldt |first2=W. C. |year=1970 |title=Iterative Solution of Nonlinear Equations in Several Variables |publisher=SIAM |oclc=95021 }}

Applications

Oishi and Tanabe claimed that the Kantorovich theorem can be applied to obtain reliable solutions of linear programming.{{cite journal |last1=Oishi |first1=S. |last2=Tanabe |first2=K. |year=2009 |title=Numerical Inclusion of Optimum Point for Linear Programming |journal=JSIAM Letters |volume=1 |pages=5–8 |doi=10.14495/jsiaml.1.5 |doi-access=free }}

References

{{reflist}}

Further reading

  • John H. Hubbard and Barbara Burke Hubbard: Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach, Matrix Editions, {{ISBN|978-0-9715766-3-6}} ([http://matrixeditions.com/UnifiedApproachSamples.html preview of 3. edition and sample material including Kant.-thm.])
  • {{cite book |first=Tetsuro |last=Yamamoto |chapter=Historical Developments in Convergence Analysis for Newton's and Newton-like Methods |pages=241–263 |editor-first=C. |editor-last=Brezinski |editor2-first=L. |editor2-last=Wuytack |title=Numerical Analysis : Historical Developments in the 20th Century |publisher=North-Holland |year=2001 |isbn=0-444-50617-9 }}

Category:Functional analysis

Category:Numerical analysis

Category:Theorems in mathematical analysis

Category:Optimization in vector spaces

Category:Optimization algorithms and methods