Implicit function theorem

{{short description|On converting relations to functions of several real variables}}

{{Calculus |expanded=multivariable}}

In multivariable calculus, the implicit function theorem{{efn|Also called Dini's theorem by the Pisan school in Italy. In the English-language literature, Dini's theorem is a different theorem in mathematical analysis.}} is a tool that allows relations to be converted to functions of several real variables. It does so by representing the relation as the graph of a function. There may not be a single function whose graph can represent the entire relation, but there may be such a function on a restriction of the domain of the relation. The implicit function theorem gives a sufficient condition to ensure that there is such a function.

More precisely, given a system of {{mvar|m}} equations {{math|1=f<sub>i</sub>{{space|hair}}(x<sub>1</sub>, ..., x<sub>n</sub>, y<sub>1</sub>, ..., y<sub>m</sub>) = 0, i = 1, ..., m}} (often abbreviated into {{math|1=F(x, y) = 0}}), the theorem states that, under a mild condition on the partial derivatives (with respect to each {{math|y<sub>i</sub>}}) at a point, the {{mvar|m}} variables {{math|y<sub>i</sub>}} are differentiable functions of the {{math|x<sub>j</sub>}} in some neighborhood of the point. As these functions generally cannot be expressed in closed form, they are implicitly defined by the equations, and this motivated the name of the theorem.{{Cite book |last=Chiang |first=Alpha C. |author-link=Alpha Chiang |title=Fundamental Methods of Mathematical Economics |publisher=McGraw-Hill |edition=3rd |year=1984 |pages=[https://archive.org/details/fundamentalmetho0000chia_b4p1/page/204 204–206] |isbn=0-07-010813-7 |url=https://archive.org/details/fundamentalmetho0000chia_b4p1/page/204 }}

In other words, under a mild condition on the partial derivatives, the set of zeros of a system of equations is locally the graph of a function.

== History ==

Augustin-Louis Cauchy (1789–1857) is credited with the first rigorous form of the implicit function theorem. Ulisse Dini (1845–1918) generalized the real-variable version of the implicit function theorem to the context of functions of any number of real variables.{{cite book |first1=Steven |last1=Krantz |first2=Harold |last2=Parks |title=The Implicit Function Theorem |series=Modern Birkhauser Classics |publisher=Birkhauser |year=2003 |isbn=0-8176-4285-4 |url=https://archive.org/details/implicitfunction0000kran |url-access=registration }}

== Two-variable case ==

Let f:\R^2 \to \R be a continuously differentiable function defining the implicit equation of a curve f(x,y) = 0 . Let (x_0, y_0) be a point on the curve, that is, a point such that f(x_0, y_0)=0. In this simple case, the implicit function theorem can be stated as follows:

{{math theorem|math_statement=If {{tmath|f(x,y)}} is a function that is continuously differentiable in a neighbourhood of the point {{tmath|(x_0,y_0)}}, and

\frac{\partial f}{ \partial y} (x_0, y_0) \neq 0, then there exists a unique differentiable function {{tmath|\varphi}} such that {{tmath|1=y_0=\varphi(x_0)}} and {{tmath|1=f(x, \varphi(x))=0}} in a neighbourhood of {{tmath|x_0}}.}}

Proof. By differentiating the equation {{tmath|1=f(x, \varphi(x))=0}}, one gets

\frac{\partial f}{ \partial x}(x, \varphi(x))+\varphi'(x)\, \frac{\partial f}{ \partial y}(x, \varphi(x))=0,

and thus

\varphi'(x)=-\frac{\frac{\partial f}{ \partial x}(x, \varphi(x))}{\frac{\partial f}{ \partial y}(x, \varphi(x))}.

This gives an ordinary differential equation for {{tmath|\varphi}}, with the initial condition {{tmath|1=\varphi(x_0) = y_0}}.

Since \frac{\partial f}{ \partial y} (x_0, y_0) \neq 0, the right-hand side of the differential equation is continuous. Hence the Peano existence theorem applies, so there is a (possibly non-unique) solution. To see why \varphi is unique, note that the function g_x(y)=f(x,y) is strictly monotone in y in a neighborhood of (x_0,y_0) (as \frac{\partial f}{ \partial y} (x_0, y_0) \neq 0), and thus injective there. If \varphi and \phi are both solutions of the differential equation, then g_x(\varphi(x))=g_x(\phi(x))=0, and by injectivity \varphi(x)=\phi(x).
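The formula \varphi'(x)=-\frac{\partial f/\partial x}{\partial f/\partial y} can be checked numerically. The following sketch (an illustration, not drawn from the cited sources) uses the hypothetical example f(x, y) = y + e^y - x, chosen because \partial f/\partial y = 1 + e^y is never zero, so the theorem applies at every point of the curve even though \varphi has no elementary closed form:

```python
import math

def phi(x, tol=1e-14):
    """Solve f(x, y) = y + exp(y) - x = 0 for y by Newton iteration."""
    y = 0.0
    for _ in range(60):
        step = (y + math.exp(y) - x) / (1.0 + math.exp(y))
        y -= step
        if abs(step) < tol:
            break
    return y

x0 = 2.0
y0 = phi(x0)

# The proof's formula: phi'(x) = -f_x / f_y = 1 / (1 + exp(phi(x))).
slope = 1.0 / (1.0 + math.exp(y0))

# Central finite difference of phi as an independent check.
h = 1e-6
numeric = (phi(x0 + h) - phi(x0 - h)) / (2.0 * h)

print(abs(slope - numeric) < 1e-6)  # the two agree
```

The point of the sketch is that \varphi itself is only available through a root-finding step, yet its derivative comes for free from the partial derivatives of f.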

== First example ==

[[Image:Implicit circle.svg|thumb|The unit circle: around point {{math|A}}, part of the circle is the graph of some function of {{mvar|x}}, while around {{math|B}}, there is no function of {{mvar|x}} with the circle as its graph. This is exactly what the implicit function theorem asserts in this case.]]

If we define the function {{math|1=f(x, y) = x<sup>2</sup> + y<sup>2</sup>}}, then the equation {{math|1=f(x, y) = 1}} cuts out the unit circle as the level set {{math|1={(x, y) {{!}} f(x, y) = 1}}}. There is no way to represent the unit circle as the graph of a function of one variable {{math|1=y = g(x)}} because for each choice of {{math|x ∈ (−1, 1)}}, there are two choices of {{mvar|y}}, namely \pm\sqrt{1-x^2}.

However, it is possible to represent part of the circle as the graph of a function of one variable. If we let g_1(x) = \sqrt{1-x^2} for {{math|−1 ≤ x ≤ 1}}, then the graph of {{math|1=y = g<sub>1</sub>(x)}} provides the upper half of the circle. Similarly, if g_2(x) = -\sqrt{1-x^2}, then the graph of {{math|1=y = g<sub>2</sub>(x)}} gives the lower half of the circle.

The purpose of the implicit function theorem is to tell us that functions like {{math|g<sub>1</sub>(x)}} and {{math|g<sub>2</sub>(x)}} exist under mild hypotheses, even in situations where we cannot write down explicit formulas for them. It guarantees that these functions are differentiable, and it applies even when {{math|f(x, y)}} is not given by an explicit formula.
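Although {{math|g<sub>1</sub>}} and {{math|g<sub>2</sub>}} have explicit formulas here, the way such implicit functions are evaluated in practice can be sketched numerically: given {{mvar|x}}, solve {{math|1=f(x, y) = 0}} for {{mvar|y}} iteratively, starting from a seed on the desired branch. A minimal illustration (hypothetical code, using Newton's method as the solver):

```python
import math

def f(x, y):
    return x * x + y * y - 1.0

def df_dy(x, y):
    return 2.0 * y

def solve_for_y(x, y_seed, tol=1e-12):
    """Newton iteration in y alone; works where df/dy != 0 near the root."""
    y = y_seed
    for _ in range(50):
        step = f(x, y) / df_dy(x, y)
        y -= step
        if abs(step) < tol:
            break
    return y

# A seed in the upper half-plane recovers g1(x) = sqrt(1 - x^2);
# a seed in the lower half-plane recovers g2(x) = -sqrt(1 - x^2).
x = 0.3
print(abs(solve_for_y(x, 1.0) - math.sqrt(1 - x * x)) < 1e-10)   # True
print(abs(solve_for_y(x, -1.0) + math.sqrt(1 - x * x)) < 1e-10)  # True
```

The choice of seed selects which branch of the relation the solver converges to, mirroring the theorem's statement that the function is only defined locally, near a chosen point on the curve.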

== Definitions ==

Let f: \R^{n+m} \to \R^m be a continuously differentiable function. We think of \R^{n+m} as the Cartesian product \R^n\times\R^m, and we write a point of this product as (\mathbf{x}, \mathbf{y}) = (x_1,\ldots, x_n, y_1, \ldots, y_m). Starting from the given function f, our goal is to construct a function g: \R^n \to \R^m whose graph \{(\textbf{x}, g(\textbf{x}))\} is precisely the set of all (\textbf{x}, \textbf{y}) such that f(\textbf{x}, \textbf{y}) = \textbf{0}.

As noted above, this may not always be possible. We will therefore fix a point (\textbf{a}, \textbf{b}) = (a_1, \dots, a_n, b_1, \dots, b_m) which satisfies f(\textbf{a}, \textbf{b}) = \textbf{0}, and we will ask for a g that works near the point (\textbf{a}, \textbf{b}). In other words, we want an open set U \subset \R^n containing \textbf{a}, an open set V \subset \R^m containing \textbf{b}, and a function g : U \to V such that the graph of g satisfies the relation f = \textbf{0} on U\times V, and that no other points within U \times V do so. In symbols,

\{ (\mathbf{x}, g(\mathbf{x})) \mid \mathbf x \in U \} = \{ (\mathbf{x}, \mathbf{y})\in U \times V \mid f(\mathbf{x}, \mathbf{y}) = \mathbf{0} \}.

To state the implicit function theorem, we need the Jacobian matrix of f, which is the matrix of the partial derivatives of f. Abbreviating (a_1, \dots, a_n, b_1, \dots, b_m) to (\textbf{a}, \textbf{b}), the Jacobian matrix is

(Df)(\mathbf{a},\mathbf{b})

= \left[\begin{array}{ccc|ccc}

\frac{\partial f_1}{\partial x_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_1}{\partial x_n}(\mathbf{a},\mathbf{b}) &

\frac{\partial f_1}{\partial y_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_1}{\partial y_m}(\mathbf{a},\mathbf{b}) \\

\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\

\frac{\partial f_m}{\partial x_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_m}{\partial x_n}(\mathbf{a},\mathbf{b}) &

\frac{\partial f_m}{\partial y_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_m}{\partial y_m}(\mathbf{a},\mathbf{b})

\end{array}\right]

= \left[\begin{array}{c|c} X & Y \end{array}\right]

where X is the matrix of partial derivatives in the variables x_i and Y is the matrix of partial derivatives in the variables y_j. The implicit function theorem says that if Y is an invertible matrix, then there are U, V, and g as desired. Writing all the hypotheses together gives the following statement.

== Statement of the theorem ==

Let f: \R^{n+m} \to \R^m be a continuously differentiable function, and let \R^{n+m} have coordinates (\textbf{x}, \textbf{y}). Fix a point (\textbf{a}, \textbf{b}) = (a_1,\dots,a_n, b_1,\dots, b_m) with f(\textbf{a}, \textbf{b}) = \mathbf{0}, where \mathbf{0} \in \R^m is the zero vector. If the Jacobian matrix (this is the right-hand panel of the Jacobian matrix shown in the previous section):

J_{f, \mathbf{y}} (\mathbf{a}, \mathbf{b}) = \left [ \frac{\partial f_i}{\partial y_j} (\mathbf{a}, \mathbf{b}) \right ]

is invertible, then there exists an open set U \subset \R^n containing \textbf{a} and a unique function g: U \to \R^m such that {{nowrap|1=g(\mathbf{a}) = \mathbf{b}}} and {{nowrap|1=f(\mathbf{x}, g(\mathbf{x})) = \mathbf{0} ~ \text{for all} ~ \mathbf{x}\in U}}. Moreover, g is continuously differentiable and, denoting the left-hand panel of the Jacobian matrix shown in the previous section as:

J_{f, \mathbf{x}} (\mathbf{a}, \mathbf{b}) = \left [ \frac{\partial f_i}{\partial x_j} (\mathbf{a}, \mathbf{b}) \right ],

the Jacobian matrix of partial derivatives of g in U is given by the matrix product:{{Cite journal |first=Oswaldo |last=de Oliveira |title=The Implicit and Inverse Function Theorems: Easy Proofs |journal=Real Anal. Exchange |volume=39 |issue=1 |year=2013 |doi=10.14321/realanalexch.39.1.0207 |pages=214–216 |s2cid=118792515 |arxiv=1212.2066 }}

\left[\frac{\partial g_i}{\partial x_j} (\mathbf{x})\right]_{m\times n} =- \left [ J_{f, \mathbf{y}}(\mathbf{x}, g(\mathbf{x})) \right ]_{m \times m} ^{-1} \, \left [ J_{f, \mathbf{x}}(\mathbf{x}, g(\mathbf{x})) \right ]_{m \times n}

For a proof, see [[Inverse function theorem#Implicit function theorem]]. The two-variable case is detailed above.
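The matrix formula for the derivative of g can be verified on a small example. The following sketch (an illustrative system chosen for this purpose, not drawn from the references) takes {{math|1=n = 1}}, {{math|1=m = 2}}, where g happens to be known explicitly, and checks that the product −J<sub>f,y</sub><sup>−1</sup> J<sub>f,x</sub> reproduces the derivatives of g:

```python
# Illustrative system with n = 1, m = 2:
#   f1(x, y1, y2) = y1 + y2 - x
#   f2(x, y1, y2) = y1 - y2 - x**2
# Solving explicitly gives g(x) = ((x + x**2)/2, (x - x**2)/2), so the
# theorem's formula Dg = -J_y^{-1} J_x can be checked by hand.

def jacobian_blocks(x, y1, y2):
    # J_x: partials of (f1, f2) in x;  J_y: partials in (y1, y2).
    J_x = [[-1.0], [-2.0 * x]]
    J_y = [[1.0, 1.0], [1.0, -1.0]]
    return J_x, J_y

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[ m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det,  m[0][0] / det]]

x = 0.7
y1, y2 = (x + x * x) / 2.0, (x - x * x) / 2.0
J_x, J_y = jacobian_blocks(x, y1, y2)
J_y_inv = inv2(J_y)

# Dg = -J_y^{-1} J_x, a 2x1 matrix.
Dg = [[-(J_y_inv[i][0] * J_x[0][0] + J_y_inv[i][1] * J_x[1][0])]
      for i in range(2)]

# Explicit derivatives g'(x) = (1/2 + x, 1/2 - x) for comparison.
print(abs(Dg[0][0] - (0.5 + x)) < 1e-12, abs(Dg[1][0] - (0.5 - x)) < 1e-12)
```

In applications where g is not known explicitly, the same product still gives its Jacobian, which is the practical content of the formula.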

=== Higher derivatives ===

If, moreover, f is analytic or k times continuously differentiable in a neighborhood of (\textbf{a}, \textbf{b}), then one may choose U so that the same holds true for g inside U.{{Cite book |first1=K. | last1=Fritzsche |first2=H. |last2=Grauert |year=2002 |url=https://books.google.com/books?id=jSeRz36zXIMC&pg=PA34 |title=From Holomorphic Functions to Complex Manifolds |publisher=Springer |page=34 |isbn=9780387953953 }} In the analytic case, this is called the analytic implicit function theorem.

== The circle example ==

Let us go back to the example of the unit circle. In this case n = m = 1 and f(x,y) = x^2 + y^2 - 1. The matrix of partial derivatives is just a 1 × 2 matrix, given by

(Df)(a,b) = \begin{bmatrix} \dfrac{\partial f}{\partial x}(a,b) & \dfrac{\partial f}{\partial y}(a,b) \end{bmatrix} = \begin{bmatrix} 2a & 2b \end{bmatrix}

Thus, here, the {{math|Y}} in the statement of the theorem is just the number {{math|2b}}; the linear map defined by it is invertible if and only if {{math|b ≠ 0}}. By the implicit function theorem we see that we can locally write the circle in the form {{math|1=y = g(x)}} for all points where {{math|y ≠ 0}}. For {{math|(±1, 0)}} we run into trouble, as noted before. The implicit function theorem may still be applied to these two points, by writing {{mvar|x}} as a function of {{mvar|y}}, that is, x = h(y); now the graph of the function will be \left(h(y), y\right), since where {{math|1=b = 0}} we have {{math|1=a = ±1}}, and the conditions to locally express the function in this form are satisfied.

The implicit derivative of y with respect to x, and that of x with respect to y, can be found by taking the total differential of the implicit equation x^2+y^2-1=0:

2x\, dx+2y\, dy = 0,

giving

\frac{dy}{dx}=-\frac{x}{y}

and

\frac{dx}{dy} = -\frac{y}{x}.
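On the upper half of the circle these implicit derivatives can be checked against the explicit branch g_1(x) = \sqrt{1-x^2}; a brief numerical sketch (illustrative only, with an arbitrarily chosen sample point):

```python
import math

# On the upper half of the circle, y = g1(x) = sqrt(1 - x**2), so the
# implicit derivative dy/dx = -x/y can be compared with a finite
# difference of g1.
def g1(x):
    return math.sqrt(1.0 - x * x)

x = 0.5
y = g1(x)
implicit = -x / y  # dy/dx from the total differential

h = 1e-6
numeric = (g1(x + h) - g1(x - h)) / (2.0 * h)
print(abs(implicit - numeric) < 1e-8)  # True
```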

== Application: change of coordinates ==

Suppose we have an {{mvar|m}}-dimensional space, parametrised by a set of coordinates (x_1,\ldots,x_m). We can introduce a new coordinate system (x'_1,\ldots,x'_m) by supplying m continuously differentiable functions h_1,\ldots,h_m. These functions allow us to calculate the new coordinates (x'_1,\ldots,x'_m) of a point, given the point's old coordinates (x_1,\ldots,x_m), via x'_1=h_1(x_1,\ldots,x_m), \ldots, x'_m=h_m(x_1,\ldots,x_m). One might ask whether the reverse is possible: given the coordinates (x'_1,\ldots,x'_m), can we 'go back' and calculate the same point's original coordinates (x_1,\ldots,x_m)? The implicit function theorem provides an answer to this question. The (new and old) coordinates (x'_1,\ldots,x'_m, x_1,\ldots,x_m) are related by f = 0, with

f(x'_1,\ldots,x'_m,x_1,\ldots, x_m)=(h_1(x_1,\ldots, x_m)-x'_1,\ldots , h_m(x_1,\ldots, x_m)-x'_m).

Now the Jacobian matrix of f at a certain point (a, b) [ where a=(x'_1,\ldots,x'_m), b=(x_1,\ldots,x_m) ] is given by

(Df)(a,b) = \left [\begin{matrix}

-1 & \cdots & 0 \\

\vdots & \ddots & \vdots \\

0 & \cdots & -1

\end{matrix}\left|

\begin{matrix}

\frac{\partial h_1}{\partial x_1}(b) & \cdots & \frac{\partial h_1}{\partial x_m}(b)\\

\vdots & \ddots & \vdots\\

\frac{\partial h_m}{\partial x_1}(b) & \cdots & \frac{\partial h_m}{\partial x_m}(b)\\

\end{matrix} \right.\right] = [-I_m |J ].

where {{math|I<sub>m</sub>}} denotes the {{math|m × m}} identity matrix, and {{mvar|J}} is the {{math|m × m}} matrix of partial derivatives, evaluated at (a, b). (In the above, these blocks were denoted by X and Y. As it happens, in this particular application of the theorem, neither matrix depends on a.) The implicit function theorem now states that we can locally express (x_1,\ldots,x_m) as a function of (x'_1,\ldots,x'_m) if J is invertible. Demanding that J be invertible is equivalent to det J ≠ 0; thus we see that we can go back from the primed to the unprimed coordinates if the determinant of the Jacobian J is non-zero. This statement is also known as the inverse function theorem.

=== Example: polar coordinates ===

As a simple application of the above, consider the plane, parametrised by polar coordinates {{math|(R, θ)}}. We can go to a new coordinate system (Cartesian coordinates) by defining the functions {{math|1=x(R, θ) = R cos(θ)}} and {{math|1=y(R, θ) = R sin(θ)}}. This makes it possible, given any point {{math|(R, θ)}}, to find the corresponding Cartesian coordinates {{math|(x, y)}}. When can we go back and convert Cartesian into polar coordinates? By the previous example, it is sufficient to have {{math|1=det J ≠ 0}}, with

J =\begin{bmatrix}

\frac{\partial x(R,\theta)}{\partial R} & \frac{\partial x(R,\theta)}{\partial \theta} \\

\frac{\partial y(R,\theta)}{\partial R} & \frac{\partial y(R,\theta)}{\partial \theta} \\

\end{bmatrix}=

\begin{bmatrix}

\cos \theta & -R \sin \theta \\

\sin \theta & R \cos \theta

\end{bmatrix}.

Since {{math|1=det J = R}}, conversion back to polar coordinates is possible if {{math|1=R ≠ 0}}. So it remains to check the case {{math|1=R = 0}}. It is easy to see that in case {{math|1=R = 0}}, our coordinate transformation is not invertible: at the origin, the value of θ is not well-defined.
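The computation {{math|1=det J = R}} can be confirmed numerically at any sample point; a small sketch (the values below are chosen arbitrarily for illustration):

```python
import math

# det J = R for the polar-to-Cartesian map, checked at a sample point.
R, theta = 2.0, 0.75

J = [[math.cos(theta), -R * math.sin(theta)],
     [math.sin(theta),  R * math.cos(theta)]]

det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
print(abs(det - R) < 1e-12)  # det J = R, so J is invertible whenever R != 0
```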

== Generalizations ==

=== Banach space version ===

Based on the inverse function theorem in Banach spaces, it is possible to extend the implicit function theorem to Banach space valued mappings.{{Cite book |last=Lang |first=Serge |author-link=Serge Lang |title=Fundamentals of Differential Geometry |url=https://archive.org/details/fundamentalsdiff00lang_678 |url-access=limited |year=1999 |publisher=Springer | location=New York |series=Graduate Texts in Mathematics |isbn=0-387-98593-X |pages=[https://archive.org/details/fundamentalsdiff00lang_678/page/n15 15]–21 }}{{Cite book |last=Edwards |first=Charles Henry |title=Advanced Calculus of Several Variables |publisher=Dover Publications |location=Mineola, New York |year=1994 |orig-year=1973 |isbn=0-486-68336-2 |pages=417–418 }}

Let X, Y, Z be Banach spaces. Let the mapping {{math|f : X × Y → Z}} be continuously Fréchet differentiable. If (x_0,y_0)\in X\times Y, f(x_0,y_0)=0, and y\mapsto Df(x_0,y_0)(0,y) is a Banach space isomorphism from Y onto Z, then there exist neighbourhoods U of x_0 and V of y_0 and a Fréchet differentiable function g : U → V such that f(x, g(x)) = 0 and f(x, y) = 0 if and only if y = g(x), for all (x,y)\in U\times V.

=== Implicit functions from non-differentiable functions ===

Various forms of the implicit function theorem exist for the case when the function f is not differentiable. It is standard that local strict monotonicity suffices in one dimension.{{springer |title=Implicit function |id=i/i050310 |last=Kudryavtsev |first=Lev Dmitrievich }} The following more general form was proven by Kumagai based on an observation by Jittorntrum.{{Cite journal |first=K. |last=Jittorntrum |title=An Implicit Function Theorem |journal=Journal of Optimization Theory and Applications |volume=25 |issue=4 |year=1978 |doi=10.1007/BF00933522 |pages=575–577 |s2cid=121647783 }}{{Cite journal |first=S. |last=Kumagai |title=An implicit function theorem: Comment |journal=Journal of Optimization Theory and Applications |volume=31 |issue=2 |year=1980 |doi=10.1007/BF00934117 |pages=285–288 |s2cid=119867925 }}

Consider a continuous function f : \R^n \times \R^m \to \R^n such that f(x_0, y_0) = 0. If there exist open neighbourhoods A \subset \R^n and B \subset \R^m of x0 and y0, respectively, such that, for all y in B, f(\cdot, y) : A \to \R^n is locally one-to-one, then there exist open neighbourhoods A_0 \subset \R^n and B_0 \subset \R^m of x0 and y0, such that, for all y \in B_0, the equation

f(x, y) = 0

has a unique solution

x = g(y) \in A_0,

where g is a continuous function from B_0 into A_0.

=== Collapsing manifolds ===

Perelman’s collapsing theorem for 3-manifolds, the capstone of his proof of Thurston's geometrization conjecture, can be understood as an extension of the implicit function theorem.{{cite journal |last1=Cao |first1=Jianguo |last2=Ge |first2=Jian |title=A simple proof of Perelman's collapsing theorem for 3-manifolds |journal=J. Geom. Anal. |date=2011 |volume=21 |issue=4 |pages=807–869|doi=10.1007/s12220-010-9169-5 |arxiv=1003.2215 |s2cid=514106 }}


== Notes ==

{{Notelist}}

== References ==

{{reflist|30em}}

== Further reading ==

* {{cite book |first=Carl B. |last=Allendoerfer |author-link=Carl B. Allendoerfer |title=Calculus of Several Variables and Differentiable Manifolds |location=New York |publisher=Macmillan |year=1974 |chapter=Theorems about Differentiable Functions |pages=54–88 |isbn=0-02-301840-2 }}
* {{cite book |first=K. G. |last=Binmore |author-link=Kenneth Binmore |chapter=Implicit Functions |title=Calculus |location=New York |publisher=Cambridge University Press |year=1983 |isbn=0-521-28952-1 |pages=198–211 |chapter-url=https://books.google.com/books?id=K8RfQgAACAAJ&pg=PA198 }}
* {{cite book |first1=Lynn H. |last1=Loomis |author-link=Lynn Harold Loomis |first2=Shlomo |last2=Sternberg |author-link2=Shlomo Sternberg |title=Advanced Calculus |url=https://archive.org/details/advancedcalculus0000loom |url-access=registration |location=Boston |publisher=Jones and Bartlett |edition=Revised |year=1990 |pages=[https://archive.org/details/advancedcalculus0000loom/page/164 164–171] |isbn=0-86720-122-3 }}
* {{cite book |first1=Murray H. |last1=Protter |author-link=Murray H. Protter |first2=Charles B. Jr. |last2=Morrey |author-link2=Charles B. Morrey Jr. |chapter=Implicit Function Theorems. Jacobians |title=Intermediate Calculus |location=New York |publisher=Springer |edition=2nd |year=1985 |isbn=0-387-96058-9 |pages=390–420 |chapter-url=https://books.google.com/books?id=3lTmBwAAQBAJ&pg=PA390 }}

{{DEFAULTSORT:Implicit Function Theorem}}

Category:Articles containing proofs

Category:Mathematical identities

Category:Theorems in calculus

Category:Theorems in real analysis