polarization identity#Symmetric bilinear forms

{{short description|Formula relating the norm and the inner product in an inner product space}}

{{About|quadratic forms|formulas for higher-degree polynomials|Polarization of an algebraic form}}

File:Parallelogram law.svg

In linear algebra, a branch of mathematics, the polarization identity is any one of a family of formulas that express the inner product of two vectors in terms of the norm of a normed vector space.

If a norm arises from an inner product then the polarization identity can be used to express this inner product entirely in terms of the norm. The polarization identity shows that a norm can arise from at most one inner product; however, there exist norms that do not arise from any inner product.

The norm associated with any inner product space satisfies the parallelogram law: \|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2.

In fact, as observed by John von Neumann,{{sfn|Lax|2002|p=53}} the parallelogram law characterizes those norms that arise from inner products.

Given a normed space (H, \|\cdot\|), the parallelogram law holds for \|\cdot\| if and only if there exists an inner product \langle \cdot, \cdot \rangle on H such that \|x\|^2 = \langle x,\ x\rangle for all x \in H, in which case this inner product is uniquely determined by the norm via the polarization identity.{{cite book|author=Philippe Blanchard, Erwin Brüning|chapter=Proposition 14.1.2 (Fréchet–von Neumann–Jordan)|chapter-url=https://books.google.com/books?id=1g2rikccHcgC&pg=PA192|page=192|title=Mathematical methods in physics: distributions, Hilbert space operators, and variational methods|year=2003|publisher=Birkhäuser|isbn=0817642285}}{{cite book|author=Gerald Teschl|title=Mathematical methods in quantum mechanics: with applications to Schrödinger operators|chapter=Theorem 0.19 (Jordan–von Neumann)|page=19|url=https://www.mat.univie.ac.at/~gerald/ftp/book-schroe/|isbn=978-0-8218-4660-5|year=2009|publisher=American Mathematical Society Bookstore}}

Polarization identities

Any inner product on a vector space induces a norm by the equation

\|x\| = \sqrt{\langle x, x \rangle}.

The polarization identities reverse this relationship, recovering the inner product from the norm.

Every inner product satisfies:

\|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\operatorname{Re}\langle x, y \rangle \qquad \text{ for all vectors } x, y.

Solving for \operatorname{Re}\langle x, y \rangle gives the formula \operatorname{Re}\langle x, y \rangle = \frac{1}{2} \left(\|x+y\|^2 - \|x\|^2 - \|y\|^2\right). If the inner product is real then \operatorname{Re}\langle x, y \rangle = \langle x, y \rangle and this formula becomes a polarization identity for real inner products.

= Real vector spaces =

If the vector space is over the real numbers then the polarization identities are:{{sfn|Schechter|1996|pp=601-603}}

\begin{alignat}{4}

\langle x, y \rangle

&= \frac{1}{4} \left(\|x+y\|^2 - \|x-y\|^2\right) \\[3pt]

&= \frac{1}{2} \left(\|x+y\|^2 - \|x\|^2 - \|y\|^2\right) \\[3pt]

&= \frac{1}{2} \left(\|x\|^2 + \|y\|^2 - \|x-y\|^2\right). \\[3pt]

\end{alignat}
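The three expressions can be checked numerically. The following is a minimal sketch for the Euclidean norm on \R^3; the sample vectors and the helper names (<code>polarization_forms</code> and so on) are illustrative assumptions, not part of any cited source.

```python
import math

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def norm(x):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(a * a for a in x))

def polarization_forms(x, y):
    """Evaluate the three real polarization expressions using only norms."""
    quarter_form = (norm(add(x, y)) ** 2 - norm(sub(x, y)) ** 2) / 4
    sum_form = (norm(add(x, y)) ** 2 - norm(x) ** 2 - norm(y) ** 2) / 2
    diff_form = (norm(x) ** 2 + norm(y) ** 2 - norm(sub(x, y)) ** 2) / 2
    return quarter_form, sum_form, diff_form

# Hypothetical sample vectors in R^3.
x = [1.0, 2.0, -3.0]
y = [4.0, 0.5, 2.0]

direct = sum(a * b for a, b in zip(x, y))  # dot product computed componentwise
forms = polarization_forms(x, y)           # each entry should agree with `direct`
```

Up to floating-point rounding, all three entries of <code>forms</code> agree with the componentwise dot product.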

These various forms are all equivalent by the parallelogram law:

2\|x\|^2 + 2\|y\|^2 = \|x+y\|^2 + \|x-y\|^2.

This further implies that the L^p space is not a Hilbert space whenever {{tmath|1= p\neq 2 }}, because its norm does not satisfy the parallelogram law. For a counterexample, take x=1_A and y=1_B for two disjoint subsets A,B of finite positive measure in a general domain \Omega\subset\mathbb{R}^n and compare the two sides of the parallelogram law.
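The counterexample can be illustrated in the simplest discrete setting. The sketch below assumes a two-point space with counting measure, so that the indicator functions become the vectors (1, 0) and (0, 1) and the L^p norm becomes the \ell^p norm; the function names are illustrative.

```python
def p_norm(x, p):
    """The l^p norm of a finite sequence (counting-measure analogue of L^p)."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

def parallelogram_gap(x, y, p):
    """Left side minus right side of the parallelogram law in the p-norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    lhs = p_norm(s, p) ** 2 + p_norm(d, p) ** 2
    rhs = 2 * p_norm(x, p) ** 2 + 2 * p_norm(y, p) ** 2
    return lhs - rhs

# Indicator functions of two disjoint one-point sets A and B.
x = [1.0, 0.0]
y = [0.0, 1.0]

# For these vectors the gap is 2 * 2**(2/p) - 4, which vanishes only at p = 2.
gaps = {p: parallelogram_gap(x, y, p) for p in (1, 1.5, 2, 3)}
```

The gap is positive for p < 2, negative for p > 2, and zero (up to rounding) only for p = 2.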

= Complex vector spaces =

For vector spaces over the complex numbers, the above formulas are not quite correct because they do not describe the imaginary part of the (complex) inner product.

However, an analogous expression does ensure that both real and imaginary parts are retained.

The imaginary part of the inner product depends on whether it is antilinear in the first or the second argument.

The notation \langle x | y \rangle, which is commonly used in physics, will be assumed to be antilinear in the {{em|first}} argument, while \langle x,\, y \rangle, which is commonly used in mathematics, will be assumed to be antilinear in its {{em|second}} argument.

They are related by the formula:

\langle x,\, y \rangle = \langle y \,|\, x \rangle \quad \text{ for all } x, y \in H.

The real part of any inner product (no matter which argument is antilinear and no matter if it is real or complex) is a symmetric bilinear map that for any x, y \in H is always equal to:{{sfn|Schechter|1996|pp=601-603}}

\begin{alignat}{4}

R(x, y)

:&= \operatorname{Re} \langle x \mid y \rangle = \operatorname{Re} \langle x, y \rangle \\

&= \frac{1}{4} \left(\|x+y\|^2 - \|x-y\|^2\right) \\

&= \frac{1}{2} \left(\|x+y\|^2 - \|x\|^2 - \|y\|^2\right) \\[3pt]

&= \frac{1}{2} \left(\|x\|^2 + \|y\|^2 - \|x-y\|^2\right). \\[3pt]

\end{alignat}

It is always a symmetric map, meaning that

R(x, y) = R(y, x) \quad \text{ for all } x, y \in H,

and it also satisfies:

R(ix, y) = - R(x, iy) \quad \text{ for all } x, y \in H,

which in plain English says that to move a factor of i to the other argument, introduce a negative sign. These properties can be proven either from the properties of inner products directly or from properties of norms by using the polarization identity.

{{collapse top|title={{anchor|Proof of formulas and equivalent forms}}Proof of properties of R using the polarization identity|left=true}}

Let

R(x, y) := \frac{1}{4} \left(\|x+y\|^2 - \|x-y\|^2\right).

Then R(y, x)=\frac{1}{4} \left(\|y+x\|^2 - \|y-x\|^2\right)=\frac{1}{4} \left(\|x+y\|^2 - \|x-y\|^2\right)=R(x, y),

which proves that {{tmath|1= R(x, y) = R(y, x) }}.

Additionally,

R(ix, y)=\frac{1}{4} \left(\|ix+y\|^2 - \|ix-y\|^2\right)

=\frac{1}{4} \left(\|x+(1/i)y\|^2 - \|x-(1/i)y\|^2\right)

=\frac{1}{4} \left(\|x-iy\|^2 - \|x+iy\|^2\right)

=-\frac{1}{4} \left( \|x+iy\|^2 - \|x-iy\|^2\right)

=-R(x, iy),

which proves that R(ix, y) = - R(x, iy).

\blacksquare

{{collapse bottom}}
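Both properties of R can also be observed numerically. The following sketch assumes the standard norm on \Complex^3 and uses hypothetical sample vectors; R is computed from norms only.

```python
def norm_sq(x):
    """Squared norm of a vector in C^n."""
    return sum(abs(a) ** 2 for a in x)

def R(x, y):
    """Real part of the inner product, via the polarization identity."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return (norm_sq(plus) - norm_sq(minus)) / 4

def times_i(x):
    """Multiply every coordinate by the imaginary unit."""
    return [1j * a for a in x]

# Hypothetical sample vectors in C^3.
x = [1 + 2j, -0.5j, 3.0]
y = [2 - 1j, 1 + 1j, -1j]

symmetry_gap = R(x, y) - R(y, x)                     # R(x, y) = R(y, x)
sign_rule_gap = R(times_i(x), y) + R(x, times_i(y))  # R(ix, y) = -R(x, iy)
```

Both gaps vanish up to floating-point rounding.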

Unlike its real part, the imaginary part of a complex inner product depends on which argument is antilinear.

Antilinear in first argument

The polarization identities for the inner product \langle x \,|\, y \rangle, which is antilinear in the {{em|first}} argument, are

:\begin{alignat}{4}

\langle x \,|\, y \rangle

&= \frac{1}{4} \left(\|x+y\|^2 - \|x-y\|^2 - i\|x + iy\|^2 + i\|x - iy\|^2\right) \\

&= \frac{1}{4} \sum_{k=0}^3 i^k\|x+(-i)^ky\|^2 \\

&= R(x, y) - i R(x, iy) \\

&= R(x, y) + i R(ix, y) \\

\end{alignat}

where x, y \in H.

The second to last equality is similar to the formula expressing a linear functional \varphi in terms of its real part: \varphi(y) = \operatorname{Re} \varphi(y) - i (\operatorname{Re} \varphi)(i y).
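As an illustration, the sum formula can be compared with an inner product computed directly. This is a minimal sketch on \Complex^3 with the convention that the inner product is antilinear in the first argument; <code>braket</code> and <code>polarize_first</code> are hypothetical helper names.

```python
def braket(x, y):
    """Reference inner product <x|y> on C^n, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm of a vector in C^n."""
    return sum(abs(a) ** 2 for a in x)

def polarize_first(x, y):
    """(1/4) * sum_{k=0}^{3} i^k ||x + (-i)^k y||^2, using norms only."""
    total = 0j
    for k in range(4):
        shifted = [a + (-1j) ** k * b for a, b in zip(x, y)]
        total += (1j ** k) * norm_sq(shifted)
    return total / 4

# Hypothetical sample vectors in C^3.
x = [1 + 2j, -0.5j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -1j]

recovered = polarize_first(x, y)  # built from four norms
reference = braket(x, y)          # built from the coordinates
```

The two values agree up to rounding, so the four norms determine both the real and the imaginary part.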

Antilinear in second argument

The polarization identities for the inner product \langle x, \ y \rangle, which is antilinear in the {{em|second}} argument, follow from those of \langle x \,|\, y \rangle by the relationship:

\langle x, \ y \rangle := \langle y \,|\, x \rangle = \overline{\langle x \,|\, y \rangle} \quad \text{ for all } x, y \in H.

So for any x, y \in H,{{sfn|Schechter|1996|pp=601-603}}

: \begin{alignat}{4}

\langle x,\, y \rangle

&= \frac{1}{4} \left(\|x+y\|^2 - \|x-y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\right) \\

&= R(x, y) + i R(x, iy) \\

&= R(x, y) - i R(ix, y). \\

\end{alignat}

This expression can be phrased symmetrically as:{{Cite web|last=Butler|first=Jon|date=20 June 2013|title=norm - Derivation of the polarization identities?|url=https://math.stackexchange.com/questions/425173/derivation-of-the-polarization-identities|url-status=live|archive-url=https://archive.today/20201014185358/https://math.stackexchange.com/questions/425173/derivation-of-the-polarization-identities|archive-date=14 October 2020|access-date=2020-10-14|website=Mathematics Stack Exchange}} See Harald Hanche-Olson's answer.

\langle x, y \rangle = \frac{1}{4} \sum_{k=0}^3 i^k \left\|x + i^k y\right\|^2.

Summary of both cases

Thus if R(x, y) + i I(x, y) is the decomposition into real and imaginary parts of some inner product's value at the point (x, y) \in H \times H of its domain, then its imaginary part is:

I(x, y) ~=~

\begin{cases}

~R({\color{red}i} x, y) & \qquad \text{ if antilinear in the } {\color{red}1} \text{st argument} \\

~R(x, {\color{blue}i} y) & \qquad \text{ if antilinear in the } {\color{blue}2} \text{nd argument} \\

\end{cases}

where the scalar i is always located in the same argument that the inner product is antilinear in.

Using {{tmath|1= R(ix, y) = - R(x, iy) }}, the above formula for the imaginary part becomes:

I(x, y) ~=~

\begin{cases}

-R(x, {\color{black}i} y) & \qquad \text{ if antilinear in the } {\color{black}1} \text{st argument} \\

-R({\color{black}i} x, y) & \qquad \text{ if antilinear in the } {\color{black}2} \text{nd argument} \\

\end{cases}
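The case distinction can be checked on an example. The sketch below assumes the standard inner products on \Complex^3 in both conventions; the vectors and function names are illustrative.

```python
def norm_sq(x):
    return sum(abs(a) ** 2 for a in x)

def R(x, y):
    """Real part of the inner product, computed from norms only."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return (norm_sq(plus) - norm_sq(minus)) / 4

def times_i(x):
    return [1j * a for a in x]

def braket(x, y):
    """<x|y>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def pairing(x, y):
    """<x, y>, antilinear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

# Hypothetical sample vectors in C^3.
x = [1 + 2j, -0.5j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -1j]

imag_first = R(times_i(x), y)   # should match the imaginary part of <x|y>
imag_second = R(x, times_i(y))  # should match the imaginary part of <x, y>
```

In each convention the factor i sits in the antilinear slot, and the two imaginary parts differ only by sign.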

Reconstructing the inner product

In a normed space (H, \|\cdot\|), if the parallelogram law

\|x+y\|^2 ~+~ \|x-y\|^2 ~=~ 2\|x\|^2+2\|y\|^2

holds, then there exists a unique inner product \langle \cdot,\ \cdot\rangle on H such that \|x\|^2 = \langle x,\ x\rangle for all x \in H.{{sfn|Schechter|1996|pp=601-603}}{{sfn|Lax|2002|p=53}}

{{math proof|proof=

We will only give the real case here; the proof for complex vector spaces is analogous.

By the above formulas, if the norm is described by an inner product (as we hope), then it must satisfy

\langle x, \ y \rangle = \frac{1}{4} \left(\|x+y\|^2 - \|x-y\|^2\right) \quad \text{ for all } x, y \in H,

which may serve as a definition of the unique candidate \langle \cdot, \cdot \rangle for the role of a suitable inner product. Thus, the uniqueness is guaranteed.

It remains to prove that this formula indeed defines an inner product and that this inner product induces the norm \|\cdot\|.

Explicitly, the following will be shown:

  1. \langle x, x \rangle = \|x\|^2, \quad x \in H
  2. \langle x, y \rangle = \langle y, x \rangle, \quad x, y \in H
  3. \langle x+z, y\rangle = \langle x, y\rangle + \langle z, y\rangle \quad \text{ for all } x, y, z \in H,
  4. \langle \alpha x, y \rangle = \alpha\langle x, y \rangle \quad \text{ for all } x, y \in H \text{ and all } \alpha \in \R

(This axiomatization omits positivity, which is implied by (1) and the fact that \|\cdot\| is a norm.)

For properties (1) and (2), substitute: \langle x, x \rangle = \frac{1}{4} \left(\|x+x\|^2 - \|x-x\|^2\right) = \|x\|^2, and \|x-y\|^2 = \|y-x\|^2.

For property (3), it is convenient to work in reverse.

It remains to show that

\|x+z+y\|^2 - \|x+z-y\|^2 \overset{?}{=} \|x+y\|^2 - \|x-y\|^2 + \|z+y\|^2 - \|z-y\|^2

or equivalently,

2\left(\|x+z+y\|^2 + \|x-y\|^2\right) - 2\left(\|x+z-y\|^2 + \|x+y\|^2\right) \overset{?}{=} 2\|z+y\|^2 - 2\|z-y\|^2.

Now apply the parallelogram identity:

2\|x+z+y\|^2 + 2\|x-y\|^2 = \|2x+z\|^2 + \|2y+z\|^2

2\|x+z-y\|^2 + 2\|x+y\|^2 = \|2x+z\|^2 + \|z-2y\|^2

Thus it remains to verify:

\cancel{\|2x+z\|^2} + \|2y+z\|^2 - (\cancel{\|2x+z\|^2} + \|z-2y\|^2) \overset{?}{{}={}} 2\|z+y\|^2 - 2\|z-y\|^2

\|2y+z\|^2 - \|z-2y\|^2 \overset{?}{=} 2\|z+y\|^2 - 2\|z-y\|^2

But the latter claim can be verified by subtracting the following two further applications of the parallelogram identity:

\|2y+z\|^2 + \|z\|^2 = 2\|z+y\|^2 + 2\|y\|^2

\|z-2y\|^2 + \|z\|^2 = 2\|z-y\|^2 + 2\|y\|^2

Thus (3) holds.

It can be verified by induction that (3) implies (4), as long as \alpha \in \Z.

But "(4) when \alpha \in \Z" implies "(4) when \alpha \in \Q".

And any positive-definite, real-valued, \Q-bilinear form satisfies the Cauchy–Schwarz inequality, so that \langle \sdot,\sdot \rangle is continuous.

Thus \langle \sdot,\sdot \rangle must be \R-linear as well.

}}
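The role of the parallelogram law in this proof can be seen by applying the candidate formula to two norms on \R^2. The sketch below is only an illustration under stated assumptions: the matrix <code>M</code> is a hypothetical symmetric positive-definite matrix, and the \ell^1 norm serves as an example of a norm that fails the parallelogram law.

```python
import math

# Hypothetical symmetric positive-definite matrix; it defines the norm below.
M = [[2.0, 0.5],
     [0.5, 1.0]]

def energy_norm(x):
    """Norm sqrt(x^T M x); it satisfies the parallelogram law."""
    return math.sqrt(sum(x[i] * M[i][j] * x[j] for i in range(2) for j in range(2)))

def l1_norm(x):
    """The l^1 norm; it does not satisfy the parallelogram law."""
    return sum(abs(a) for a in x)

def candidate(norm, x, y):
    """Candidate inner product (1/4)(||x+y||^2 - ||x-y||^2)."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (norm(s) ** 2 - norm(d) ** 2) / 4

def additivity_defect(norm, x, z, y):
    """<x+z, y> - <x, y> - <z, y> for the candidate form; zero when (3) holds."""
    xz = [a + b for a, b in zip(x, z)]
    return candidate(norm, xz, y) - candidate(norm, x, y) - candidate(norm, z, y)

e1, e2 = [1.0, 0.0], [0.0, 1.0]

# Gram matrix recovered from the norm alone; it should reproduce M.
gram = [[candidate(energy_norm, e, f) for f in (e1, e2)] for e in (e1, e2)]

defect_energy = additivity_defect(energy_norm, e1, e2, e1)  # vanishes up to rounding
defect_l1 = additivity_defect(l1_norm, e1, e2, e1)          # does not vanish
```

For the first norm the candidate form recovers <code>M</code> and is additive, whereas for the \ell^1 norm additivity (property 3) already fails on the standard basis vectors.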

Another necessary and sufficient condition for there to exist an inner product that induces a given norm \|\cdot\| is for the norm to satisfy Ptolemy's inequality, which is:{{Cite journal|last=Apostol|first=Tom M.|date=1967|title=Ptolemy's Inequality and the Chordal Metric|url=https://www.tandfonline.com/doi/pdf/10.1080/0025570X.1967.11975804|journal=Mathematics Magazine|volume=40|issue=5|pages=233–235| language=en| doi=10.2307/2688275|jstor=2688275}}

\|x - y\| \, \|z\| ~+~ \|y - z\| \, \|x\| ~\geq~ \|x - z\| \, \|y\| \qquad \text{ for all vectors } x, y, z.
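A small numerical illustration, with hypothetical points in \R^2: the Euclidean norm satisfies the inequality for them, while the \ell^1 norm, which is not induced by an inner product, violates it.

```python
import math

def euclidean(x):
    return math.sqrt(sum(a * a for a in x))

def l1(x):
    return sum(abs(a) for a in x)

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def ptolemy_slack(norm, x, y, z):
    """Left side minus right side of Ptolemy's inequality; nonnegative when it holds."""
    lhs = norm(sub(x, y)) * norm(z) + norm(sub(y, z)) * norm(x)
    rhs = norm(sub(x, z)) * norm(y)
    return lhs - rhs

# Hypothetical sample points.
x, y, z = [1.0, 0.0], [2.0, 2.0], [0.0, 1.0]

slack_euclidean = ptolemy_slack(euclidean, x, y, z)  # positive: inequality holds
slack_l1 = ptolemy_slack(l1, x, y, z)                # negative: inequality fails
```

A single violating triple suffices to show that a norm is not induced by an inner product; a single satisfied instance proves nothing in the other direction.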

Applications and consequences

If H is a complex Hilbert space then \langle x \mid y \rangle is real if and only if its imaginary part is {{tmath|1= 0 = R(x, iy) = \frac{1}{4} \left(\Vert x+iy \Vert^2 - \Vert x-iy \Vert^2\right) }}, which happens if and only if {{tmath|1= \Vert x+iy \Vert = \Vert x-iy \Vert }}.

Similarly, \langle x \mid y \rangle is (purely) imaginary if and only if {{tmath|1= \Vert x+y \Vert = \Vert x-y \Vert }}.

For example, from \|x+ix\| = |1+i| \|x\| = \sqrt{2} \|x\| = |1-i| \|x\| = \|x-ix\| it can be concluded that \langle x | x \rangle is real and that \langle x | ix \rangle is purely imaginary.

= Isometries =

If A : H \to Z is a linear isometry between two Hilbert spaces (so \|A h\| = \|h\| for all h \in H) then

\langle A h, A k \rangle_Z = \langle h, k \rangle_H \quad \text{ for all } h, k \in H;

that is, linear isometries preserve inner products.

If A : H \to Z is instead an antilinear isometry then

\langle A h, A k \rangle_Z = \overline{\langle h, k \rangle_H} = \langle k, h \rangle_H \quad \text{ for all } h, k \in H.
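Both statements can be illustrated on \Complex^3. In the sketch below, the linear isometry is a diagonal map with unit-modulus entries and the antilinear isometry is coordinatewise conjugation; both maps, the phases and the vectors are hypothetical choices, and the code illustrates the conclusion rather than the proof.

```python
import cmath

def pairing(x, y):
    """Inner product on C^n, antilinear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

# Hypothetical phases defining a diagonal unitary map.
PHASES = [0.3, 1.1, -2.0]

def linear_isometry(x):
    """Multiply each coordinate by a unit-modulus phase (a linear isometry)."""
    return [cmath.exp(1j * t) * a for t, a in zip(PHASES, x)]

def antilinear_isometry(x):
    """Coordinatewise complex conjugation (an antilinear isometry)."""
    return [a.conjugate() for a in x]

h = [1 + 2j, -0.5j, 3 + 0j]
k = [2 - 1j, 1 + 1j, -1j]

preserved = pairing(linear_isometry(h), linear_isometry(k))           # equals <h, k>
conjugated = pairing(antilinear_isometry(h), antilinear_isometry(k))  # equals <k, h>
```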

= Relation to the law of cosines =

The form \langle x, y \rangle = \frac{1}{2} \left(\|x\|^2 + \|y\|^2 - \|x-y\|^2\right) of the polarization identity can be rewritten as

\|\textbf{u}-\textbf{v}\|^2 = \|\textbf{u}\|^2 + \|\textbf{v}\|^2 - 2(\textbf{u} \cdot \textbf{v}).

This is essentially a vector form of the law of cosines for the triangle formed by the vectors {{tmath|1= \textbf{u} }}, {{tmath|1= \textbf{v} }}, and {{tmath|1= \textbf{u}-\textbf{v} }}.

In particular,

\textbf{u}\cdot\textbf{v} = \|\textbf{u}\|\,\|\textbf{v}\| \cos\theta,

where \theta is the angle between the vectors \textbf{u} and {{tmath|1= \textbf{v} }}.

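Computing \|\textbf{u}-\textbf{v}\|^2 from the right-hand side of the expansion above is numerically unstable when \textbf{u} and \textbf{v} are nearly equal, because of catastrophic cancellation, and should be avoided in numerical computation.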
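The cancellation can be reproduced in double-precision arithmetic. The following sketch uses two hypothetical one-dimensional vectors that differ by 1 at magnitude 10^8; the exact squared distance is 1.

```python
def direct_sq_distance(u, v):
    """||u - v||^2 computed from the difference vector."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def expanded_sq_distance(u, v):
    """||u||^2 + ||v||^2 - 2 u.v, the law-of-cosines expansion."""
    uu = sum(a * a for a in u)
    vv = sum(b * b for b in v)
    uv = sum(a * b for a, b in zip(u, v))
    return uu + vv - 2 * uv

# Two nearly equal vectors (hypothetical values); the true squared distance is 1.
u = [1e8]
v = [1e8 + 1.0]

direct = direct_sq_distance(u, v)      # subtracts first, then squares
expanded = expanded_sq_distance(u, v)  # subtracts numbers of size about 1e16
```

The direct computation returns the exact value, while the expanded form subtracts quantities of order 10^{16} whose rounding errors are as large as the result itself.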

= Derivation =

The basic relation between the norm and the dot product is given by the equation

\|\textbf{v}\|^2 = \textbf{v} \cdot \textbf{v}.

Then

\begin{align}

\|\textbf{u} + \textbf{v}\|^2

&= (\textbf{u} + \textbf{v}) \cdot (\textbf{u} + \textbf{v}) \\[3pt]

&= (\textbf{u} \cdot \textbf{u}) + (\textbf{u} \cdot \textbf{v}) + (\textbf{v} \cdot \textbf{u}) + (\textbf{v} \cdot \textbf{v}) \\[3pt]

&= \|\textbf{u}\|^2 + \|\textbf{v}\|^2 + 2(\textbf{u} \cdot \textbf{v}),

\end{align}

and similarly

\|\textbf{u} - \textbf{v}\|^2 = \|\textbf{u}\|^2 + \|\textbf{v}\|^2 - 2(\textbf{u} \cdot \textbf{v}).

The two forms of the polarization identity with the factor \tfrac{1}{2} now follow by solving each of these equations for {{tmath|1= \textbf{u} \cdot \textbf{v} }}, while the form with the factor \tfrac{1}{4} follows from subtracting the second equation from the first.

(Adding these two equations together gives the parallelogram law.)

Generalizations

= Jordan–von Neumann theorems =

The standard Jordan–von Neumann theorem, as stated previously, is that if a norm satisfies the parallelogram law, then it can be induced by an inner product defined by the polarization identity. There are variants of the theorem.{{Cite journal |last=Day |first=Mahlon M. |date=1947 |title=Some Characterizations of Inner-Product Spaces |url=https://www.jstor.org/stable/1990458 |journal=Transactions of the American Mathematical Society |volume=62 |issue=2 |pages=320–337 |doi=10.2307/1990458 |issn=0002-9947}}

Define various senses of orthogonality:

  • isosceles: \|x+y \| =\|x-y \|
  • Roberts’: \left\|x+ty\right\|=\left\|x-ty\right\| for all scalars t.
  • Pythagorean: \left\|x+y\right\|^2=\|x\|^2+\left\|y\right\|^2
  • Birkhoff–James: \|x\| \leq \|x + ty \| for all scalars t.

Let V be a vector space over the real or complex numbers. Let \|\cdot\| be a norm on V. We consider conditions under which the norm is induced by an inner product. In the following statements, whenever a scalar appears, it may be restricted to be real, even when V is over the complex numbers.

  • (von Neumann–Jordan condition) The norm satisfies the parallelogram identity.
  • (weakened von Neumann–Jordan condition) \|x + y\|^2 + \|x - y\|^2 = 4 for all unit vectors x,y. That is, the norm satisfies the parallelogram identity for unit vectors.
  • For any x, y \in V, the set of points equidistant to x, y is flat, that is, an affine subspace.
  • Orthogonality in either isosceles or Roberts’ sense is either additive or homogeneous on one variable.
  • For every two-dimensional subspace W \subset V, for every x \in W, there exists y \in W that is Roberts’ orthogonal to x.
  • Isosceles orthogonality implies Pythagorean orthogonality.
  • Pythagorean orthogonality implies isosceles orthogonality.
  • If x, y are Pythagorean orthogonal, then so are x, -y.
  • Birkhoff–James orthogonality is symmetric.
  • If \|x\|=\|y\| and t, s are real, then \|t x+s y\|=\|s x+t y\|.

For a real vector space, there is also the condition:

  • Any two-dimensional slice of the unit sphere is an ellipse, that is, parameterizable as \{x \cos\theta + y \sin\theta : \theta \in [0, 2\pi]\}, for some unit vectors x, y.

{{Math proof|title=Proof|proof=Since the internal John ellipse E is unique, any bijective linear map T that preserves the unit circle must satisfy TE = E. Since E must touch the circle at some point x, and x may be mapped to any other point y on the circle, every point y of the circle touches the ellipse E. Thus the circle is the ellipse.}}
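Some of the orthogonality conditions listed above can be seen to fail for a norm that is not induced by an inner product. The sketch below assumes the \ell^1 norm on \R^2 and two hypothetical unit vectors; exact floating-point comparison is acceptable here only because every value involved is a small integer.

```python
def l1(x):
    return sum(abs(a) for a in x)

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def is_isosceles_orthogonal(norm, x, y):
    """||x + y|| = ||x - y||."""
    return norm(add(x, y)) == norm(sub(x, y))

def is_pythagorean_orthogonal(norm, x, y):
    """||x + y||^2 = ||x||^2 + ||y||^2."""
    return norm(add(x, y)) ** 2 == norm(x) ** 2 + norm(y) ** 2

# Unit vectors in the l^1 norm.
x, y = [1.0, 0.0], [0.0, 1.0]

isosceles = is_isosceles_orthogonal(l1, x, y)      # True
pythagorean = is_pythagorean_orthogonal(l1, x, y)  # False

# The weakened von Neumann-Jordan condition would require this sum to be 4.
unit_parallelogram_sum = l1(add(x, y)) ** 2 + l1(sub(x, y)) ** 2
```

Here isosceles orthogonality does not imply Pythagorean orthogonality, and the parallelogram identity already fails for unit vectors.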

The Banach-Mazur rotation problem: Given a separable Banach space V such that for any two unit vectors x, y, there exists a linear surjective isometry T such that T(x) = y or T(y) = x, is V isometrically isomorphic to a Hilbert space?

The general case of the problem is open. When the space is finite-dimensional, the answer is yes. In other words, given a finite-dimensional normed vector space over the real or complex numbers, if any point on the unit sphere can be mapped (rotated) to any other point by a linear isometry, then the norm is induced by an inner product.{{cite journal

| last1 = Becerra Guerrero | first1 = Julio

| last2 = Rodríguez-Palacios | first2 = A.

| hdl = 10662/18957

| issue = 1

| journal = Extracta Mathematicae

| mr = 1914238

| pages = 1–58

| title = Transitivity of the norm on Banach spaces

| volume = 17

| year = 2002}}

= Symmetric bilinear forms =

The polarization identities are not restricted to inner products.

If B is any symmetric bilinear form on a vector space, and Q is the quadratic form defined by

Q(v) = B(v, v),

then

\begin{align}

2 B(u, v) &= Q(u + v) - Q(u) - Q(v), \\

2 B(u, v) &= Q(u) + Q(v) - Q(u - v), \\

4 B(u, v) &= Q(u + v) - Q(u - v).

\end{align}
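These identities can be verified with exact rational arithmetic. In the sketch below, the symmetric matrix <code>S</code> is a hypothetical example chosen to be indefinite, to emphasize that no positivity is needed; the helper names are illustrative.

```python
from fractions import Fraction

# Hypothetical symmetric (indefinite) matrix defining B(u, v) = u^T S v.
S = [[2, -1, 0],
     [-1, 3, 4],
     [0, 4, -5]]

def B(u, v):
    """Symmetric bilinear form given by the matrix S."""
    return sum(u[i] * S[i][j] * v[j] for i in range(3) for j in range(3))

def Q(v):
    """Quadratic form Q(v) = B(v, v)."""
    return B(v, v)

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def recover_B(u, v):
    """Three ways of recovering B(u, v) from Q alone; each divides by 2 or 4."""
    first = Fraction(Q(add(u, v)) - Q(u) - Q(v), 2)
    second = Fraction(Q(u) + Q(v) - Q(sub(u, v)), 2)
    third = Fraction(Q(add(u, v)) - Q(sub(u, v)), 4)
    return first, second, third

# Hypothetical sample vectors.
u = [1, 2, -1]
v = [3, 0, 2]

recovered = recover_B(u, v)  # all three entries equal B(u, v)
```

Each recovery step divides by 2 or 4, so the calculation depends on 2 being invertible in the ring of scalars.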

The so-called symmetrization map generalizes the latter formula, replacing Q by a homogeneous polynomial of degree k defined by Q(v) = B(v, \ldots, v), where B is a symmetric k-linear map.{{harvnb|Butler|2013}}. See Keith Conrad (KCd)'s answer.

The formulas above even apply in the case where the field of scalars has characteristic two, though the left-hand sides are all zero in this case.

Consequently, in characteristic two there is no formula for a symmetric bilinear form in terms of a quadratic form, and they are in fact distinct notions, a fact which has important consequences in L-theory; for brevity, in this context "symmetric bilinear forms" are often referred to as "symmetric forms".

These formulas also apply to bilinear forms on modules over a commutative ring, though again one can only solve for B(u, v) if 2 is invertible in the ring, and otherwise these are distinct notions. For example, over the integers, one distinguishes integral quadratic forms from integral {{em|symmetric}} forms, which are a narrower notion.
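The distinction over the integers can be made concrete with the quadratic form Q(x, y) = xy, chosen here as an illustrative example. The sketch below computes the only rational symmetric bilinear form B with B(v, v) = Q(v) and checks whether its Gram matrix is integral.

```python
from fractions import Fraction

def Q(v):
    """Integral quadratic form Q(x, y) = x*y on Z^2."""
    x, y = v
    return x * y

def polar(u, v):
    """Polar form Q(u+v) - Q(u) - Q(v); it equals 2 B(u, v) when Q(v) = B(v, v)."""
    s = (u[0] + v[0], u[1] + v[1])
    return Q(s) - Q(u) - Q(v)

e1, e2 = (1, 0), (0, 1)

# Gram matrix of the only symmetric bilinear form B with B(v, v) = Q(v).
gram = [[Fraction(polar(a, b), 2) for b in (e1, e2)] for a in (e1, e2)]

# B is integer-valued on Z^2 exactly when every Gram entry is an integer.
is_integral = all(entry.denominator == 1 for row in gram for entry in row)
```

The off-diagonal entries equal 1/2, so this integral quadratic form does not arise as B(v, v) for any integral symmetric bilinear form.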

More generally, in the presence of a ring involution or where 2 is not invertible, one distinguishes \varepsilon-quadratic forms and \varepsilon-symmetric forms; a symmetric form defines a quadratic form, and the polarization identity (without a factor of 2) from a quadratic form to a symmetric form is called the "symmetrization map", and is not in general an isomorphism. This has historically been a subtle distinction: over the integers it was not until the 1950s that the relation between "twos out" (integral {{em|quadratic}} form) and "twos in" (integral {{em|symmetric}} form) was understood – see discussion at integral quadratic form; and in the algebraization of surgery theory, Mishchenko originally used {{em|symmetric}} L-groups, rather than the correct {{em|quadratic}} L-groups (as in Wall and Ranicki) – see discussion at L-theory.

= Homogeneous polynomials of higher degree =

Finally, in any of these contexts these identities may be extended to homogeneous polynomials (that is, algebraic forms) of arbitrary degree, where the resulting identity is known as the polarization formula; it is reviewed in greater detail in the article on the polarization of an algebraic form.
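As a sketch of the degree-k case, the code below assumes the standard inclusion–exclusion form of the polarization formula, B(v_1, \ldots, v_k) = \frac{1}{k!} \sum_{\varnothing \neq S \subseteq \{1, \ldots, k\}} (-1)^{k-|S|} Q\left(\sum_{i \in S} v_i\right), which is not spelled out in this article. The cubic form and the sample vectors are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def Q(v):
    """Hypothetical cubic form Q(x, y) = x^2 * y."""
    x, y = v
    return x * x * y

def polarize(Q, vectors):
    """(1/k!) * sum over nonempty subsets S of (-1)^(k-|S|) Q(sum of v_i, i in S)."""
    k = len(vectors)
    dim = len(vectors[0])
    total = 0
    for size in range(1, k + 1):
        for subset in combinations(vectors, size):
            s = tuple(sum(v[i] for v in subset) for i in range(dim))
            total += (-1) ** (k - size) * Q(s)
    return Fraction(total, factorial(k))

def B(u, v, w):
    """Symmetric trilinear form with B(v, v, v) = Q(v), written out by hand."""
    return Fraction(u[0] * v[0] * w[1] + u[0] * v[1] * w[0] + u[1] * v[0] * w[0], 3)

u, v, w = (1, 2), (3, -1), (2, 5)

recovered = polarize(Q, [u, v, w])  # should equal B(u, v, w)
diagonal = polarize(Q, [u, u, u])   # should equal Q(u)
```

For k = 2 the same function reduces to \tfrac{1}{2}\left(Q(u+v) - Q(u) - Q(v)\right), the first of the quadratic identities.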

See also

  • {{annotated link|Inner product space}}
  • {{annotated link|Law of cosines}}
  • {{annotated link|Mazur–Ulam theorem}}
  • {{annotated link|Minkowski distance}}
  • {{annotated link|Parallelogram law}}
  • {{annotated link|Ptolemy's inequality}}

Notes and references

{{reflist}}

{{reflist|group=note}}


Bibliography

{{sfn whitelist|CITEREFLax2002|CITEREFRudin1991|CITEREFSchechter1996}}

  • {{Lax Functional Analysis}}
  • {{Rudin Walter Functional Analysis|edition=2}}
  • {{Schechter Handbook of Analysis and Its Foundations}}

{{Hilbert space}}

{{Lp spaces}}

{{Banach spaces}}

{{Functional Analysis}}

Category:Abstract algebra

Category:Linear algebra

Category:Functional analysis

Category:Vectors (mathematics and physics)

Category:Norms (mathematics)

Category:Algebraic identities