Maxwell's theorem

{{Short description|Concept in probability theory}}

{{hatnote|See Maxwell's theorem (geometry) for the result on triangles.}}

In probability theory, Maxwell's theorem (known also as Herschel-Maxwell's theorem and Herschel-Maxwell's derivation) states that if the probability distribution of a random vector in \R^n is unchanged by rotations, and if the components are independent, then the components are identically distributed and normally distributed.

Equivalent statements

If the probability distribution of a vector-valued random variable X = (X_1, \dots, X_n)^T is the same as the distribution of GX for every n \times n orthogonal matrix G and the components are independent, then the components X_1, \dots, X_n are normally distributed with expected value 0 and all have the same variance. This theorem is one of many characterizations of the normal distribution.

The only rotationally invariant probability distributions on \R^n that have independent components are multivariate normal distributions with expected value 0 and covariance matrix \sigma^2 I_n (where I_n is the n \times n identity matrix), for some positive number \sigma^2.
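The easy direction of this characterization can be checked numerically: the density of a normal distribution with mean 0 and covariance \sigma^2 I_n depends on a point only through its Euclidean norm, so it takes the same value at x and at Gx for any orthogonal G. The following is a minimal sketch, assuming NumPy; the helper names `isotropic_normal_pdf` and `random_orthogonal` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def isotropic_normal_pdf(x, sigma=1.0):
    """Density of N(0, sigma^2 I_n) at the point x (1-D array of length n)."""
    n = x.shape[0]
    norm_const = (2.0 * np.pi * sigma**2) ** (-n / 2.0)
    return norm_const * np.exp(-np.dot(x, x) / (2.0 * sigma**2))

def random_orthogonal(n, rng):
    """An n x n orthogonal matrix, taken as the Q factor of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

rng = np.random.default_rng(0)
n = 4
G = random_orthogonal(n, rng)   # satisfies G^T G = I_n up to rounding
x = rng.standard_normal(n)      # an arbitrary test point

# The density depends on x only through |x|, and |Gx| = |x|.
density_at_x = isotropic_normal_pdf(x, sigma=1.5)
density_at_Gx = isotropic_normal_pdf(G @ x, sigma=1.5)
```

Because an orthogonal matrix has determinant of absolute value 1, equality of the densities at x and Gx is what invariance of the distribution under G amounts to here. The sketch illustrates only that such normal distributions are invariant; it does not demonstrate the converse, which is the content of the theorem.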

History

John Herschel proved the theorem in 1850. Herschel, J. F. W. (1850). [https://www.google.com/books/edition/Essays_from_the_Edinburgh_and_Quarterly/S48UtJaeE-8C?hl=en&gbpv=1&dq=%22quetelet%20on%20probabilities%22&pg=PA365&printsec=frontcover Review of Quetelet on probabilities]. Edinburgh Rev., 92, 1–57. {{harvtxt|Bryc|1995|p=1}} quotes Herschel and "state[s] the Herschel-Maxwell theorem in modern notation but without proof". Bryc cites M. S. Bartlett (1934) "for one of the early proofs" and lists several variants of the theorem that are proven in his book. Ten years later, James Clerk Maxwell proved the theorem in Proposition IV of his 1860 paper. See:

  • Maxwell, J.C. (1860) [https://books.google.com/books?id=-YU7AQAAMAAJ&pg=PA19 "Illustrations of the dynamical theory of gases. Part I. On the motions and collisions of perfectly elastic spheres,"] Philosophical Magazine, 4th series, 19 : 19–32.
  • Maxwell, J.C. (1860) [https://books.google.com/books?id=DIc7AQAAMAAJ&pg=PA21 "Illustrations of the dynamical theory of gases. Part II. On the process of diffusion of two or more kinds of moving particles among one another,"] Philosophical Magazine, 4th series, 20 : 21–37. {{Cite journal |last=Gyenis |first=Balázs |date=February 2017 |title=Maxwell and the normal distribution: A colored story of probability, independence, and tendency toward equilibrium |url=http://dx.doi.org/10.1016/j.shpsb.2017.01.001 |journal=Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics |volume=57 |pages=53–65 |doi=10.1016/j.shpsb.2017.01.001 |arxiv=1702.01411 |bibcode=2017SHPMP..57...53G |s2cid=38272381 |issn=1355-2198}}

Proof

We only need to prove the theorem in the 2-dimensional case, since the n-dimensional case then follows by applying the 2-dimensional result to each pair of coordinates.
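The reduction relies on the fact that a rotation in the plane of two coordinates, acting as the identity on the others, is itself an n \times n rotation, so invariance of X under all rotations gives rotational invariance of each pair (X_i, X_j). A minimal sketch of such a matrix, assuming NumPy; the helper name `plane_rotation` is hypothetical and introduced only for illustration.

```python
import numpy as np

def plane_rotation(n, i, j, theta):
    """n x n matrix rotating coordinates i and j by theta, identity elsewhere."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = c
    G[i, j] = -s
    G[j, i] = s
    G[j, j] = c
    return G

n = 4
G = plane_rotation(n, 0, 2, 0.7)       # rotate the (X_1, X_3) pair, 0-based indices 0 and 2
x = np.array([1.0, 2.0, 3.0, 4.0])
y = G @ x                              # only entries 0 and 2 of x are mixed
```

Here `G` is orthogonal with determinant 1, and the untouched coordinates of `y` agree with those of `x`, which is why the joint law of the selected pair inherits the 2-dimensional rotational symmetry.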

Since rotating by 90 degrees preserves the joint distribution, X_1 and X_2 have the same probability measure: let it be \mu. If \mu is a Dirac delta distribution at zero, then it is, in particular, a degenerate Gaussian distribution. Let us now assume that \mu is not a Dirac delta distribution at zero.

By Lebesgue's decomposition theorem, we can decompose \mu into the sum of an absolutely continuous measure and a singular measure: \mu = \mu_r + \mu_s. We need to show that \mu_s = 0; we proceed by contradiction. Suppose \mu_s has an atomic part; then there exists some x \in \R such that \mu_s(\{x\}) > 0. By independence of X_1 and X_2, the conditional variable X_2 | \{X_1 = x\} has the same distribution as X_2. Suppose x = 0. Since we assumed that \mu is not concentrated at zero, \Pr(X_2 \neq 0) > 0, and so the double ray \{(x_1, x_2): x_1 = 0, x_2 \neq 0\} has probability p = \mu(\{0\}) \Pr(X_2 \neq 0) > 0. By the rotational symmetry of \mu \times \mu, every rotation of the double ray has the same probability p, and rotations of the double ray through distinct angles in [0, \pi) are pairwise disjoint. Choosing N > 1/p such rotations gives N pairwise disjoint sets whose union has probability Np > 1, which is impossible for a probability measure.

Assume now that \mu has a probability density function \rho; the problem then reduces to solving the functional equation

\rho(x)\rho(y) = \rho(x \cos \theta + y \sin\theta)\rho(x \sin \theta - y \cos\theta).
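As a sanity check on the functional equation, one can evaluate both sides numerically for a candidate density. The sketch below, assuming NumPy, compares a centered Gaussian density with a Laplace density; the Laplace density is not from the source and is included only as a contrasting example, and the function names are hypothetical.

```python
import numpy as np

def gaussian_pdf(t, sigma=1.0):
    """Density of N(0, sigma^2)."""
    return np.exp(-t**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))

def laplace_pdf(t, b=1.0):
    """Density of the centered Laplace distribution with scale b."""
    return np.exp(-np.abs(t) / b) / (2.0 * b)

def equation_residual(rho, x, y, theta):
    """Left side minus right side of the functional equation for rho."""
    u = x * np.cos(theta) + y * np.sin(theta)
    v = x * np.sin(theta) - y * np.cos(theta)
    return rho(x) * rho(y) - rho(u) * rho(v)

# Random test points and angles (fixed seed for reproducibility).
rng = np.random.default_rng(1)
x, y = rng.uniform(-2.0, 2.0, size=(2, 1000))
theta = rng.uniform(0.0, 2.0 * np.pi, size=1000)

# Largest violation of the equation over the sampled points.
gaussian_max = np.max(np.abs(equation_residual(gaussian_pdf, x, y, theta)))
laplace_max = np.max(np.abs(equation_residual(laplace_pdf, x, y, theta)))
```

The Gaussian residual vanishes up to rounding because u^2 + v^2 = x^2 + y^2 for every \theta, whereas the Laplace density depends on |u| + |v|, which is not preserved. A numerical check of this kind only illustrates the equation; it is not a substitute for solving it.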

References

{{reflist}}

Sources

  • {{cite book |title=The Normal Distribution: Characterizations with Applications|last=Bryc|first=Wlodzimierz|publisher=Springer-Verlag|year=1995|isbn=978-0-387-97990-8|url=https://www.google.com/books/edition/The_Normal_Distribution/tyXjBwAAQBAJ?hl=en&gbpv=0}}
  • {{cite book|last=Feller|first=William| authorlink=William Feller| date=1966 |title= An Introduction to Probability Theory and its Applications| volume=II| edition=1st| publisher=Wiley| page=187}}
  • {{cite journal|last=Maxwell|first=James Clerk|authorlink=James Clerk Maxwell|date=1860 |title=Illustrations of the dynamical theory of gases| journal=Philosophical Magazine |series=4th Series| volume=19| pages=390–393}}