Disintegration theorem

{{Short description|Theorem in measure theory}}

{{Use dmy dates|date=July 2020}}

In mathematics, the disintegration theorem is a result in measure theory and probability theory. It rigorously defines the idea of a non-trivial "restriction" of a measure to a measure zero subset of the measure space in question. It is related to the existence of conditional probability measures. In a sense, "disintegration" is the opposite process to the construction of a product measure.

Motivation

Consider the unit square S = [0,1]\times[0,1] in the Euclidean plane \mathbb{R}^2, together with the probability measure \mu defined on S by the restriction of two-dimensional Lebesgue measure \lambda^2 to S. That is, the probability of an event E\subseteq S, assumed to be a measurable subset of S, is simply the area of E.

Consider a one-dimensional subset of S such as the line segment L_x = \{x\}\times[0, 1]. L_x has \mu-measure zero and, since the Lebesgue measure space is a complete measure space, every subset of L_x is a \mu-null set:

E \subseteq L_{x} \implies \mu (E) = 0.

While true, this is somewhat unsatisfying. It would be nice to say that \mu "restricted to" L_x is the one-dimensional Lebesgue measure \lambda^1, rather than the zero measure. The probability of a "two-dimensional" event E could then be obtained as an integral of the one-dimensional probabilities of the vertical "slices" E\cap L_x: more formally, if \mu_x denotes one-dimensional Lebesgue measure on L_x, then

\mu (E) = \int_{[0, 1]} \mu_{x} (E \cap L_{x}) \, \mathrm{d} x

for any "nice" E\subseteq S. The disintegration theorem makes this argument rigorous in the context of measures on metric spaces.
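
The slicing formula can be checked numerically. The following is a minimal sketch rather than part of the theorem: the choice of E as a disc of radius 1/4 centred at (1/2, 1/2), and the helper names slice_length and measure_by_slices, are assumptions made only for this illustration.

```python
import math


def slice_length(x, centre=0.5, radius=0.25):
    """One-dimensional Lebesgue measure of E ∩ L_x, where E is the disc
    with the given centre (on the diagonal) and radius inside the unit square."""
    half_chord_sq = radius**2 - (x - centre) ** 2
    # The vertical line at x misses the disc when the value is not positive.
    return 2.0 * math.sqrt(half_chord_sq) if half_chord_sq > 0 else 0.0


def measure_by_slices(slice_measure, n=10_000):
    """Midpoint Riemann sum for the integral over [0, 1] of the slice measures."""
    h = 1.0 / n
    return h * sum(slice_measure((i + 0.5) * h) for i in range(n))


approx = measure_by_slices(slice_length)  # integral of mu_x(E ∩ L_x) dx
exact = math.pi * 0.25**2                 # area of the disc, i.e. mu(E)
print(approx, exact)
```

The two printed values should agree up to the discretisation error of the Riemann sum.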

Statement of the theorem

(Hereafter, \mathcal{P}(X) will denote the collection of Borel probability measures on a topological space (X, T).)

The assumptions of the theorem are as follows:

  • Let Y and X be two Radon spaces (i.e. topological spaces on which every Borel probability measure is inner regular, so that every Borel probability measure on them is outright a Radon measure; e.g. Polish spaces, that is, separable completely metrizable spaces).
  • Let \mu\in\mathcal{P}(Y).
  • Let \pi : Y\to X be a Borel-measurable function. Here one should think of \pi as a function to "disintegrate" Y, in the sense of partitioning Y into \{ \pi^{-1}(x)\ |\ x \in X\}. For example, for the motivating example above, one can define \pi((a,b)) = a for (a,b) \in [0,1]\times [0,1], which gives \pi^{-1}(a) = \{a\} \times [0,1], a slice we want to capture.
  • Let \nu \in\mathcal{P}(X) be the pushforward measure \nu = \pi_{*}(\mu) = \mu \circ \pi^{-1}. This measure provides the distribution of x (which corresponds to the events \pi^{-1}(x)).

The conclusion of the theorem: There exists a \nu-almost everywhere uniquely determined family of probability measures \{\mu_x\}_{x\in X} \subseteq \mathcal{P}(Y), which provides a "disintegration" of \mu into {{nowrap|\{\mu_x\}_{x \in X},}} such that:

  • the function x \mapsto \mu_{x} is Borel measurable, in the sense that x \mapsto \mu_{x} (B) is a Borel-measurable function for each Borel-measurable set B\subseteq Y;
  • \mu_x "lives on" the fiber \pi^{-1}(x): for \nu-almost all x\in X, \mu_{x} \left( Y \setminus \pi^{-1} (x) \right) = 0, and so \mu_x(E) =\mu_x(E\cap\pi^{-1}(x));
  • for every Borel-measurable function f : Y \to [0,\infty], \int_{Y} f(y) \, \mathrm{d} \mu (y) = \int_{X} \int_{\pi^{-1} (x)} f(y) \, \mathrm{d} \mu_x (y) \, \mathrm{d} \nu (x). In particular, for any event E\subseteq Y, taking f to be the indicator function of E gives \mu (E) = \int_X \mu_x (E) \, \mathrm{d} \nu (x).{{cite book |author1=Dellacherie, C. |author2=Meyer, P.-A. |title=Probabilities and Potential |series=North-Holland Mathematics Studies |publisher=North-Holland |location=Amsterdam |year=1978 |isbn=0-7204-0701-X }}
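
When Y is finite, the objects in the theorem can be computed directly, which may help in reading the statement. The sketch below is only an illustration under that assumption: the function names disintegrate and integrate, the sample measure mu, the map pi and the test function f are all hypothetical choices, not taken from the cited sources. In this finite setting \mu_x is obtained by renormalising \mu on the fibre \pi^{-1}(x); for points x with \nu(\{x\}) = 0 the measure \mu_x may be chosen arbitrarily, which reflects the \nu-almost everywhere uniqueness.

```python
from collections import defaultdict
from fractions import Fraction


def disintegrate(mu, pi):
    """Return (nu, fibres): nu is the pushforward of mu under pi, and
    fibres[x] is the measure mu_x for every x with nu({x}) > 0."""
    nu = defaultdict(Fraction)
    for y, mass in mu.items():
        nu[pi(y)] += mass
    fibres = {x: {} for x, total in nu.items() if total > 0}
    for y, mass in mu.items():
        x = pi(y)
        if nu[x] > 0:
            # Renormalise mu on the fibre pi^{-1}(x).
            fibres[x][y] = mass / nu[x]
    return dict(nu), fibres


def integrate(f, measure):
    """Integral of f against a finitely supported measure."""
    return sum(f(y) * mass for y, mass in measure.items())


# Hypothetical example: Y = {0, 1, 2} x {0, 1}, pi = first coordinate.
mu = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(0),
    (2, 0): Fraction(1, 6), (2, 1): Fraction(1, 3),
}
pi = lambda y: y[0]
f = lambda y: y[0] + 2 * y[1]

nu, fibres = disintegrate(mu, pi)
lhs = integrate(f, mu)                                       # integral of f d(mu)
rhs = sum(nu[x] * integrate(f, fibres[x]) for x in fibres)   # iterated integral
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides of the integral identity can be compared with equality rather than a tolerance.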

Applications

=Product spaces=

{{More citations needed section|date=May 2022}}

The motivating example above is a special case of the problem of product spaces, to which the disintegration theorem applies.

When Y is written as a Cartesian product Y = X_1\times X_2 and \pi_i : Y\to X_i is the natural projection, then each fibre \pi_1^{-1}(x_1) can be canonically identified with X_2 and there exists a Borel family of probability measures \{ \mu_{x_{1}} \}_{x_{1} \in X_{1}} in \mathcal{P}(X_2) (which is (\pi_1)_*(\mu)-almost everywhere uniquely determined) such that

\mu = \int_{X_{1}} \mu_{x_{1}} \, \mu \left(\pi_1^{-1}(\mathrm d x_1) \right)= \int_{X_{1}} \mu_{x_{1}} \, \mathrm{d} (\pi_{1})_{*} (\mu) (x_{1}),

which is in particular, writing \mu(\mathrm d x_2\mid x_1) for \mu_{x_{1}}(\mathrm d x_2) and \mu(B\mid x_1) for \mu_{x_{1}}(B),

\int_{X_1\times X_2} f(x_1,x_2)\, \mu(\mathrm d x_1,\mathrm d x_2) = \int_{X_1}\left( \int_{X_2} f(x_1,x_2) \mu(\mathrm d x_2\mid x_1) \right) \mu\left( \pi_1^{-1}(\mathrm{d} x_{1})\right)

and

\mu(A \times B) = \int_A \mu\left(B\mid x_1\right) \, \mu\left( \pi_1^{-1}(\mathrm{d} x_{1})\right).

The relation to conditional expectation is given by the identities

\operatorname E(f\mid \pi_1)(x_1)= \int_{X_2} f(x_1,x_2) \mu(\mathrm d x_2\mid x_1),

\mu(A\times B\mid \pi_1)(x_1)= 1_A(x_1) \cdot \mu(B\mid x_1).
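
As an illustration, consider the special case (an assumption made here only for the sake of the example) in which X_1 = X_2 = [0,1] and \mu has a density p with respect to two-dimensional Lebesgue measure \lambda^2. Then (\pi_1)_*(\mu) has density

p_1(x_1) = \int_{[0,1]} p(x_1, x_2) \, \mathrm{d} x_2

with respect to \lambda^1, and for (\pi_1)_*(\mu)-almost every x_1, namely wherever 0 < p_1(x_1) < \infty, the disintegration is given by

\mu_{x_{1}}(\mathrm d x_2) = \frac{p(x_1, x_2)}{p_1(x_1)} \, \mathrm{d} x_2.

Taking p \equiv 1 recovers the motivating example, in which every \mu_{x_{1}} is one-dimensional Lebesgue measure on [0,1].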

=Vector calculus=

The disintegration theorem can also be seen as justifying the use of a "restricted" measure in vector calculus. For instance, in Stokes' theorem as applied to a vector field flowing through a compact surface {{nowrap|\Sigma \subset \mathbb{R}^3}}, it is implicit that the "correct" measure on \Sigma is the disintegration of three-dimensional Lebesgue measure \lambda^3 on \Sigma, and that the disintegration of this measure on \partial\Sigma is the same as the disintegration of \lambda^3 on \partial\Sigma.{{cite book |author1=Ambrosio, L. |author2=Gigli, N. |author3=Savaré, G. |title=Gradient Flows in Metric Spaces and in the Space of Probability Measures |publisher=ETH Zürich, Birkhäuser Verlag, Basel |year=2005 |isbn=978-3-7643-2428-5 }}

=Conditional distributions=

The disintegration theorem can be applied to give a rigorous treatment of conditional probability distributions in statistics, while avoiding purely abstract formulations of conditional probability.{{cite journal |last=Chang |first=J.T. |author2=Pollard, D. |title=Conditioning as disintegration |journal=Statistica Neerlandica |year=1997 |volume=51 |issue=3 |url=http://www.stat.yale.edu/~jtc5/papers/ConditioningAsDisintegration.pdf |doi=10.1111/1467-9574.00056 |page=287 |citeseerx=10.1.1.55.7544 |s2cid=16749932 }} The theorem is related to the Borel–Kolmogorov paradox, for example.

See also

  • {{annotated link|Ionescu-Tulcea theorem}}
  • {{annotated link|Joint probability distribution}}
  • {{annotated link|Copula (statistics)}}
  • {{annotated link|Conditional expectation}}
  • {{annotated link|Borel–Kolmogorov paradox}}
  • {{annotated link|Regular conditional probability}}

References