Chapman–Robbins bound

In statistics, the Chapman–Robbins bound or Hammersley–Chapman–Robbins bound is a lower bound on the variance of estimators of a deterministic parameter. It is a generalization of the Cramér–Rao bound: it is at least as tight as the Cramér–Rao bound and applies to a wider range of problems. However, it is usually more difficult to compute.

The bound was independently discovered by John Hammersley in 1950,{{Citation

| last = Hammersley | first = J. M. |authorlink=John Hammersley

| title = On estimating restricted parameters

| journal = Journal of the Royal Statistical Society, Series B

| volume = 12 | issue = 2 | pages = 192–240 | year = 1950

| doi = 10.1111/j.2517-6161.1950.tb00056.x | mr = 40631

| jstor = 2983981

}} and by Douglas Chapman and Herbert Robbins in 1951.{{Citation

| last1 = Chapman | first1 = D. G.

| last2 = Robbins | first2 = H. | author2-link = Herbert Robbins

| title = Minimum variance estimation without regularity assumptions

| journal = Annals of Mathematical Statistics

| volume = 22 | issue =4 | pages =581–586 | year =1951

| doi = 10.1214/aoms/1177729548

| mr = 44084

| jstor = 2236927

| doi-access = free}}

Statement

Let $\Theta$ be the set of parameters for a family of probability distributions $\{\mu_\theta : \theta\in\Theta\}$ on $\Omega$.

For any two $\theta, \theta' \in \Theta$, let $\chi^2(\mu_{\theta'}; \mu_{\theta})$ be the $\chi^2$-divergence from $\mu_{\theta}$ to $\mu_{\theta'}$. Then:

{{Math theorem

| math_statement = Given any scalar random variable $\hat g: \Omega \to \R$, and any $\theta \in\Theta$, we have

$\operatorname{Var}_\theta[\hat g] \geq \sup_{\theta' \in \Theta,\, \theta'\neq \theta}\frac{(E_{\theta'}[\hat g] - E_{\theta}[\hat g])^2}{\chi^2(\mu_{\theta'} ; \mu_\theta)}$.

}}
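As a numerical illustration of the scalar statement, the following minimal sketch evaluates the right-hand side on a finite sample space. The Bernoulli family, the estimator $\hat g(x) = x$, the candidate grid for $\theta'$ and all function names are assumptions chosen for illustration; they do not come from the cited sources.

```python
def chi2_divergence(p, q):
    """chi^2(P; Q) = sum_x (p(x) - q(x))^2 / q(x) on a finite sample space."""
    total = 0.0
    for px, qx in zip(p, q):
        if qx == 0.0:
            if px > 0.0:
                return float("inf")  # P is not absolutely continuous w.r.t. Q
            continue
        total += (px - qx) ** 2 / qx
    return total


def mean(g, p):
    """Expectation of g under the probability vector p."""
    return sum(gx * px for gx, px in zip(g, p))


def variance(g, p):
    """Variance of g under the probability vector p."""
    m = mean(g, p)
    return sum(px * (gx - m) ** 2 for gx, px in zip(g, p))


def bernoulli(theta):
    """Probability vector of Bernoulli(theta) on the sample space {0, 1}."""
    return [1.0 - theta, theta]


def chapman_robbins_bound(g, family, theta, candidates):
    """Largest ratio (E_theta'[g] - E_theta[g])^2 / chi^2 over the candidate theta'."""
    base = family(theta)
    best = 0.0
    for t in candidates:
        alt = family(t)
        d = chi2_divergence(alt, base)
        if d == 0.0:
            continue  # theta' gives the same distribution as theta
        best = max(best, (mean(g, alt) - mean(g, base)) ** 2 / d)
    return best


g_hat = [0.0, 1.0]                               # estimator g(x) = x
theta = 0.3
candidates = [k / 100 for k in range(1, 100)]    # grid of theta' values (assumed)
bound = chapman_robbins_bound(g_hat, bernoulli, theta, candidates)
var = variance(g_hat, bernoulli(theta))
print(bound, var)
```

In this particular family the ratio inside the supremum equals $\theta(1-\theta)$ for every $\theta' \neq \theta$, so the bound coincides with the variance of $\hat g$ up to floating-point error.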

A generalization to the multivariable case is:

{{Math theorem

| math_statement = Given any multivariate random variable $\hat g: \Omega \to \R^m$ whose covariance matrix $\operatorname{Cov}_\theta[\hat g]$ is nonsingular, and any $\theta, \theta' \in\Theta$,

$\chi^2(\mu_{\theta'} ; \mu_\theta) \geq
(E_{\theta'}[\hat g] - E_{\theta}[\hat g])^T \operatorname{Cov}_\theta[\hat g]^{-1} (E_{\theta'}[\hat g] - E_{\theta}[\hat g])$.

}}

Proof

By the variational representation of chi-squared divergence,{{Cite web |last=Polyanskiy |first=Yury |date=2017 |title=Lecture notes on information theory, chapter 29, ECE563 (UIUC) |url=https://people.lids.mit.edu/yp/homepage/data/LN_stats.pdf |url-status=live |archive-url=https://web.archive.org/web/20220524014051/https://people.lids.mit.edu/yp/homepage/data/LN_stats.pdf |archive-date=2022-05-24 |access-date=2022-05-24 |website=Lecture notes on information theory}}

$\chi^2(P; Q) = \sup_g \frac{(E_P[g]-E_Q[g])^2}{\operatorname{Var}_Q[g]}$.

Plugging in $g = \hat g$, $P = \mu_{\theta'}$, $Q = \mu_\theta$ gives

$\chi^2(\mu_{\theta'}; \mu_\theta) \geq \frac{(E_{\theta'}[\hat g]-E_\theta[\hat g])^2}{\operatorname{Var}_\theta[\hat g]}$.

Exchanging the variance and the divergence in this inequality and taking the supremum over $\theta' \neq \theta$ yields the scalar case.

For the multivariate case, define $h = \sum_{i=1}^m v_i \hat g_i$ for any nonzero $v \in \R^m$. Plugging $g = h$ into the variational representation gives

$\chi^2(\mu_{\theta'}; \mu_\theta) \geq \frac{(E_{\theta'}[h]-E_\theta[h])^2}{\operatorname{Var}_\theta[h]} = \frac{\langle v, E_{\theta'}[\hat g]-E_\theta[\hat g]\rangle^2}{v^T \operatorname{Cov}_\theta[\hat g] v}$.

Taking the supremum over nonzero $v \in\R^m$ and using the linear algebra fact that $\sup_{v\neq 0} \frac{v^T ww^T v}{v^T M v} = w^T M^{-1}w$ for positive-definite $M$ yields the multivariate case.
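The linear algebra fact used for the multivariate case can be checked with a Cauchy–Schwarz argument. The following is a sketch under the assumption that $M$ is symmetric positive definite, so that $M^{1/2}$ and $M^{-1}$ exist; the substitution $u = M^{1/2} v$ is introduced here for illustration only.

```latex
\begin{align*}
\frac{v^T w w^T v}{v^T M v}
  &= \frac{\langle u,\, M^{-1/2} w \rangle^2}{\|u\|^2}
  && \text{where } u = M^{1/2} v \\
  &\leq \|M^{-1/2} w\|^2 = w^T M^{-1} w
  && \text{(Cauchy--Schwarz)}
\end{align*}
% Equality holds when u is proportional to M^{-1/2} w, i.e. v = M^{-1} w,
% so the supremum over nonzero v is attained and equals w^T M^{-1} w.
```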

Relation to Cramér–Rao bound

Usually, $\Omega = \mathcal X^n$ is the sample space of $n$ independent draws of an $\mathcal X$-valued random variable $X$ with distribution $\lambda_\theta$, taken from a family of probability distributions parameterized by $\theta \in \Theta \subseteq \mathbb R^m$; $\mu_\theta = \lambda_\theta^{\otimes n}$ is its $n$-fold product measure, and $\hat g : \mathcal X^n \to \Theta$ is an estimator of $\theta$. Then, for $m=1$, the expression inside the supremum in the Chapman–Robbins bound converges to the Cramér–Rao bound of $\hat g$ as $\theta' \to \theta$, assuming the regularity conditions of the Cramér–Rao bound hold. Because the supremum is at least as large as this limit, the Chapman–Robbins bound is always at least as tight as the Cramér–Rao bound when both exist; in many cases, it is substantially tighter.
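The limit $\theta' \to \theta$ can be illustrated numerically. The sketch below assumes a normal location family $\lambda_\theta = N(\theta, \sigma^2)$ with known $\sigma$ and an unbiased estimator of $\theta$, so that $E_{\theta'}[\hat g] - E_\theta[\hat g] = \theta' - \theta$. It uses the closed form $\chi^2(\mu_{\theta'}; \mu_\theta) = \exp(n(\theta'-\theta)^2/\sigma^2) - 1$ for the $n$-fold product measure. The family, the parameter values and the function names are illustrative assumptions.

```python
import math


def cramer_rao_bound(sigma, n):
    """Cramér–Rao bound sigma^2 / n for unbiased estimation of a normal mean."""
    return sigma ** 2 / n


def chapman_robbins_ratio(delta, sigma, n):
    """Ratio inside the supremum for theta' = theta + delta and an unbiased estimator.

    For n i.i.d. N(theta, sigma^2) draws the chi-squared divergence between the
    product measures is exp(n * delta^2 / sigma^2) - 1; expm1 keeps the
    computation accurate for small delta.
    """
    return delta ** 2 / math.expm1(n * delta ** 2 / sigma ** 2)


sigma, n = 2.0, 5                           # assumed values for illustration
deltas = [10.0 ** -k for k in range(6)]     # theta' - theta shrinking towards 0
ratios = [chapman_robbins_ratio(d, sigma, n) for d in deltas]
for d, r in zip(deltas, ratios):
    print(f"delta = {d:g}: ratio = {r:.10f}")
print("Cramér–Rao bound:", cramer_rao_bound(sigma, n))
```

For this family the ratio increases towards $\sigma^2/n$ as $\theta' \to \theta$, so the supremum is approached only in the limit and the two bounds coincide.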

The Chapman–Robbins bound also holds under much weaker regularity conditions. For example, no assumption is made regarding differentiability of the probability density function $p(x; \theta)$ of $\lambda_\theta$. When $p(x; \theta)$ is non-differentiable, the Fisher information is not defined, and hence the Cramér–Rao bound does not exist.
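A standard case in which the Cramér–Rao regularity conditions fail is the uniform family on $[0, \theta]$, whose support depends on $\theta$. The sketch below assumes $n$ independent draws and an unbiased estimator of $\theta$. It uses $\chi^2(\mu_{\theta'}; \mu_\theta) = (\theta/\theta')^n - 1$ for $0 < \theta' < \theta$ (the divergence is infinite for $\theta' > \theta$) and approximates the supremum on a grid. The family, the grid size and the function names are illustrative assumptions.

```python
def chi2_uniform(theta_alt, theta, n):
    """chi^2 divergence of Uniform[0, theta_alt]^n from Uniform[0, theta]^n."""
    if theta_alt > theta:
        return float("inf")  # mass outside the support of the reference measure
    return (theta / theta_alt) ** n - 1.0


def chapman_robbins_uniform(theta, n, grid=10_000):
    """Grid approximation of the bound for an unbiased estimator of theta."""
    best = 0.0
    for k in range(1, grid):
        t = theta * k / grid                  # candidate theta' in (0, theta)
        ratio = (theta - t) ** 2 / chi2_uniform(t, theta, n)
        best = max(best, ratio)
    return best


def variance_scaled_max(theta, n):
    """Variance of the unbiased estimator (n + 1) / n * max(X_1, ..., X_n)."""
    return theta ** 2 / (n * (n + 2))


theta = 1.0
for n in (1, 2, 5):
    print(n, chapman_robbins_uniform(theta, n), variance_scaled_max(theta, n))
```

For $n = 1$ the ratio reduces to $\theta'(\theta - \theta')$, which is maximized at $\theta' = \theta/2$ and gives the bound $\theta^2/4$, below the variance $\theta^2/3$ of the unbiased estimator $2X$.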
