Bernstein inequalities (probability theory)

{{Short description|Inequalities in probability theory}}

In probability theory, Bernstein inequalities give bounds on the probability that the sum of random variables deviates from its mean. In the simplest case, let X_1, \ldots, X_n be independent random variables taking the values +1 and −1 with probability 1/2 each (this distribution is also known as the Rademacher distribution). Then, for every positive \varepsilon,

:\mathbb{P}\left (\left|\frac{1}{n}\sum_{i=1}^n X_i\right| > \varepsilon \right ) \leq 2\exp \left (-\frac{n\varepsilon^2}{2(1+\frac{\varepsilon}{3})} \right).
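As a numerical illustration, the following minimal sketch compares the right-hand side of the bound above with the exact tail probability, which for Rademacher variables reduces to a binomial sum. The function names are hypothetical and chosen only for this example.

```python
import math


def bernstein_rademacher_bound(n, eps):
    """Right-hand side of the two-sided bound for the mean of n Rademacher variables."""
    if n <= 0 or eps <= 0:
        raise ValueError("n and eps must be positive")
    return 2.0 * math.exp(-n * eps**2 / (2.0 * (1.0 + eps / 3.0)))


def exact_rademacher_tail(n, eps):
    """Exact P(|mean| > eps); the sum equals 2*k - n when k of the variables are +1."""
    favourable = sum(
        math.comb(n, k) for k in range(n + 1) if abs(2 * k - n) > n * eps
    )
    return favourable / 2**n


if __name__ == "__main__":
    # Print n, the exact tail probability and the Bernstein bound side by side.
    for n in (10, 100, 1000):
        print(n, exact_rademacher_tail(n, 0.2), bernstein_rademacher_bound(n, 0.2))
```

For small n or small \varepsilon the bound can exceed 1 and is then uninformative; it becomes useful as n\varepsilon^2 grows.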

Bernstein inequalities were proven and published by Sergei Bernstein in the 1920s and 1930s. (S. N. Bernstein, "On a modification of Chebyshev's inequality and of the error formula of Laplace", vol. 4, #5 (original publication: Ann. Sci. Inst. Sav. Ukraine, Sect. Math. 1, 1924); {{cite journal |last=Bernstein |first=S. N. |year=1937 |title=Об определенных модификациях неравенства Чебышева |trans-title=On certain modifications of Chebyshev's inequality |journal=Doklady Akademii Nauk SSSR |volume=17 |issue=6 |pages=275–277}}; S. N. Bernstein, "Theory of Probability" (Russian), Moscow, 1927; J. V. Uspensky, "Introduction to Mathematical Probability", McGraw-Hill Book Company, 1937.) Later, these inequalities were rediscovered several times in various forms. Thus, special cases of the Bernstein inequalities are also known as the Chernoff bound, Hoeffding's inequality and Azuma's inequality.

The martingale case of the Bernstein inequality is known as Freedman's inequality {{cite journal |title= On tail probabilities for martingales| first1=D.A. |last1=Freedman | journal=Ann. Probab. | year=1975| volume=3 | pages= 100–118| doi=10.1214/aop/1176996452 }} and its refinement is known as Hoeffding's inequality.{{cite journal |title=Hoeffding's inequality for supermartingales| first1=X. |last1=Fan| first2=I. |last2=Grama | first3=Q. |last3=Liu | journal=Stochastic Process. Appl. | year=2012| volume=122 | issue=10 | pages=3545–3559 | doi=10.1016/j.spa.2012.06.009 | arxiv=1109.4359 }}

Some of the inequalities

1. Let X_1, \ldots, X_n be independent zero-mean random variables. Suppose that |X_i|\leq M almost surely, for all i. Then, for all positive t,

:\mathbb{P} \left (\sum_{i=1}^n X_i \geq t \right ) \leq \exp \left ( -\frac{\tfrac{1}{2} t^2}{\sum_{i = 1}^n \mathbb{E} \left[X_i^2 \right ]+\tfrac{1}{3} Mt} \right ).
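The right-hand side of inequality 1 depends only on the sum of the second moments, the bound M and the level t, so it is straightforward to evaluate. The sketch below is one way to do so; the function name and argument layout are assumptions made for this example.

```python
import math


def bernstein_bounded_tail(second_moments, M, t):
    """Upper bound on P(sum of X_i >= t) for independent zero-mean X_i with |X_i| <= M.

    second_moments holds E[X_i^2] for each i (hypothetical argument layout).
    """
    if M < 0 or t <= 0:
        raise ValueError("M must be non-negative and t must be positive")
    variance_sum = sum(second_moments)
    return math.exp(-0.5 * t**2 / (variance_sum + M * t / 3.0))


if __name__ == "__main__":
    # 100 Rademacher variables: E[X_i^2] = 1 and |X_i| <= 1.
    print(bernstein_bounded_tail([1.0] * 100, M=1.0, t=30.0))
```

With n Rademacher variables, M = 1 and t = n\varepsilon, this is exactly half of the two-sided bound stated in the introduction, as expected from a union bound over the two tails.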

2. Let X_1, \ldots, X_n be independent zero-mean random variables. Suppose that for some positive real L and every integer k \geq 2,

: \mathbb{E} \left[ \left |X_i^k \right |\right ] \leq \frac{1}{2} \mathbb{E} \left[X_i^2\right] L^{k-2} k!

Then

:\mathbb{P} \left (\sum_{i=1}^n X_i \geq 2t \sqrt{\sum_{i=1}^n \mathbb{E} \left [X_i^2 \right ]} \right ) < \exp(-t^2), \qquad \text{for}\quad 0 \leq t \leq \frac{1}{2L}\sqrt{\sum_{i=1}^n \mathbb{E} \left[X_i^2\right ]}.
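Inequality 2 is stated in terms of a rescaled level t that must lie in a restricted range. The following sketch, with a hypothetical function name, converts a given t into the deviation threshold and the corresponding bound, and rejects values of t outside the admissible range.

```python
import math


def bernstein_moment_bound(second_moments, L, t):
    """Return (threshold, bound) such that P(sum of X_i >= threshold) < bound.

    second_moments holds E[X_i^2]; L is the constant from the moment condition.
    Both argument names are assumptions made for this illustration.
    """
    if L <= 0:
        raise ValueError("L must be positive")
    sigma = math.sqrt(sum(second_moments))
    t_max = sigma / (2.0 * L)  # upper end of the admissible range for t
    if not 0.0 <= t <= t_max:
        raise ValueError(f"t must lie in [0, {t_max}]")
    return 2.0 * t * sigma, math.exp(-t * t)


if __name__ == "__main__":
    print(bernstein_moment_bound([1.0] * 100, L=0.5, t=2.0))
```

If the variables are bounded, |X_i| \leq M, the moment condition holds with L = M/3, because \mathbb{E}|X_i|^k \leq M^{k-2}\mathbb{E}[X_i^2] and 3^{k-2} \leq k!/2 for every k \geq 2.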

3. Let X_1, \ldots, X_n be independent zero-mean random variables. Suppose that

: \mathbb{E} \left[ \left |X_i^k \right |\right ] \leq \frac{k!}{4!} \left(\frac{L}{5}\right)^{k-4}

for all integers k \geq 4. Denote

: A_k = \sum_{i=1}^n \mathbb{E} \left [ X_i^k\right ].

Then,

: \mathbb{P} \left( \left| \sum_{j=1}^n X_j - \frac{A_3 t^2}{3A_2} \right|\geq \sqrt{2A_2} \, t \left[ 1 + \frac{A_4 t^2}{6 A_2^2} \right] \right) < 2 \exp (- t^2), \qquad \text{for} \quad 0 < t \leq \frac{5 \sqrt{2A_2}}{4L}.
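Inequality 3 can be read as a confidence interval for the sum, with a centre shifted by the third moments and a half-width corrected by the fourth moments. The sketch below evaluates these quantities from A_2, A_3 and A_4; the function name and return layout are hypothetical.

```python
import math


def bernstein_third_interval(A2, A3, A4, L, t):
    """Return (centre, half_width, bound) for inequality 3.

    The sum deviates from centre by at least half_width with probability < bound.
    A2, A3, A4 are the moment sums A_k; L is the constant from the moment condition.
    """
    if A2 <= 0 or L <= 0:
        raise ValueError("A2 and L must be positive")
    t_max = 5.0 * math.sqrt(2.0 * A2) / (4.0 * L)  # admissible range is 0 < t <= t_max
    if not 0.0 < t <= t_max:
        raise ValueError(f"t must lie in (0, {t_max}]")
    centre = A3 * t**2 / (3.0 * A2)
    half_width = math.sqrt(2.0 * A2) * t * (1.0 + A4 * t**2 / (6.0 * A2**2))
    return centre, half_width, 2.0 * math.exp(-t * t)


if __name__ == "__main__":
    # 100 Rademacher variables: A2 = 100, A3 = 0, A4 = 100; L = 5 is an assumed constant.
    print(bernstein_third_interval(100.0, 0.0, 100.0, L=5.0, t=1.0))
```

For symmetric variables A_3 = 0, so the interval is centred at zero and only the fourth-moment correction to the half-width remains.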

4. Bernstein also proved generalizations of the inequalities above to weakly dependent random variables. For example, inequality (2) can be extended as follows. Let X_1, \ldots, X_n be possibly non-independent random variables. Suppose that for all integers i>0,

:\begin{align} \mathbb{E} \left. \left [ X_i \right | X_1, \ldots, X_{i-1} \right ] &= 0, \\ \mathbb{E} \left. \left [ X_i^2 \right | X_1, \ldots, X_{i-1} \right ] &\leq R_i \mathbb{E} \left [ X_i^2 \right ], \\ \mathbb{E} \left. \left [ X_i^k \right | X_1, \ldots, X_{i-1} \right ] &\leq \tfrac{1}{2} \mathbb{E} \left. \left[ X_i^2 \right | X_1, \ldots, X_{i-1} \right ] L^{k-2} k!. \end{align}

Then

:\mathbb{P} \left( \sum_{i=1}^n X_i \geq 2t \sqrt{\sum_{i=1}^n R_i \mathbb{E}\left [ X_i^2 \right ]} \right) < \exp(-t^2), \qquad \text{for}\quad 0 < t \leq \frac{1}{2L} \sqrt{\sum_{i=1}^n R_i \mathbb{E} \left [X_i^2 \right ]}.

More general results for martingales can be found in Fan et al. (2015).{{cite journal |title=Exponential inequalities for martingales with applications| first1=X. |last1=Fan| first2=I. |last2=Grama | first3=Q. |last3=Liu | journal=Electronic Journal of Probability |publisher=Electron. J. Probab. 20| year=2015| volume=20 | pages=1–22| url=http://projecteuclid.org/euclid.ejp/1465067107 | doi=10.1214/EJP.v20-3496|arxiv=1311.6273| s2cid=119713171 }}

Proofs

The proofs are based on an application of Markov's inequality to the random variable

: \exp \left ( \lambda \sum_{j=1}^n X_j \right ),

for a suitable choice of the parameter \lambda > 0.
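For independent summands, the Markov step can be written out explicitly. In the sketch below, S = \sum_{j=1}^n X_j and t > 0 are notation introduced here for illustration.

```latex
\mathbb{P}(S \geq t)
  = \mathbb{P}\left( e^{\lambda S} \geq e^{\lambda t} \right)
  \leq e^{-\lambda t} \, \mathbb{E}\left[ e^{\lambda S} \right]
  = e^{-\lambda t} \prod_{j=1}^n \mathbb{E}\left[ e^{\lambda X_j} \right].
```

The first equality holds because x \mapsto e^{\lambda x} is increasing for \lambda > 0, the inequality is Markov's inequality, and the product form uses independence. The remaining work in each proof is to bound the factors \mathbb{E}[e^{\lambda X_j}] using the boundedness or moment assumptions and then to choose \lambda.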

Generalizations

The Bernstein inequality can be generalized to Gaussian random matrices. Let G = g^H A g + 2 \operatorname{Re}(g^H a) be a scalar, where A is a complex Hermitian matrix and a is a complex vector of size N. The vector g \sim \mathcal{CN}(0,I) is a Gaussian vector of size N. Then for any \sigma \geq 0, we have

:\mathbb{P} \left( G \leq \operatorname{tr}(A) - \sqrt{2\sigma}\sqrt{\Vert \operatorname{vec}(A) \Vert^2 + 2 \Vert a \Vert^2 } - \sigma s^-(A) \right) < \exp(-\sigma),

where \operatorname{vec} is the vectorization operation, s^- (A) = \max(-\lambda_{\max}(A),0), and \lambda_{\max}(A) is the largest eigenvalue of A. For a proof, see {{cite arXiv |title=A Bernstein-type inequality for stochastic processes of quadratic forms of Gaussian variables| first1=Bechar |last1= Ikhlef | year=2009| class=math.ST | eprint=0909.3595 }}. Another similar inequality is formulated as

:\mathbb{P} \left( G \geq \operatorname{tr}(A) + \sqrt{2\sigma}\sqrt{\Vert \operatorname{vec}(A) \Vert^2 + 2 \Vert a \Vert^2 } + \sigma s^+(A) \right) < \exp(-\sigma),

where s^+(A) = \max(\lambda_{\max}(A),0).


References

{{reflist|25em}}

(according to: S.N.Bernstein, Collected Works, Nauka, 1964)

A modern translation of some of these results can also be found in {{SpringerEOM| title=Bernstein inequality | id=Bernstein_inequality | oldid=15217 | first=A.V. | last=Prokhorov | first2=N.P. | last2=Korneichuk | first3=V.P. | last3=Motornyi }}

{{DEFAULTSORT:Bernstein Inequalities (Probability Theory)}}

Category:Probabilistic inequalities