Log sum inequality
The log sum inequality is used for proving theorems in information theory.
Statement
Let <math>a_1,\ldots,a_n</math> and <math>b_1,\ldots,b_n</math> be nonnegative numbers. Denote the sum of all <math>a_i</math>s by <math>a</math> and the sum of all <math>b_i</math>s by <math>b</math>. The log sum inequality states that
:<math>\sum_{i=1}^n a_i\log\frac{a_i}{b_i}\geq a\log\frac{a}{b},</math>
with equality if and only if <math>\frac{a_i}{b_i}</math> are equal for all <math>i</math>, in other words <math>a_i=cb_i</math> for all <math>i</math>.{{sfnp|Cover|Thomas|1991|p=29}}
(Take <math>a_i\log\frac{a_i}{b_i}</math> to be <math>0</math> if <math>a_i=0</math>, and <math>\infty</math> if <math>a_i>0, b_i=0</math>. These are the limiting values obtained as the relevant number tends to <math>0</math>.){{sfnp|Cover|Thomas|1991|p=29}}
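The statement and its boundary conventions can be checked numerically. The following sketch is illustrative only (the helper names and sample values are not from the cited sources):

```python
import math

def log_sum_lhs(a, b):
    """Left side: sum of a_i * log(a_i / b_i), using the stated conventions."""
    total = 0.0
    for ai, bi in zip(a, b):
        if ai == 0:
            continue  # convention: 0 * log(0 / b_i) = 0
        if bi == 0:
            return math.inf  # convention: a_i * log(a_i / 0) = infinity for a_i > 0
        total += ai * math.log(ai / bi)
    return total

def log_sum_rhs(a, b):
    """Right side: a * log(a / b), where a = sum(a_i) and b = sum(b_i)."""
    return log_sum_lhs([sum(a)], [sum(b)])

# The inequality holds for arbitrary nonnegative inputs:
a = [1.0, 2.0, 3.0]
b = [0.5, 1.0, 4.0]
assert log_sum_lhs(a, b) >= log_sum_rhs(a, b)

# Equality holds exactly when a_i = c * b_i for all i:
c = 2.0
a_prop = [c * bi for bi in b]
assert math.isclose(log_sum_lhs(a_prop, b), log_sum_rhs(a_prop, b))
```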
Proof
Notice that after setting <math>f(x)=x\log x</math> we have
:<math>
\begin{align}
\sum_{i=1}^n a_i\log\frac{a_i}{b_i} & {} = \sum_{i=1}^n b_i f\left(\frac{a_i}{b_i}\right)
= b\sum_{i=1}^n \frac{b_i}{b} f\left(\frac{a_i}{b_i}\right) \\
& {} \geq b f\left(\sum_{i=1}^n \frac{b_i}{b}\frac{a_i}{b_i}\right) = b f\left(\frac{1}{b}\sum_{i=1}^n a_i\right)
= b f\left(\frac{a}{b}\right) \\
& {} = a\log\frac{a}{b},
\end{align}
</math>
where the inequality follows from Jensen's inequality since <math>\frac{b_i}{b}\geq 0</math>, <math>\sum_{i=1}^n \frac{b_i}{b}=1</math>, and <math>f</math> is convex.{{sfnp|Cover|Thomas|1991|p=29}}
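The Jensen step in the derivation can be illustrated numerically: the weights <math>b_i/b</math> sum to one, and the weighted average of <math>f(a_i/b_i)</math> dominates <math>f</math> of the weighted average. A minimal sketch with illustrative sample values:

```python
import math

def f(x):
    """The convex function f(x) = x log x used in the proof."""
    return x * math.log(x)

a = [1.0, 2.0, 3.0]
b = [0.5, 1.0, 4.0]
b_total = sum(b)

# Jensen's inequality: for weights w_i >= 0 with sum(w_i) = 1 and convex f,
#   sum_i w_i f(x_i) >= f(sum_i w_i x_i)
w = [bi / b_total for bi in b]          # weights b_i / b
x = [ai / bi for ai, bi in zip(a, b)]   # points a_i / b_i
lhs = sum(wi * f(xi) for wi, xi in zip(w, x))
rhs = f(sum(wi * xi for wi, xi in zip(w, x)))  # note: sum_i w_i x_i = a / b
assert lhs >= rhs
```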
Generalizations
The inequality remains valid for <math>n=\infty</math> provided that <math>a<\infty</math> and <math>b<\infty</math>.{{cn|date=July 2020}}
The proof above holds for any function <math>g</math> such that <math>f(x)=xg(x)</math> is convex, such as all continuous non-decreasing functions. Generalizations to non-decreasing functions other than the logarithm are given in Csiszár, 2004.
Another generalization is due to Dannan, Neff and Thiel, who showed that if <math>a_1, a_2, \ldots, a_n</math> and <math>b_1, b_2, \ldots, b_n</math> are positive real numbers with <math>a_1+a_2+\cdots+a_n=a</math> and <math>b_1+b_2+\cdots+b_n=b</math>, and <math>k\geq 0</math>, then <math>\sum_{i=1}^n a_i\log\left(\frac{a_i}{b_i}+k\right)\geq a\log\left(\frac{a}{b}+k\right)</math>.{{cite journal |last1=Dannan |first1=F. M. |last2=Neff |first2=P. |last3=Thiel |first3=C. |title=On the sum of squared logarithms inequality and related inequalities |journal=Journal of Mathematical Inequalities |date=2016 |volume=10 |issue=1 |pages=1–17 |doi=10.7153/jmi-10-01 |s2cid=23953925 |url=http://files.ele-math.com/articles/jmi-10-01.pdf |access-date=12 January 2023}}
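This generalization reduces to the ordinary log sum inequality at <math>k=0</math>, which can be observed numerically. The following sketch uses illustrative helper names and sample values, not code from the cited paper:

```python
import math

def gen_lhs(a, b, k):
    """Left side of the generalized inequality: sum of a_i * log(a_i / b_i + k)."""
    return sum(ai * math.log(ai / bi + k) for ai, bi in zip(a, b))

def gen_rhs(a, b, k):
    """Right side: a * log(a / b + k), where a = sum(a_i) and b = sum(b_i)."""
    return sum(a) * math.log(sum(a) / sum(b) + k)

a = [1.0, 2.0, 3.0]
b = [0.5, 1.0, 4.0]
# The inequality holds for every k >= 0; k = 0 recovers the log sum inequality.
for k in (0.0, 0.5, 1.0, 10.0):
    assert gen_lhs(a, b, k) >= gen_rhs(a, b, k)
```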
Applications
The log sum inequality can be used to prove inequalities in information theory. Gibbs' inequality states that the Kullback–Leibler divergence is non-negative, and equal to zero precisely if its arguments are equal.{{sfnp|MacKay|2003|p=34}} One proof uses the log sum inequality.
:{| class="toccolours collapsible collapsed" width="80%" style="text-align:left"
!Proof{{sfnp|Cover|Thomas|1991|p=29}}
|-
|Let <math>P=(p_1,\ldots,p_n)</math> and <math>Q=(q_1,\ldots,q_n)</math> be pmfs. In the log sum inequality, substitute <math>a_i=p_i</math> and <math>b_i=q_i</math> to get
:<math>D_{\mathrm{KL}}(P\|Q)=\sum_{i=1}^n p_i\log\frac{p_i}{q_i}\geq\left(\sum_{i=1}^n p_i\right)\log\frac{\sum_{i=1}^n p_i}{\sum_{i=1}^n q_i}=1\log\frac{1}{1}=0,</math>
with equality if and only if <math>p_i=q_i</math> for all <math>i</math> (as both <math>P</math> and <math>Q</math> sum to 1).
|}
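Gibbs' inequality is easy to confirm numerically for sample distributions. A minimal sketch, with an illustrative helper function (not from the cited sources):

```python
import math

def kl_divergence(p, q):
    """D(P || Q) = sum of p_i * log(p_i / q_i), with 0 * log 0 treated as 0.

    Assumes p and q are pmfs over the same support, and q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.2, 0.5, 0.3]
q = [0.1, 0.6, 0.3]
assert kl_divergence(p, q) >= 0                 # Gibbs' inequality
assert math.isclose(kl_divergence(p, p), 0.0)   # equality iff P = Q
```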
The log sum inequality can also be used to prove the convexity of the Kullback–Leibler divergence.{{sfnp|Cover|Thomas|1991|p=30}}
Notes
{{reflist}}
References
- {{cite book |first1=Thomas M. |last1=Cover |first2=Joy A. |last2=Thomas |title=Elements of Information Theory |publisher=Wiley |location=Hoboken, New Jersey |isbn=978-0-471-24195-9|year=1991 }}
- {{cite journal
|first1=I. |last1=Csiszár |authorlink1=Imre Csiszár
|first2=P. |last2=Shields
|year=2004
|title=Information Theory and Statistics: A Tutorial
|journal=Foundations and Trends in Communications and Information Theory
|volume=1 |issue=4 |pages=417–528
|doi=10.1561/0100000004
|url=http://www.renyi.hu/~csiszar/Publications/Information_Theory_and_Statistics:_A_Tutorial.pdf |access-date=2009-06-14
}}
- T.S. Han, K. Kobayashi, Mathematics of information and coding. American Mathematical Society, 2001. {{isbn|0-8218-0534-7}}.
- Information Theory course materials, Utah State University [http://ocw.usu.edu/Electrical_and_Computer_Engineering/Information_Theory/lecture3.pdf]. Retrieved on 2009-06-14.
- {{cite book |last=MacKay|first=David J.C. |author-link=David J. C. MacKay|url=http://www.inference.phy.cam.ac.uk/mackay/itila/book.html|title=Information Theory, Inference, and Learning Algorithms|publisher=Cambridge University Press|year=2003|isbn=0-521-64298-1}}