Log sum inequality

The log sum inequality is used for proving theorems in information theory.

Statement

Let a_1,\ldots,a_n and b_1,\ldots,b_n be nonnegative numbers. Denote the sum of all the a_i by a and the sum of all the b_i by b. The log sum inequality states that

:\sum_{i=1}^n a_i\log\frac{a_i}{b_i}\geq a\log\frac{a}{b},

with equality if and only if the ratios \frac{a_i}{b_i} are equal for all i, in other words a_i = c b_i for all i for some constant c.{{sfnp|Cover|Thomas|1991|p=29}}

(Take a_i\log \frac{a_i}{b_i} to be 0 if a_i=0 and \infty if a_i>0, b_i=0. These are the limiting values obtained as the relevant number tends to 0.){{sfnp|Cover|Thomas|1991|p=29}}
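As a concrete illustration, the inequality and the boundary conventions above can be checked numerically. A minimal Python sketch (the vectors a and b and the helper names are arbitrary examples, not from the cited sources):

<syntaxhighlight lang="python">
import math

def log_sum_lhs(a, b):
    """Left-hand side: sum of a_i*log(a_i/b_i), with the conventions above."""
    total = 0.0
    for ai, bi in zip(a, b):
        if ai == 0:
            continue               # a_i log(a_i/b_i) is taken to be 0 when a_i = 0
        if bi == 0:
            return math.inf        # taken to be infinity when a_i > 0 and b_i = 0
        total += ai * math.log(ai / bi)
    return total

def log_sum_rhs(a, b):
    """Right-hand side: a*log(a/b) with a = sum(a_i), b = sum(b_i)."""
    a_tot, b_tot = sum(a), sum(b)
    if a_tot == 0:
        return 0.0
    if b_tot == 0:
        return math.inf
    return a_tot * math.log(a_tot / b_tot)

a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 4.0]
print(log_sum_lhs(a, b), ">=", log_sum_rhs(a, b))    # about -0.17 >= -0.92

# Equality case: a_i = c*b_i for every i gives equal sides
c = 0.5
a_eq = [c * bi for bi in b]
print(log_sum_lhs(a_eq, b), "==", log_sum_rhs(a_eq, b))
</syntaxhighlight>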

Proof

Notice that after setting f(x)=x\log x we have

:\begin{align}
\sum_{i=1}^n a_i\log\frac{a_i}{b_i} & {} = \sum_{i=1}^n b_i f\left(\frac{a_i}{b_i}\right) = b\sum_{i=1}^n \frac{b_i}{b} f\left(\frac{a_i}{b_i}\right) \\
& {} \geq b f\left(\sum_{i=1}^n \frac{b_i}{b}\frac{a_i}{b_i}\right) = b f\left(\frac{1}{b}\sum_{i=1}^n a_i\right) = b f\left(\frac{a}{b}\right) \\
& {} = a\log\frac{a}{b},
\end{align}

where the inequality follows from Jensen's inequality since \frac{b_i}{b}\geq 0, \sum_{i=1}^n\frac{b_i}{b}= 1, and f is convex (indeed f''(x)=\tfrac{1}{x}>0 for x>0).{{sfnp|Cover|Thomas|1991|p=29}}
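The Jensen step in the chain above can also be verified numerically. A small sketch, reusing the illustrative vectors from the example above (which are not part of the original presentation):

<syntaxhighlight lang="python">
import math

def f(x):
    return x * math.log(x)          # the convex function used in the proof

a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 4.0]
b_tot = sum(b)
weights = [bi / b_tot for bi in b]  # nonnegative weights summing to 1

# Jensen's inequality for convex f: weighted mean of f >= f of weighted mean
lhs = b_tot * sum(w * f(ai / bi) for w, ai, bi in zip(weights, a, b))
rhs = b_tot * f(sum(w * (ai / bi) for w, ai, bi in zip(weights, a, b)))
print(lhs, ">=", rhs)               # matches sum a_i log(a_i/b_i) >= a log(a/b)
</syntaxhighlight>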

Generalizations

The inequality remains valid for n=\infty provided that a<\infty and b<\infty.{{cn|date=July 2020}}

The proof above holds for any function g such that f(x)=xg(x) is convex, the logarithm being one such choice. Generalizations to non-decreasing functions other than the logarithm are given in Csiszár and Shields (2004).
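For instance, taking g(x)=x (so that f(x)=x^2 is convex) in place of the logarithm yields \sum_{i=1}^n \frac{a_i^2}{b_i} \geq \frac{a^2}{b}. A minimal numerical sketch of this instance (the choice of g and the example vectors are illustrative, not from the cited sources):

<syntaxhighlight lang="python">
a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 4.0]

def g(x):
    return x                     # illustrative non-decreasing choice; x*g(x) = x**2 is convex

lhs = sum(ai * g(ai / bi) for ai, bi in zip(a, b))   # sum of a_i * g(a_i/b_i)
rhs = sum(a) * g(sum(a) / sum(b))                    # a * g(a/b)
print(lhs, ">=", rhs)            # 6.75 >= 36/7 ≈ 5.14
</syntaxhighlight>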

Another generalization is due to Dannan, Neff and Thiel, who showed that if a_1, a_2, \ldots, a_n and b_1, b_2, \ldots, b_n are positive real numbers with a_1 + a_2 + \cdots + a_n = a and b_1 + b_2 + \cdots + b_n = b, and k \geq 0, then \sum_{i=1}^n a_i \log\left( \frac{a_i}{b_i} +k \right) \geq a\log \left(\frac{a}{b}+k\right).{{cite journal |last1=Dannan |first1=F. M. |last2=Neff |first2=P. |last3=Thiel |first3=C. |title=On the sum of squared logarithms inequality and related inequalities |journal=Journal of Mathematical Inequalities |date=2016 |volume=10 |issue=1 |pages=1–17 |doi=10.7153/jmi-10-01 |s2cid=23953925 |url=http://files.ele-math.com/articles/jmi-10-01.pdf |access-date=12 January 2023}}
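A minimal numerical check of this generalization (the vectors and the values of k below are arbitrary illustrations, and the helper names are not from the cited paper):

<syntaxhighlight lang="python">
import math

def dnt_lhs(a, b, k):
    """Sum of a_i * log(a_i/b_i + k) for positive a_i, b_i and k >= 0."""
    return sum(ai * math.log(ai / bi + k) for ai, bi in zip(a, b))

def dnt_rhs(a, b, k):
    """a * log(a/b + k) with a = sum(a_i), b = sum(b_i)."""
    return sum(a) * math.log(sum(a) / sum(b) + k)

a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 4.0]
for k in (0.0, 0.5, 2.0):        # k = 0 recovers the ordinary log sum inequality
    print(k, dnt_lhs(a, b, k), ">=", dnt_rhs(a, b, k))
</syntaxhighlight>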

Applications

The log sum inequality can be used to prove inequalities in information theory. Gibbs' inequality states that the Kullback-Leibler divergence is non-negative, and equal to zero precisely if its arguments are equal.{{sfnp|MacKay|2003|p=34}} One proof uses the log sum inequality.

Proof{{sfnp|Cover|Thomas|1991|p=29}}

Let P=(p_i)_{i\in\mathbb{N}} and Q=(q_i)_{i\in\mathbb{N}} be probability mass functions. In the log sum inequality, substitute n=\infty, a_i=p_i and b_i=q_i to get

:\mathbb{D}_{\mathrm{KL}}(P\|Q) \equiv \sum_{i} p_i \log_2 \frac{p_i}{q_i} \geq 1\log\frac{1}{1} = 0

with equality if and only if p_i=q_i for all i (as both P and Q sum to 1).
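A minimal sketch of this application (the finite-support pmfs below are arbitrary examples, not from the cited sources):

<syntaxhighlight lang="python">
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P||Q) in bits, with 0*log(0/q_i) taken as 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]
print(kl_divergence(p, q))   # 0.25 > 0, since P != Q
print(kl_divergence(p, p))   # 0.0, the equality case p_i = q_i
</syntaxhighlight>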

The log sum inequality can also be used to prove convexity of the Kullback-Leibler divergence.{{sfnp|Cover|Thomas|1991|p=30}}

Notes

{{reflist}}

References

  • {{cite book |first1=Thomas M. |last1=Cover |first2=Joy A. |last2=Thomas |title=Elements of Information Theory |publisher=Wiley |location=Hoboken, New Jersey |isbn=978-0-471-24195-9|year=1991 }}
  • {{cite journal |first1=I. |last1=Csiszár |authorlink1=Imre Csiszár |first2=P. |last2=Shields |year=2004 |title=Information Theory and Statistics: A Tutorial |journal=Foundations and Trends in Communications and Information Theory |volume=1 |issue=4 |pages=417–528 |doi=10.1561/0100000004 |url=http://www.renyi.hu/~csiszar/Publications/Information_Theory_and_Statistics:_A_Tutorial.pdf |accessdate=2009-06-14}}

  • {{cite book |first1=T. S. |last1=Han |first2=K. |last2=Kobayashi |title=Mathematics of Information and Coding |publisher=American Mathematical Society |year=2001 |isbn=0-8218-0534-7}}
  • Information Theory course materials, Utah State University [http://ocw.usu.edu/Electrical_and_Computer_Engineering/Information_Theory/lecture3.pdf]. Retrieved on 2009-06-14.
  • {{cite book |last=MacKay|first=David J.C. |author-link=David J. C. MacKay|url=http://www.inference.phy.cam.ac.uk/mackay/itila/book.html|title=Information Theory, Inference, and Learning Algorithms|publisher=Cambridge University Press|year=2003|isbn=0-521-64298-1}}

Category:Inequalities (mathematics)

Category:Information theory

Category:Articles containing proofs