Log sum inequality
The log sum inequality is used for proving theorems in information theory.
Statement
Let <math>a_1,\ldots,a_n</math> and <math>b_1,\ldots,b_n</math> be nonnegative numbers. Denote the sum of all <math>a_i</math>s by <math>a</math> and the sum of all <math>b_i</math>s by <math>b</math>. The log sum inequality states that
:<math>\sum_{i=1}^n a_i\log\frac{a_i}{b_i}\geq a\log\frac{a}{b},</math>
with equality if and only if <math>\frac{a_i}{b_i}</math> are equal for all <math>i</math>, in other words <math>a_i=cb_i</math> for all <math>i</math>.{{sfnp|Cover|Thomas|1991|p=29}}
(Take <math>a_i\log\frac{a_i}{b_i}</math> to be <math>0</math> if <math>a_i=0</math>, and <math>\infty</math> if <math>a_i>0, b_i=0</math>. These are the limiting values obtained as the relevant number tends to <math>0</math>.){{sfnp|Cover|Thomas|1991|p=29}}
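The statement and its boundary conventions can be checked numerically. The following sketch is illustrative only (the helper names and sample values are not from the cited sources):

```python
import math

def log_sum_lhs(a, b):
    """Left side: sum of a_i * log(a_i / b_i), using the stated conventions."""
    total = 0.0
    for ai, bi in zip(a, b):
        if ai == 0:
            continue  # convention: 0 * log(0 / b_i) = 0
        if bi == 0:
            return math.inf  # convention: a_i * log(a_i / 0) = infinity for a_i > 0
        total += ai * math.log(ai / bi)
    return total

def log_sum_rhs(a, b):
    """Right side: a * log(a / b), where a = sum(a_i) and b = sum(b_i)."""
    return log_sum_lhs([sum(a)], [sum(b)])

# The inequality holds for arbitrary nonnegative inputs:
a = [1.0, 2.0, 3.0]
b = [0.5, 1.0, 4.0]
assert log_sum_lhs(a, b) >= log_sum_rhs(a, b)

# Equality holds exactly when a_i = c * b_i for all i:
c = 2.0
a_prop = [c * bi for bi in b]
assert math.isclose(log_sum_lhs(a_prop, b), log_sum_rhs(a_prop, b))
```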
Proof
Notice that after setting <math>f(x)=x\log x</math> we have
:<math>
\begin{align}
\sum_{i=1}^n a_i\log\frac{a_i}{b_i} & {} = \sum_{i=1}^n b_i f\left(\frac{a_i}{b_i}\right)
= b\sum_{i=1}^n \frac{b_i}{b} f\left(\frac{a_i}{b_i}\right) \\
& {} \geq b f\left(\sum_{i=1}^n \frac{b_i}{b}\frac{a_i}{b_i}\right) = b f\left(\frac{1}{b}\sum_{i=1}^n a_i\right)
= b f\left(\frac{a}{b}\right) \\
& {} = a\log\frac{a}{b},
\end{align}
</math>
where the inequality follows from Jensen's inequality since <math>\frac{b_i}{b}\geq 0</math>, <math>\sum_{i=1}^n \frac{b_i}{b}=1</math>, and <math>f</math> is convex.{{sfnp|Cover|Thomas|1991|p=29}}
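The Jensen step in the derivation can be illustrated numerically: the weights <math>b_i/b</math> sum to one, and the weighted average of <math>f(a_i/b_i)</math> dominates <math>f</math> of the weighted average. A minimal sketch with illustrative sample values:

```python
import math

def f(x):
    """The convex function f(x) = x log x used in the proof."""
    return x * math.log(x)

a = [1.0, 2.0, 3.0]
b = [0.5, 1.0, 4.0]
b_total = sum(b)

# Jensen's inequality: for weights w_i >= 0 with sum(w_i) = 1 and convex f,
#   sum_i w_i f(x_i) >= f(sum_i w_i x_i)
w = [bi / b_total for bi in b]          # weights b_i / b
x = [ai / bi for ai, bi in zip(a, b)]   # points a_i / b_i
lhs = sum(wi * f(xi) for wi, xi in zip(w, x))
rhs = f(sum(wi * xi for wi, xi in zip(w, x)))  # note: sum_i w_i x_i = a / b
assert lhs >= rhs
```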
Generalizations
The inequality remains valid for <math>n=\infty</math> provided that <math>a<\infty</math> and <math>b<\infty</math>.{{cn|date=July 2020}}
The proof above holds for any function <math>g</math> such that <math>f(x)=xg(x)</math> is convex, such as all continuous non-decreasing functions. Generalizations to non-decreasing functions other than the logarithm are given in Csiszár, 2004.
Another generalization is due to Dannan, Neff and Thiel, who showed that if <math>a_1, a_2, \ldots, a_n</math> and <math>b_1, b_2, \ldots, b_n</math> are positive real numbers with <math>a_1+a_2+\cdots+a_n=a</math> and <math>b_1+b_2+\cdots+b_n=b</math>, and <math>k\geq 0</math>, then <math>\sum_{i=1}^n a_i\log\left(\frac{a_i}{b_i}+k\right)\geq a\log\left(\frac{a}{b}+k\right)</math>.{{cite journal |last1=Dannan |first1=F. M. |last2=Neff |first2=P. |last3=Thiel |first3=C. |title=On the sum of squared logarithms inequality and related inequalities |journal=Journal of Mathematical Inequalities |date=2016 |volume=10 |issue=1 |pages=1–17 |doi=10.7153/jmi-10-01 |s2cid=23953925 |url=http://files.ele-math.com/articles/jmi-10-01.pdf |access-date=12 January 2023}}
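This generalization reduces to the ordinary log sum inequality at <math>k=0</math>, which can be observed numerically. The following sketch uses illustrative helper names and sample values, not code from the cited paper:

```python
import math

def gen_lhs(a, b, k):
    """Left side of the generalized inequality: sum of a_i * log(a_i / b_i + k)."""
    return sum(ai * math.log(ai / bi + k) for ai, bi in zip(a, b))

def gen_rhs(a, b, k):
    """Right side: a * log(a / b + k), where a = sum(a_i) and b = sum(b_i)."""
    return sum(a) * math.log(sum(a) / sum(b) + k)

a = [1.0, 2.0, 3.0]
b = [0.5, 1.0, 4.0]
# The inequality holds for every k >= 0; k = 0 recovers the log sum inequality.
for k in (0.0, 0.5, 1.0, 10.0):
    assert gen_lhs(a, b, k) >= gen_rhs(a, b, k)
```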
Applications
The log sum inequality can be used to prove inequalities in information theory. Gibbs' inequality states that the Kullback–Leibler divergence is non-negative, and equal to zero precisely if its arguments are equal.{{sfnp|MacKay|2003|p=34}} One proof uses the log sum inequality.
:{| class="toccolours collapsible collapsed" width="80%" style="text-align:left"
!Proof{{sfnp|Cover|Thomas|1991|p=29}}
|-
|Let <math>P=(p_1,\ldots,p_n)</math> and <math>Q=(q_1,\ldots,q_n)</math> be pmfs. In the log sum inequality, substitute <math>a_i=p_i</math> and <math>b_i=q_i</math> to get
:<math>D_{\mathrm{KL}}(P\|Q)=\sum_{i=1}^n p_i\log\frac{p_i}{q_i}\geq\left(\sum_{i=1}^n p_i\right)\log\frac{\sum_{i=1}^n p_i}{\sum_{i=1}^n q_i}=1\log\frac{1}{1}=0,</math>
with equality if and only if <math>p_i=q_i</math> for all <math>i</math> (as both <math>P</math> and <math>Q</math> sum to 1).
|}
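Gibbs' inequality is easy to confirm numerically for sample distributions. A minimal sketch, with an illustrative helper function (not from the cited sources):

```python
import math

def kl_divergence(p, q):
    """D(P || Q) = sum of p_i * log(p_i / q_i), with 0 * log 0 treated as 0.

    Assumes p and q are pmfs over the same support, and q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.2, 0.5, 0.3]
q = [0.1, 0.6, 0.3]
assert kl_divergence(p, q) >= 0                 # Gibbs' inequality
assert math.isclose(kl_divergence(p, p), 0.0)   # equality iff P = Q
```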
The log sum inequality can also be used to prove the convexity of the Kullback–Leibler divergence.{{sfnp|Cover|Thomas|1991|p=30}}
Notes
{{reflist}}
References
- {{cite book |first1=Thomas M. |last1=Cover |first2=Joy A. |last2=Thomas |title=Elements of Information Theory |publisher=Wiley |location=Hoboken, New Jersey |isbn=978-0-471-24195-9|year=1991 }}
- {{cite journal
|first1=I. |last1=Csiszár |authorlink1=Imre Csiszár
|first2=P. |last2=Shields
|year=2004
|title=Information Theory and Statistics: A Tutorial
|journal=Foundations and Trends in Communications and Information Theory
|volume=1 |issue=4 |pages=417–528
|doi=10.1561/0100000004
|url=http://www.renyi.hu/~csiszar/Publications/Information_Theory_and_Statistics:_A_Tutorial.pdf |access-date=2009-06-14
}}
- T.S. Han, K. Kobayashi, Mathematics of information and coding. American Mathematical Society, 2001. {{isbn|0-8218-0534-7}}.
- Information Theory course materials, Utah State University [http://ocw.usu.edu/Electrical_and_Computer_Engineering/Information_Theory/lecture3.pdf]. Retrieved on 2009-06-14.
- {{cite book |last=MacKay|first=David J.C. |author-link=David J. C. MacKay|url=http://www.inference.phy.cam.ac.uk/mackay/itila/book.html|title=Information Theory, Inference, and Learning Algorithms|publisher=Cambridge University Press|year=2003|isbn=0-521-64298-1}}