Weissman score
{{Short description|Performance metric for lossless compression}}
The Weissman score is a performance metric for lossless compression applications. It was developed by Tsachy Weissman, a professor at Stanford University, and Vinith Misra, a graduate student, at the request of producers for HBO's television series Silicon Valley, a television show about a fictional tech start-up working on a data compression algorithm.{{Cite web|url=https://spectrum.ieee.org/a-madefortv-compression-metric-moves-to-the-real-world|title=A Fictional Compression Metric Moves Into the Real World|last=Perry|first=Tekla|date=July 28, 2014|access-date=January 25, 2016}}{{cite web|url=https://spectrum.ieee.org/a-madefortv-compression-algorithm|title=A Made-For-TV Compression Algorithm|last=Perry|first=Tekla|date=July 25, 2014|access-date=January 25, 2016}}{{cite web|url=http://www.hollywoodreporter.com/live-feed/hbos-silicon-valley-tech-advisor-695513|title=HBO's 'Silicon Valley' Tech Advisor on Realism, Possible Elon Musk Cameo|last=Sandberg|first=Elise|date=April 12, 2014|work=The Hollywood Reporter|accessdate=June 10, 2014}}{{cite web|url=https://www.wsj.com/articles/SB10001424052702303987004579479244213599118|title=There's a New Geek in Town: HBO's 'Silicon Valley'|last1=Jurgensen|first1=John|last2=Rusli|first2=Evelyn M.|date=April 3, 2014|work=The Wall Street Journal|accessdate=June 10, 2014}} It compares both required time and compression ratio of measured applications, with those of a de facto standard according to the data type.
The formula is the following; where r is the compression ratio, T is the time required to compress, the overlined ones are the same metrics for a standard compressor, and alpha is a scaling constant.
The Weissman score has been used by Daniel Reiter Horn and Mehant Baid of Dropbox to explain real-world work on lossless compression. According to the authors it "favors compression speed over ratio in most cases."{{Cite web |title=Lossless compression with Brotli in Rust for a bit of Pied Piper on the backend |url=https://dropbox.tech/infrastructure/lossless-compression-with-brotli |access-date=2017-06-24 |website=Dropbox Tech Blog |language=en}}
Example
This example shows the score for the data of the Hutter Prize,{{Cite web|last=Hutter|first=Marcus|title=Contestants|date=July 2016|url=http://prize.hutter1.net/#contestants|accessdate=January 25, 2016}} using the paq8f as a standard and 1 as the scaling constant.
class="wikitable sortable"
!Application !Compression ratio !Compression time [min] !Weissman score |
paq8f
|5.467600 |300 |1.000000 |
---|
raq8g
|5.514990 |420 |0.720477 |
paq8hkcc
|5.682593 |300 |1.039321 |
paq8hp1
|5.692566 |300 |1.041145 |
paq8hp2
|5.750279 |300 |1.051701 |
paq8hp3
|5.800033 |300 |1.060801 |
paq8hp4
|5.868829 |300 |1.073826 |
paq8hp5
|5.917719 |300 |1.082325 |
paq8hp6
|5.976643 |300 |1.093102 |
paq8hp12
|6.104276 |540 |0.620247 |
decomp8
|6.261574 |540 |0.63623 |
decomp8
|6.276295 |540 |0.637726 |
Limitations
Although the value is relative to the standards against which it is compared, the unit used to measure the times changes the score (see examples 1 and 2). This is a consequence of the requirement that the argument of the logarithmic function must be dimensionless. The multiplier also can't have a numeric value of 1 or less, because the logarithm of 1 is 0 (examples 3 and 4), and the logarithm of any value less than 1 is negative (examples 5 and 6); that would result in scores of value 0 (even with changes), undefined, or negative (even if better than positive).
class="wikitable"
!rowspan="2"|# | colspan="3"|Standard compressor | colspan="3"|Scored compressor | rowspan="2"|Weissman score | rowspan="2"|Observations |
Compression ratio||Compression time||Log (compression time)||Compression ratio||Compression time||Log (compression time) | ||||
---|---|---|---|---|
1
|2.1||2 min||0.30103||3.4||3 min||0.477121||1×{{sfrac|3.4|2.1}}×{{sfrac|0.30103|0.477121}}=1.021506 |rowspan="2"|Change in unit or scale, changes the result. | ||||
2
|2.1||120 sec||2.079181||3.4||180 sec||2.255273||1×{{sfrac|3.4|2.1}}×{{sfrac|2.079181|2.255273}}=1.492632 | ||||
3
|2.2||1 min||0||3.3||1.5 min||0.176091||1×{{sfrac|3.3|2.2}}×{{sfrac|0|0.176091}}=0 |rowspan="2"|If time is 1, its log is 0; then the score can be 0 or ±∞. | ||||
4
|2.2||0.667 min||−0.176091||3.3||1 min||0||1×{{sfrac|3.3|2.2}}×{{sfrac|−0.176091|0}}=−∞ | ||||
5
|1.6||0.5 h||−0.30103||2.9||1.1 h||0.041393||1×{{sfrac|2.9|1.6}}×{{sfrac|−0.30103|0.041393}}=−13.18138 |rowspan="2"|If time is less than 1, its log is negative; then the score can be negative. | ||||
6
|1.6||1.1 h||0.041393||1.6||0.9 h||−0.045757||1×{{sfrac|1.6|1.6}}×{{sfrac|0.041393|−0.045757}}=−0.904627 |