Average-case complexity

{{redirect|AvgP|other uses|avgp (disambiguation)}}

In computational complexity theory, the average-case complexity of an algorithm is the amount of some computational resource (typically time) used by the algorithm, averaged over all possible inputs. It is frequently contrasted with worst-case complexity, which considers the maximal complexity of the algorithm over all possible inputs.

There are three primary motivations for studying average-case complexity.{{Cite journal |last1=Goldreich |first1=Oded |last2=Vadhan |first2=Salil |date=December 2007 |title=Special Issue On Worst-case Versus Average-case Complexity Editors' Foreword |journal=Computational Complexity |language=en |volume=16 |issue=4 |pages=325–330 |doi=10.1007/s00037-007-0232-y |issn=1016-3328|doi-access=free }} First, although some problems may be intractable in the worst case, the inputs which elicit this behavior may rarely occur in practice, so the average-case complexity may be a more accurate measure of an algorithm's performance. Second, average-case complexity analysis provides tools and techniques to generate hard instances of problems, which can be utilized in areas such as cryptography and derandomization. Third, average-case complexity makes it possible to single out the algorithm that is most efficient in practice among algorithms of equivalent best-case complexity (for instance, Quicksort).

Average-case analysis requires a notion of an "average" input to an algorithm, which leads to the problem of devising a probability distribution over inputs. Alternatively, a randomized algorithm can be used. The analysis of such algorithms leads to the related notion of an expected complexity.{{rp|28}}

History and background

The average-case performance of algorithms has been studied since modern notions of computational efficiency were developed in the 1950s. Much of this initial work focused on problems for which worst-case polynomial time algorithms were already known.{{Cite journal |last1=Bogdanov |first1=Andrej |last2=Trevisan |first2=Luca |date=2006 |title=Average-Case Complexity |url=http://www.nowpublishers.com/article/Details/TCS-004 |journal=Foundations and Trends in Theoretical Computer Science |language=en |volume=2 |issue=1 |pages=1–106 |doi=10.1561/0400000004 |issn=1551-305X}} In 1973, Donald Knuth{{cite book | last = Knuth | first = Donald | title = The Art of Computer Programming | volume = 3 | publisher = Addison-Wesley | date = 1973}} published Volume 3 of The Art of Computer Programming, which extensively surveys average-case performance of algorithms for problems solvable in worst-case polynomial time, such as sorting and median-finding.

An efficient algorithm for NP-complete problems is generally characterized as one which runs in polynomial time for all inputs; this is equivalent to requiring efficient worst-case complexity. However, an algorithm which is inefficient on a "small" number of inputs may still be efficient for "most" inputs that occur in practice. Thus, it is desirable to study the properties of these algorithms where the average-case complexity may differ from the worst-case complexity and find methods to relate the two.

The fundamental notions of average-case complexity were developed by Leonid Levin in 1986 when he published a one-page paper{{Cite journal |last=Levin |first=Leonid A. |date=February 1986 |title=Average Case Complete Problems |url=http://epubs.siam.org/doi/10.1137/0215020 |journal=SIAM Journal on Computing |language=en |volume=15 |issue=1 |pages=285–286 |doi=10.1137/0215020 |issn=0097-5397}} defining average-case complexity and completeness while giving an example of a complete problem for {{math|distNP}}, the average-case analogue of {{math|NP}}.

Definitions

=Efficient average-case complexity=

The first task is to precisely define what is meant by an algorithm which is efficient "on average". An initial attempt might define an efficient average-case algorithm as one which runs in expected polynomial time over all possible inputs. Such a definition has various shortcomings; in particular, it is not robust to changes in the computational model. For example, suppose algorithm {{mvar|A}} runs in time {{math|tA(x)}} on input {{mvar|x}} and algorithm {{mvar|B}} runs in time {{math|tA(x)<sup>2</sup>}} on input {{mvar|x}}; that is, {{mvar|B}} is quadratically slower than {{mvar|A}}. Intuitively, any definition of average-case efficiency should capture the idea that {{mvar|A}} is efficient on average if and only if {{mvar|B}} is efficient on average. Suppose, however, that the inputs are drawn randomly from the uniform distribution of strings with length {{mvar|n}}, and that {{mvar|A}} runs in time {{math|n<sup>2</sup>}} on all inputs except the string {{math|1<sup>n</sup>}}, for which {{mvar|A}} takes time {{math|2<sup>n</sup>}}. Then it can be easily checked that the expected running time of {{mvar|A}} is polynomial but the expected running time of {{mvar|B}} is exponential.
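Concretely, under the uniform distribution on strings of length {{mvar|n}}, the expected running times in this example are

:<math>\operatorname{E}[t_A] = \left(1 - 2^{-n}\right)n^2 + 2^{-n} \cdot 2^n \leq n^2 + 1, \qquad \operatorname{E}[t_B] = \left(1 - 2^{-n}\right)n^4 + 2^{-n} \cdot 2^{2n} \geq 2^n,</math>

so {{mvar|A}} runs in expected polynomial time while {{mvar|B}} requires expected exponential time, even though {{mvar|B}} is only quadratically slower than {{mvar|A}} on every individual input.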

To create a more robust definition of average-case efficiency, it makes sense to allow an algorithm {{mvar|A}} to run longer than polynomial time on some inputs, provided that the fraction of inputs on which {{mvar|A}} requires larger and larger running time becomes smaller and smaller. This intuition is captured in the following formula for average polynomial running time, which balances a polynomial trade-off between running time and fraction of inputs:

:<math>\Pr_{x \in_R D_n} \left[t_A(x) \geq t \right] \leq \frac{p(n)}{t^\epsilon}</math>

for every {{math|n, t > 0}}, where {{math|tA(x)}} denotes the running time of algorithm {{mvar|A}} on input {{mvar|x}}, {{mvar|p}} is a fixed polynomial, and {{mvar|ε}} is a positive constant.{{cite book |last1=Wang |first1=Jie |url=https://www.cs.uml.edu/~wang/acc-forum/avgcomp.pdf |title=Complexity Theory: Retrospective II |date=1997 |publisher=Springer Science & Business Media |editor-last1=Hemaspaandra |editor-first1=Lane A. |volume=2 |pages=295–328 |chapter=Average-case computational complexity theory |editor-last2=Selman |editor-first2=Alan L.}} Alternatively, this can be written as

:<math>E_{x \in_R D_n} \left[ \frac{t_{A}(x)^{\epsilon}}{n} \right] \leq C</math>

for some constants {{mvar|C}} and {{mvar|ε}}, where {{math|n {{=}} {{abs|x}}}}.{{cite book |last1=Arora |first1=Sanjeev |title=Computational Complexity: A Modern Approach |last2=Barak |first2=Boaz |date=2009 |publisher=Cambridge University Press |location=Cambridge ; New York |chapter=18. Average case complexity: Levin’s theory}} In other words, an algorithm {{mvar|A}} has good average-case complexity if, after running for {{math|tA(n)}} steps, {{mvar|A}} can solve all but a {{math|{{sfrac|n<sup>c</sup>|(tA(n))<sup>ε</sup>}}}} fraction of inputs of length {{mvar|n}}, for some {{math|ε, c > 0}}.
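For example, the algorithm {{mvar|A}} above (time {{math|n<sup>2</sup>}} on every input except {{math|1<sup>n</sup>}}, where it takes {{math|2<sup>n</sup>}}) satisfies this definition with {{math|ε {{=}} 1}} and {{math|p(n) {{=}} n<sup>2</sup>}}. The following minimal Python sketch, a brute-force check over all strings of a given length written purely for illustration, verifies the tail bound for small {{mvar|n}}:

<syntaxhighlight lang="python">
from fractions import Fraction

def running_time(x: str) -> int:
    """Toy running time of A: n^2 on every input except 1^n, where it is 2^n."""
    n = len(x)
    return 2**n if x == "1" * n else n**2

def tail_probability(n: int, t: int) -> Fraction:
    """Pr[t_A(x) >= t] under the uniform distribution on {0,1}^n."""
    hits = sum(1 for i in range(2**n)
               if running_time(format(i, f"0{n}b")) >= t)
    return Fraction(hits, 2**n)

def satisfies_average_poly_bound(n: int, eps: int = 1) -> bool:
    """Check Pr[t_A(x) >= t] <= p(n) / t^eps with p(n) = n^2 for every relevant t."""
    p = n**2
    t_max = max(2**n, n**2)
    return all(tail_probability(n, t) <= Fraction(p, t**eps)
               for t in range(1, t_max + 1))

print([satisfies_average_poly_bound(n) for n in range(1, 9)])  # expected: all True
</syntaxhighlight>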

=Distributional problem=

The next step is to define the "average" input to a particular problem. This is achieved by associating the inputs of each problem with a particular probability distribution. That is, an "average-case" problem consists of a language {{mvar|L}} and an associated probability distribution {{mvar|D}} which forms the pair {{math|(L, D)}}. The two most common classes of distributions which are allowed are:

  1. Polynomial-time computable distributions ({{math|P}}-computable): these are distributions for which it is possible to compute the cumulative probability of any given input {{mvar|x}}. More formally, given a probability distribution {{mvar|μ}} and a string {{math|x ∈ {{mset|0, 1}}<sup>n</sup>}}, it is possible to compute the value <math>\mu(x) = \sum\limits_{y \in \{0, 1\}^n : y \leq x} \Pr[y]</math> in polynomial time. This implies that {{math|Pr[x]}} is also computable in polynomial time.
  2. Polynomial-time samplable distributions ({{math|P}}-samplable): these are distributions from which it is possible to draw random samples in polynomial time.

These two formulations, while similar, are not equivalent. If a distribution is {{math|P}}-computable then it is also {{math|P}}-samplable, but the converse is not true if {{math|P}} ≠ {{math|P<sup>#P</sup>}}.
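As an illustration, the following minimal Python sketch (the function names are illustrative and not part of any standard library) computes the cumulative probability {{math|μ(x)}} of the uniform distribution over length-{{mvar|n}} strings and then samples from a distribution given only its cumulative distribution, by inverting it with binary search; this is the idea behind the forward implication above.

<syntaxhighlight lang="python">
import random
from fractions import Fraction

def uniform_cdf(x: str) -> Fraction:
    """Cumulative probability mu(x) = Pr[y <= x] of the uniform distribution
    on {0,1}^n, computable in time polynomial in n."""
    n = len(x)
    return Fraction(int(x, 2) + 1, 2**n)

def sample(cdf, n: int) -> str:
    """Draw a length-n string from the distribution described by `cdf` by
    inverting the cumulative distribution with binary search.
    (Illustration only: random.random() has finite precision, whereas the
    formal argument works with polynomially many unbiased random bits.)"""
    u = random.random()
    lo, hi = 0, 2**n - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if cdf(format(mid, f"0{n}b")) > u:
            hi = mid          # the sampled string is at most mid
        else:
            lo = mid + 1      # the sampled string is larger than mid
    return format(lo, f"0{n}b")

print([sample(uniform_cdf, 4) for _ in range(5)])  # five uniform 4-bit strings
</syntaxhighlight>

Only about {{mvar|n}} evaluations of the cumulative distribution are needed per sample, so the sampler runs in polynomial time whenever the distribution is {{math|P}}-computable.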

=AvgP and distNP=

A distributional problem {{math|(L, D)}} is in the complexity class {{math|AvgP}} if there is an efficient average-case algorithm for {{mvar|L}}, as defined above. The class {{math|AvgP}} is occasionally called {{math|distP}} in the literature.

A distributional problem {{math|(L, D)}} is in the complexity class {{math|distNP}} if {{mvar|L}} is in {{math|NP}} and {{mvar|D}} is {{math|P}}-computable. When {{mvar|L}} is in {{math|NP}} and {{mvar|D}} is {{math|P}}-samplable, {{math|(L, D)}} belongs to {{math|sampNP}}.

Together, {{math|AvgP}} and {{math|distNP}} define the average-case analogues of {{math|P}} and {{math|NP}}, respectively.

Reductions between distributional problems

Let {{math|(L, D)}} and {{math|(L′, D′)}} be two distributional problems. {{math|(L, D)}} average-case reduces to {{math|(L′, D′)}} (written {{math|(L, D) ≤<sub>AvgP</sub> (L′, D′)}}) if there is a function {{mvar|f}} that, for every {{mvar|n}}, can be computed on input {{mvar|x}} in time polynomial in {{mvar|n}} and satisfies:

  1. (Correctness) {{math|x ∈ L}} if and only if {{math|f(x) ∈ L′}}
  2. (Domination) There are polynomials {{mvar|p}} and {{mvar|m}} such that, for every {{mvar|n}} and {{mvar|y}}, <math>\sum\limits_{x: f(x) = y} D_n(x) \leq p(n)D'_{m(n)}(y)</math>

The domination condition enforces the notion that if problem {{math|(L, D)}} is hard on average, then {{math|(L′, D′)}} is also hard on average. Intuitively, a reduction should provide a way to solve an instance {{mvar|x}} of problem {{mvar|L}} by computing {{math|f(x)}} and feeding the output to the algorithm which solves {{math|L′}}. Without the domination condition, this may not be possible: the algorithm {{math|A′}} which solves {{math|L′}} in polynomial time on average may take super-polynomial time on a small fraction of inputs, but {{mvar|f}} may map a large fraction of instances of {{mvar|L}} into that set, so that the composed algorithm no longer runs in polynomial time on average. The domination condition ensures that such hard instances are hit, via {{mvar|f}}, only polynomially more often than they occur under {{math|D′}}.
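To see why the domination condition suffices, here is a sketch of the standard argument. Suppose {{math|A′}} solves {{math|L′}} in average polynomial time with respect to {{math|D′}}, so that the fraction of inputs {{mvar|y}} of length {{mvar|m}} with {{math|tA′(y) ≥ t}} is at most {{math|{{sfrac|q(m)|t<sup>ε</sup>}}}} for some polynomial {{mvar|q}}. Then, for inputs drawn from {{mvar|D}},

:<math>\Pr_{x \in_R D_n}\left[t_{A'}(f(x)) \geq t\right] = \sum_{y \,:\, t_{A'}(y) \geq t} \; \sum_{x \,:\, f(x) = y} D_n(x) \leq p(n) \sum_{y \,:\, t_{A'}(y) \geq t} D'_{m(n)}(y) \leq \frac{p(n)\,q(m(n))}{t^\epsilon},</math>

which is again a bound of the required polynomial-over-{{math|t<sup>ε</sup>}} form, so the composed algorithm (compute {{math|f(x)}}, then run {{math|A′}} on it) is efficient on average, up to the polynomial cost of computing {{mvar|f}}.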

=DistNP-complete problems=

The average-case analogue to {{math|NP}}-completeness is {{math|distNP}}-completeness. A distributional problem {{math|(L′, D′)}} is {{math|distNP}}-complete if {{math|(L′, D′)}} is in {{math|distNP}} and for every {{math|(L, D)}} in {{math|distNP}}, {{math|(L, D)}} is average-case reducible to {{math|(L′, D′)}}.

An example of a {{math|distNP}}-complete problem is the Bounded Halting Problem, {{math|(BH, D)}} (for any {{math|P}}-computable {{mvar|D}}), defined as follows:

:<math>BH = \{(M, x, 1^t) : M \text{ is a non-deterministic Turing machine that accepts } x \text{ in at most } t \text{ steps}\}</math>

In his original paper, Levin gave an example of a distributional tiling problem that is average-case {{math|NP}}-complete. A survey of known {{math|distNP}}-complete problems is available online.

One area of active research involves finding new {{math|distNP}}-complete problems. However, finding such problems can be complicated due to a result of Gurevich which shows that any distributional problem with a flat distribution cannot be {{math|distNP}}-complete unless {{math|EXP {{=}} NEXP}}.{{Cite book |last=Gurevich |first=Yuri |chapter=Complete and incomplete randomized NP problems |date=October 1987 |title=28th Annual Symposium on Foundations of Computer Science (SFCS 1987) |chapter-url=https://ieeexplore.ieee.org/document/4568261 |pages=111–117 |doi=10.1109/SFCS.1987.14|isbn=0-8186-0807-2 }} (A flat distribution {{mvar|μ}} is one for which there exists an {{math|ε > 0}} such that for any {{mvar|x}}, <math>\mu(x) \leq 2^{-|x|^\epsilon}</math>.) A result by Livne shows that all natural {{math|NP}}-complete problems have {{math|distNP}}-complete versions.{{Cite journal |last=Livne |first=Noam |date=December 2010 |title=All Natural NP-Complete Problems Have Average-Case Complete Versions |url=http://link.springer.com/10.1007/s00037-010-0298-9 |journal=Computational Complexity |language=en |volume=19 |issue=4 |pages=477–499 |doi=10.1007/s00037-010-0298-9 |issn=1016-3328}} However, the goal of finding a natural distributional problem that is {{math|distNP}}-complete has not yet been achieved.{{Citation |last=Goldreich |first=Oded |title=Notes on Levin's Theory of Average-Case Complexity |date=2011 |work=Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation |series=Lecture Notes in Computer Science |volume=6650 |pages=233–247 |editor-last=Goldreich |editor-first=Oded |url=http://link.springer.com/10.1007/978-3-642-22670-0_21 |access-date=2025-05-21 |place=Berlin, Heidelberg |publisher=Springer Berlin Heidelberg |doi=10.1007/978-3-642-22670-0_21 |isbn=978-3-642-22669-4}}

Applications

=Sorting algorithms=

As mentioned above, much early work relating to average-case complexity focused on problems for which polynomial-time algorithms already existed, such as sorting. For example, many sorting algorithms which utilize randomness, such as Quicksort, have a worst-case running time of {{math|O(n<sup>2</sup>)}}, but an average-case running time of {{math|O(n log(n))}}, where {{mvar|n}} is the length of the input to be sorted.{{cite book | last1 = Cormen | first1 = Thomas H. | last2 = Leiserson | first2 = Charles E. | last3 = Rivest | first3 = Ronald L. | last4 = Stein | first4 = Clifford | title = Introduction to Algorithms | edition = 3rd | date = 2009 | orig-year = 1990 | publisher = MIT Press and McGraw-Hill | isbn=978-0-262-03384-8 |url=https://www.worldcat.org/title/311310321 |oclc=311310321}}
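A minimal sketch of randomized Quicksort in Python (an illustration, not taken from the cited text) makes this concrete: because the pivot is chosen at random, the quadratic behaviour depends only on unlucky coin flips rather than on the input, and the expected number of comparisons is {{math|O(n log n)}} for every input. This is an instance of the expected complexity of a randomized algorithm mentioned in the introduction.

<syntaxhighlight lang="python">
import random

def quicksort(a):
    """Randomized Quicksort: expected O(n log n) comparisons on every input;
    O(n^2) behaviour arises only for unlucky sequences of pivot choices."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
</syntaxhighlight>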

=Cryptography=

For most problems, average-case complexity analysis is undertaken to find efficient algorithms for a problem that is considered difficult in the worst case. In cryptographic applications, however, the opposite is true: the worst-case complexity is irrelevant; we instead want a guarantee that every algorithm which "breaks" the cryptographic scheme is inefficient on average.{{Cite book |last1=Katz |first1=Jonathan |title=Introduction to modern cryptography |last2=Lindell |first2=Yehuda |date=2021 |publisher=CRC Press |isbn=978-1-351-13303-6 |edition=3 |series=Chapman & Hall/CRC cryptography and network security series |location=Boca Raton, FL}}{{pn|date=May 2025}}

Thus, all secure cryptographic schemes rely on the existence of one-way functions. Although the existence of one-way functions is still an open problem, many candidate one-way functions are based on hard problems such as integer factorization or computing the discrete log. Note that it is not desirable for the candidate function to be {{math|NP}}-complete since this would only guarantee that there is likely no efficient algorithm for solving the problem in the worst case; what we actually want is a guarantee that no efficient algorithm can solve the problem over random inputs (i.e. the average case). In fact, both the integer factorization and discrete log problems are in {{math|NP ∩ coNP}}, and are therefore not believed to be {{math|NP}}-complete. The fact that all of cryptography is predicated on the existence of average-case intractable problems in {{math|NP}} is one of the primary motivations for studying average-case complexity.
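As an illustration of the easy direction of such a candidate, modular exponentiation is fast to compute, while inverting it is the discrete logarithm problem. The sketch below uses toy parameters chosen only for illustration (real schemes use large, carefully generated groups), and the names are not taken from any cryptographic library.

<syntaxhighlight lang="python">
import random

# Toy parameters for illustration only; practical schemes choose a large prime p
# and a generator g of a large subgroup so that the discrete logarithm is hard.
p = 2**127 - 1      # a Mersenne prime, not a recommended choice in practice
g = 3

def forward(x: int) -> int:
    """Easy direction: g^x mod p via square-and-multiply, polynomial time."""
    return pow(g, x, p)

x = random.randrange(1, p - 1)   # a random exponent, i.e. an "average-case" instance
y = forward(x)
# Recovering x from (g, p, y) is the discrete logarithm problem, conjectured
# to have no efficient algorithm even on average over the choice of x.
print(y)
</syntaxhighlight>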

Other results

Yao's principle, from a 1977 paper by Andrew Yao, shows that for broad classes of computational problems, average-case complexity for a hard input distribution and a deterministic algorithm adapted to that distribution is the same thing as expected complexity for a fast randomized algorithm and its worst-case input.{{citation | last = Yao | first = Andrew | author-link = Andrew Yao | contribution = Probabilistic computations: Toward a unified measure of complexity | doi = 10.1109/SFCS.1977.24 | pages = 222–227 | title = Proceedings of the 18th IEEE Symposium on Foundations of Computer Science (FOCS) | year = 1977}}
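For a finite set of deterministic algorithms and a finite set of inputs, with {{math|c(a, x)}} denoting the cost of algorithm {{mvar|a}} on input {{mvar|x}}, the principle can be written as the minimax identity

:<math>\max_{D} \min_{a} \operatorname{E}_{x \sim D}\left[c(a, x)\right] = \min_{R} \max_{x} \operatorname{E}_{a \sim R}\left[c(a, x)\right],</math>

where {{mvar|D}} ranges over distributions on inputs and {{mvar|R}} over distributions on deterministic algorithms, i.e. over randomized algorithms.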

In 1990, Impagliazzo and Levin showed that if there is an efficient average-case algorithm for a {{math|distNP}}-complete problem under the uniform distribution, then there is an average-case algorithm for every problem in {{math|NP}} under any polynomial-time samplable distribution.R. Impagliazzo and L. Levin, "No Better Ways to Generate Hard NP Instances than Picking Uniformly at Random," in Proceedings of the 31st IEEE Symposium on Foundations of Computer Science, pp. 812–821, 1990. Applying this theory to natural distributional problems remains an outstanding open question.

In 1992, Ben-David et al. showed that if all languages in {{math|distNP}} have good-on-average decision algorithms, they also have good-on-average search algorithms. Further, they show that this conclusion holds under a weaker assumption: if every language in {{math|NP}} is easy on average for decision algorithms with respect to the uniform distribution, then it is also easy on average for search algorithms with respect to the uniform distribution.{{Cite book |last1=Ben-David |first1=S. |last2=Chor |first2=B. |last3=Goldreich |first3=O. |chapter=On the theory of average case complexity |date=1989 |title=Proceedings of the twenty-first annual ACM symposium on Theory of computing - STOC '89 |chapter-url=http://portal.acm.org/citation.cfm?doid=73007.73027 |language=en |publisher=ACM Press |pages=204–216 |doi=10.1145/73007.73027 |isbn=978-0-89791-307-2}} Thus, cryptographic one-way functions can exist only if there are {{math|distNP}} problems over the uniform distribution that are hard on average for decision algorithms.

In 1993, Feigenbaum and Fortnow showed that it is not possible to prove, under non-adaptive random reductions, that the existence of a good-on-average algorithm for a {{math|distNP}}-complete problem under the uniform distribution implies the existence of worst-case efficient algorithms for all problems in {{math|NP}}.{{Cite journal |last1=Feigenbaum |first1=Joan |last2=Fortnow |first2=Lance |date=October 1993 |title=Random-Self-Reducibility of Complete Sets |url=http://epubs.siam.org/doi/10.1137/0222061 |journal=SIAM Journal on Computing |language=en |volume=22 |issue=5 |pages=994–1005 |doi=10.1137/0222061 |issn=0097-5397}} In 2003, Bogdanov and Trevisan generalized this result to arbitrary non-adaptive reductions.{{Cite journal |last1=Bogdanov |first1=Andrej |last2=Trevisan |first2=Luca |date=January 2006 |title=On Worst-Case to Average-Case Reductions for NP Problems |url=https://epubs.siam.org/doi/10.1137/S0097539705446974 |journal=SIAM Journal on Computing |language=en |volume=36 |issue=4 |pages=1119–1159 |doi=10.1137/S0097539705446974 |issn=0097-5397}} These results show that it is unlikely that any association can be made between average-case complexity and worst-case complexity via reductions.

See also

References

{{Reflist|30em}}

Further reading

Pedagogical presentations:

  • {{Cite book |last=Impagliazzo |first=R. |chapter=A personal view of average-case complexity |date=1995 |title=Proceedings of Structure in Complexity Theory. Tenth Annual IEEE Conference |chapter-url=https://ieeexplore.ieee.org/document/514853 |publisher=IEEE Comput. Soc. Press |pages=134–147 |doi=10.1109/SCT.1995.514853 |isbn=978-0-8186-7052-7}}
  • {{cite book |last1=Wang |first1=Jie |url=https://www.cs.uml.edu/~wang/acc-forum/avgcomp.pdf |title=Complexity Theory: Retrospective II |date=1997 |publisher=Springer Science & Business Media |editor-last1=Hemaspaandra |editor-first1=Lane A. |volume=2 |pages=295–328 |chapter=Average-case computational complexity theory |editor-last2=Selman |editor-first2=Alan L.}}
  • {{Citation |last=Goldreich |first=Oded |title=Average Case Complexity, Revisited |date=2011 |work=Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation |series=Lecture Notes in Computer Science |volume=6650 |pages=422–450 |editor-last=Goldreich |editor-first=Oded |url=https://www.wisdom.weizmann.ac.il/~oded/COL/aver.pdf |place=Berlin, Heidelberg |publisher=Springer Berlin Heidelberg |doi=10.1007/978-3-642-22670-0_29 |isbn=978-3-642-22669-4}}
  • {{cite book |last1=Arora |first1=Sanjeev |title=Computational Complexity: A Modern Approach |last2=Barak |first2=Boaz |date=2009 |publisher=Cambridge University Press |location=Cambridge ; New York |chapter=18. Average case complexity: Levin’s theory}}

The literature of average case complexity includes the following work:

  • {{citation | last = Levin | first = Leonid | author-link = Leonid Levin | doi = 10.1137/0215020 | issue = 1 | journal = SIAM Journal on Computing | pages = 285–286 | title = Average case complete problems | volume = 15 | year = 1986}}.
  • {{citation | last = Franco | first = John | doi = 10.1016/0020-0190(86)90051-7 | issue = 2 | journal = Information Processing Letters | pages = 103–106 | title = On the probabilistic performance of algorithms for the satisfiability problem | volume = 23 | year = 1986}}.
  • {{citation | last1 = Flajolet | first1 = Philippe | author1-link = Philippe Flajolet | last2 = Vitter | first2 = J. S. | author2-link = Jeffrey Vitter | date = August 1987 | publisher = Institut National de Recherche en Informatique et en Automatique, B.P. 105-78153 Le Chesnay Cedex France | series = Tech. Report | title = Average-case analysis of algorithms and data structures}}.
  • {{citation | last1 = Gurevich | first1 = Yuri | author1-link = Yuri Gurevich | last2 = Shelah | first2 = Saharon | author2-link = Saharon Shelah | doi = 10.1137/0216034 | issue = 3 | journal = SIAM Journal on Computing | pages = 486–502 | title = Expected computation time for Hamiltonian path problem | volume = 16 | year = 1987 | citeseerx = 10.1.1.359.8982}}.
  • {{citation | last1 = Ben-David | first1 = Shai | last2 = Chor | first2 = Benny | author2-link = Benny Chor | last3 = Goldreich | first3 = Oded | author3-link = Oded Goldreich | last4 = Luby | first4 = Michael | author4-link = Michael Luby | contribution = On the theory of average case complexity | pages = 204–216 | publisher = Association for Computing Machinery | title = Proc. 21st Annual Symposium on Theory of Computing | year = 1989}}.
  • {{citation | last = Gurevich | first = Yuri | author-link = Yuri Gurevich | doi = 10.1016/0022-0000(91)90007-R | issue = 3 | journal = Journal of Computer and System Sciences | pages = 346–398 | title = Average case completeness | volume = 42 | year = 1991 | hdl = 2027.42/29307 | hdl-access = free}}. See also [http://research.microsoft.com/~gurevich/Opera/76.pdf 1989 draft].
  • {{citation | last1 = Selman | first1 = B. | last2 = Mitchell | first2 = D. | last3 = Levesque | first3 = H. | contribution = Hard and easy distributions of SAT problems | pages = 459–465 | title = Proc. 10th National Conference on Artificial Intelligence | year = 1992}}.
  • {{citation | last1 = Schuler | first1 = Rainer | last2 = Yamakami | first2 = Tomoyuki | contribution = Structural average case complexity | pages = 128–139 | publisher = Springer-Verlag | series = Lecture Notes in Computer Science | title = Proc. Foundations of Software Technology and Theoretical Computer Science | volume = 652 | year = 1992}}.
  • {{citation | last1 = Reischuk | first1 = Rüdiger | last2 = Schindelhauer | first2 = Christian | contribution = Precise average case complexity | pages = 650–661 | title = Proc. 10th Annual Symposium on Theoretical Aspects of Computer Science | year = 1993}}.
  • {{citation | last1 = Venkatesan | first1 = R. | last2 = Rajagopalan | first2 = S. | contribution = Average case intractability of matrix and Diophantine problems | pages = 632–642 | publisher = Association for Computing Machinery | title = Proc. 24th Annual Symposium on Theory of Computing | year = 1992}}.
  • {{citation | last1 = Cox | first1 = Jim | last2 = Ericson | first2 = Lars | last3 = Mishra | first3 = Bud | series = Technical Report TR1995-711 | publisher = New York University Computer Science Department | title = The average case complexity of multilevel syllogistic | url = http://www.cs.nyu.edu/csweb/Research/TechReports/TR1995-711/TR1995-711.pdf | year = 1995}}.

Category:Randomized algorithms

Category:Analysis of algorithms