Quantile function

{{Short description|Statistical function that defines the quantiles of a probability distribution}}

File:Probit plot.svg: The probit is the quantile function of the normal distribution.

In probability and statistics, the quantile function is a function $Q: [0,1] \to \mathbb{R}$ which maps a probability $x \in [0,1]$ to the value $y$ of a random variable $v$ such that $P(v\leq y) = x$ according to its probability distribution. In other words, the function returns the value of the variable below which the specified cumulative probability is contained. For example, if the distribution is a standard normal distribution then $Q(0.5)$ will return 0, as 0.5 of the probability mass is contained below 0.

The quantile function is also called the percentile function (after the percentile), percent-point function, inverse cumulative distribution function (after the cumulative distribution function or c.d.f.) or inverse distribution function.

Definition

=Strictly increasing distribution function=

With reference to a continuous and strictly increasing cumulative distribution function (c.d.f.) $F_X\colon \mathbb{R} \to [0,1]$ of a random variable {{mvar|X}}, the quantile function $Q\colon [0, 1] \to \mathbb{R}$ maps its input {{mvar|p}} to a threshold value {{mvar|x}} so that the probability of {{mvar|X}} being less than or equal to {{mvar|x}} is {{mvar|p}}. In terms of the distribution function {{math|F}}, the quantile function {{math|Q}} returns the value {{mvar|x}} such that

$F_X(x) := \Pr(X \le x) = p,$

which can be written as the inverse of the c.d.f.

$Q(p) = F_X^{-1}(p).$

File:Quantile distribution function.svg

=General distribution function=

In the general case of distribution functions that are not strictly monotonic and therefore do not permit an inverse c.d.f., the quantile is a (potentially) set valued functional of a distribution function {{mvar|F}}, given by the interval{{cite journal |last1=Ehm |first1=W. |last2=Gneiting |first2=T. |last3=Jordan |first3=A. |last4=Krüger |first4=F. |year=2016 |title=Of quantiles and expectiles: Consistent scoring functions, Choquet representations, and forecast rankings |journal=J. R. Stat. Soc. B |volume=78 |issue=3 |pages=505–562 |doi=10.1111/rssb.12154 |doi-access=free|arxiv=1503.08195 }}

$Q(p) = \big[\sup\{x \colon F(x) < p\}, \sup\{x \colon F(x) \le p \}\big].$

It is standard to choose the lowest value, which can equivalently be written as (using right-continuity of {{mvar|F}})

$Q(p) = \inf \{x \in \mathbb{R} : p \le F(x)\}.$

Here we capture the fact that the quantile function returns the minimum value of {{mvar|x}} from amongst all those values whose c.d.f. value is at least {{mvar|p}}, which is equivalent to the previous probability statement in the special case that the distribution is continuous.

The quantile is the unique function satisfying the Galois inequalities

$Q(p) \le x$ if and only if $p \le F(x).$

If the function {{mvar|F}} is continuous and strictly monotonically increasing, then the inequalities can be replaced by equalities, and we have

$Q = F^{-1}.$

In general, even though the distribution function {{mvar|F}} may fail to possess a left or right inverse, the quantile function {{mvar|Q}} behaves as an "almost sure left inverse" for the distribution function, in the sense that

$Q\bigl(F(X)\bigr) = X \quad \text{almost surely.}$
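The generalized inverse can be illustrated on a discrete distribution, where the c.d.f. is a step function and has no ordinary inverse. The following is a minimal Python sketch, not a reference implementation: the helper name `make_discrete_quantile` and the three-point example distribution are assumptions made for illustration. It evaluates $Q(p) = \inf\{x : p \le F(x)\}$ by binary search over the cumulative probabilities.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction
from itertools import accumulate


def make_discrete_quantile(values, probs):
    """Return (F, Q) for a discrete distribution on the sorted support `values`.

    Hypothetical helper for illustration; `probs` must sum to 1.
    """
    cum = list(accumulate(probs))  # cum[i] = P(X <= values[i])

    def F(x):
        # Count the support points <= x and read off the cumulative probability.
        k = bisect_right(values, x)
        return cum[k - 1] if k > 0 else 0

    def Q(p):
        # Smallest support point x with p <= F(x), i.e. inf{x : p <= F(x)}.
        if not 0 < p <= 1:
            raise ValueError("p must lie in (0, 1]")
        return values[bisect_left(cum, p)]

    return F, Q


# Example: number of heads in two fair coin tosses (exact rational weights).
F, Q = make_discrete_quantile(
    [0, 1, 2], [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
)
```

With this construction, `Q(p) <= x` holds exactly when `p <= F(x)`, which is the Galois inequality stated above. Exact fractions are used here only to avoid floating-point ties at the jump points.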

Simple example

For example, the cumulative distribution function of Exponential(λ) (i.e. intensity {{mvar|λ}} and expected value (mean) {{math|1/λ}}) is

$F(x; \lambda) = \begin{cases}
1 - e^{-\lambda x} & x \ge 0, \\
0 & x < 0.
\end{cases}$

The quantile function for {{math|Exponential(λ)}} is derived by finding the value of {{math|Q}} for which $1 - e^{-\lambda Q} = p$ :

$Q(p; \lambda) = \frac{-\ln(1 - p)}{\lambda},$

for {{math|0 ≤ p < 1}}. The quartiles are therefore:

; first quartile ({{math|1=p = 1/4}}): $-\ln(3/4) / \lambda,$

; median ({{math|1=p = 2/4}}) : $-\ln(1/2) / \lambda,$

; third quartile ({{math|1=p = 3/4}}) : $-\ln(1/4) / \lambda.$
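The closed form above is straightforward to evaluate numerically. The sketch below is illustrative only; the function names `exp_cdf` and `exp_quantile` and the choice λ = 2 are assumptions, not part of the text. It uses `math.log1p` for $\ln(1 - p)$, which is a common way to retain accuracy for small {{mvar|p}}.

```python
import math


def exp_cdf(x, lam):
    """C.d.f. of Exponential(lam): 1 - exp(-lam * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def exp_quantile(p, lam):
    """Quantile function of Exponential(lam): -ln(1 - p) / lam for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log1p(-p) / lam


lam = 2.0  # illustrative rate parameter
# First quartile, median and third quartile, in that order.
quartiles = [exp_quantile(p, lam) for p in (0.25, 0.5, 0.75)]
```

Applying `exp_cdf` to each entry of `quartiles` recovers 0.25, 0.5 and 0.75 up to rounding, which is a quick check that the two functions are inverses on this range.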

Applications

Quantile functions are used in both statistical applications and Monte Carlo methods.

The quantile function is one way of prescribing a probability distribution, and it is an alternative to the probability density function (pdf) or probability mass function, the cumulative distribution function (cdf) and the characteristic function. The quantile function, Q, of a probability distribution is the inverse of its cumulative distribution function F. The derivative of the quantile function, namely the quantile density function, is yet another way of prescribing a probability distribution. It is the reciprocal of the pdf composed with the quantile function.

Consider a statistical application where a user needs to know key percentage points of a given distribution. For example, they require the median and 25% and 75% quartiles as in the example above or 5%, 95%, 2.5%, 97.5% levels for other applications such as assessing the statistical significance of an observation whose distribution is known; see the quantile entry. Before the popularization of computers, it was not uncommon for books to have appendices with statistical tables sampling the quantile function.{{cite web|url=http://course.shufe.edu.cn/jpkc/jrjlx/ref/StaTable.pdf |title=Archived copy |access-date=March 25, 2012 |url-status=dead |archive-url=https://web.archive.org/web/20120324042025/http://course.shufe.edu.cn/jpkc/jrjlx/ref/StaTable.pdf |archive-date=March 24, 2012 }} Statistical applications of quantile functions are discussed extensively by Gilchrist.{{cite book|author=Gilchrist, W. |year=2000|title=Statistical Modelling with Quantile Functions|publisher=Taylor & Francis |isbn=1-58488-174-7}}

Monte Carlo simulations employ quantile functions to produce non-uniform random or pseudorandom numbers for use in diverse types of simulation calculations. A sample from a given distribution may be obtained in principle by applying its quantile function to a sample from a uniform distribution on [0, 1]. The demands of simulation methods, for example in modern computational finance, are focusing increasing attention on methods based on quantile functions, as they work well with multivariate techniques based on either copula or quasi-Monte-Carlo methods{{cite book|author=Jaeckel, P. |year=2002|title=Monte Carlo methods in finance}}, and with Monte Carlo methods in finance.
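Applying a quantile function to uniform samples is often called inverse transform sampling. A minimal sketch for the exponential distribution follows, assuming Python's standard `random` module as the uniform source; the function name `sample_exponential` and the fixed seed are illustrative choices.

```python
import math
import random


def sample_exponential(lam, n, seed=0):
    """Draw n samples from Exponential(lam) by applying Q(u) = -ln(1 - u) / lam
    to uniform samples u in [0, 1)."""
    rng = random.Random(seed)  # seeded for reproducibility of the sketch
    return [-math.log1p(-rng.random()) / lam for _ in range(n)]


samples = sample_exponential(lam=2.0, n=100_000)
sample_mean = sum(samples) / len(samples)  # should be close to 1 / lam = 0.5
```

Because `rng.random()` never returns 1, the argument of the logarithm stays strictly positive. The same pattern applies to any distribution whose quantile function can be evaluated.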

Calculation

The evaluation of quantile functions often involves numerical methods. The exponential distribution above is one of the few distributions for which a closed-form expression can be found (others include the uniform, the Weibull, the Tukey lambda (which includes the logistic) and the log-logistic). When the cdf itself has a closed-form expression, one can always use a numerical root-finding algorithm such as the bisection method to invert the cdf. Other methods rely on an approximation of the inverse via interpolation techniques.

{{cite journal | last1 = Hörmann | first1 = Wolfgang | last2 = Leydold | first2 = Josef | title = Continuous random variate generation by fast numerical inversion| journal = ACM Transactions on Modeling and Computer Simulation | year= 2003 | volume = 13 | issue = 4 | doi = 10.1145/945511.945517|url=https://research.wu.ac.at/en/publications/continuous-random-variate-generation-by-fast-numerical-inversion-6|via=WU Vienna|access-date=17 June 2024|pages=347–362}}{{cite journal|last1=Derflinger|first1=Gerhard|last2=Hörmann|first2=Wolfgang|last3=Leydold|first3=Josef|title=Random variate generation by numerical inversion when only the density is known|doi=10.1145/1842722.1842723|year=2010|volume=20|issue=4|journal=ACM Transactions on Modeling and Computer Simulation|pages=1–25|url=https://epub.wu.ac.at/1112/1/document.pdf |article-number=18}} Further algorithms to evaluate quantile functions are given in the Numerical Recipes series of books. Algorithms for common distributions are built into many statistical software packages. General methods to numerically compute the quantile functions for general classes of distributions can be found in the following libraries:

C library UNU.RAN {{cite web | url=https://statmath.wu.ac.at/unuran/index.html | title=UNU.RAN - Universal Non-Uniform RANdom number generators }}
R library Runuran {{cite web | url=https://cran.r-project.org/package=Runuran | title=Runuran: R Interface to the 'UNU.RAN' Random Variate Generators | date=17 January 2023 }}
Python subpackage sampling in scipy.stats{{cite web | url=https://docs.scipy.org/doc/scipy/reference/stats.sampling.html | title=Random Number Generators (Scipy.stats.sampling) — SciPy v1.13.0 Manual }}

{{cite book | last1 = Baumgarten | first1 = Christoph | last2 = Patel | first2 = Tirth | chapter = Automatic random variate generation in Python | title = Proceedings of the 21st Python in Science Conference | date = 2022 | pages = 46–51 | doi = 10.25080/majora-212e5952-007| doi-access = free }}
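The root-finding approach mentioned above can be sketched with plain bisection. This is an illustrative sketch rather than the method used by any of the listed libraries: the function name `quantile_by_bisection`, the bracket-doubling step, and the tolerance are assumptions. The standard normal cdf, written with `math.erf`, serves as the example.

```python
import math


def quantile_by_bisection(cdf, p, tol=1e-12, max_iter=200):
    """Approximate inf{x : p <= cdf(x)} for 0 < p < 1 by bisection.

    Assumes `cdf` is a non-decreasing function tending to 0 and 1 in the tails.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    lo, hi = -1.0, 1.0
    # Expand the bracket until cdf(lo) < p <= cdf(hi).
    while cdf(lo) >= p:
        lo *= 2.0
    while cdf(hi) < p:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid  # the quantile lies to the right of mid
        else:
            hi = mid  # mid already satisfies p <= cdf(mid)
        if hi - lo < tol:
            break
    return hi


def std_normal_cdf(x):
    """Standard normal cdf expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


z = quantile_by_bisection(std_normal_cdf, 0.975)  # roughly 1.96
```

Bisection needs only cdf evaluations and converges linearly; faster root finders can be substituted when derivatives (that is, the density) are available.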

Quantile functions may also be characterized as solutions of non-linear ordinary and partial differential equations. The ordinary differential equations for the cases of the normal, Student, beta and gamma distributions have been given and solved.{{cite journal|author1=Steinbrecher, G.|author2=Shaw, W.T. |year=2008|title=Quantile mechanics|journal=European Journal of Applied Mathematics|volume=19|issue=2|pages=87–112|doi=10.1017/S0956792508007341|doi-broken-date=17 March 2025 |s2cid=6899308 }}

=Normal distribution=

The normal distribution is perhaps the most important case. Because the normal distribution is a location-scale family, its quantile function for arbitrary parameters can be derived from a simple transformation of the quantile function of the standard normal distribution, known as the probit function. Unfortunately, this function has no closed-form representation using basic algebraic functions; as a result, approximate representations are usually used. Thorough composite rational and polynomial approximations have been given by Wichura{{cite journal|author=Wichura, M.J. |year=1988|title=Algorithm AS241: The Percentage Points of the Normal Distribution|journal=Applied Statistics|volume=37|pages=477–484|doi=10.2307/2347330|jstor=2347330|issue=3|publisher=Blackwell Publishing}} and Acklam.[http://home.online.no/~pjacklam/notes/invnorm/ An algorithm for computing the inverse normal cumulative distribution function] {{webarchive |url=https://web.archive.org/web/20070505093933/http://home.online.no/~pjacklam/notes/invnorm/ |date=May 5, 2007 }} Non-composite rational approximations have been developed by Shaw.[https://arxiv.org/abs/0901.0638 Computational Finance: Differential Equations for Monte Carlo Recycling]
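The location-scale property means that the quantile for mean μ and standard deviation σ is $\mu + \sigma\, \Phi^{-1}(p)$, where $\Phi^{-1}$ denotes the probit. A short sketch, assuming Python's standard `statistics.NormalDist` as the source of the standard normal quantile (the wrapper name `normal_quantile` is illustrative):

```python
from statistics import NormalDist


def normal_quantile(p, mu=0.0, sigma=1.0):
    """Quantile of N(mu, sigma^2) obtained from the standard normal quantile."""
    probit = NormalDist().inv_cdf(p)  # standard normal quantile, 0 < p < 1
    return mu + sigma * probit


x = normal_quantile(0.975, mu=10.0, sigma=2.0)  # about 10 + 2 * 1.96
```

The result agrees with `NormalDist(10.0, 2.0).inv_cdf(0.975)` up to rounding, so only the standard case needs a dedicated approximation.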

==Ordinary differential equation for the normal quantile==

The normal quantile, {{math|w(p)}}, satisfies the non-linear ordinary differential equation

$\frac{d^2 w}{d p^2} = w \left(\frac{d w}{d p}\right)^2$

with the centre (initial) conditions

$w\left(1/2\right) = 0,\,$

$w'\left(1/2\right) = \sqrt{2\pi}.\,$

This equation may be solved by several methods, including the classical power series approach. From this, solutions of arbitrarily high accuracy may be developed (see Steinbrecher and Shaw, 2008).
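As a numerical illustration of the differential equation, the sketch below integrates it from the centre conditions with the classical fourth-order Runge-Kutta scheme. This is a substitute chosen for brevity, not the power series method of the cited paper; the function name `normal_quantile_ode` and the step count are assumptions.

```python
import math


def normal_quantile_ode(p, steps=20_000):
    """Integrate w'' = w * (w')**2 from p0 = 1/2 to p with classical RK4.

    Starts from w(1/2) = 0 and w'(1/2) = sqrt(2*pi). Intended for p not too
    close to 0 or 1, where the solution grows steeply.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    h = (p - 0.5) / steps  # negative when p < 1/2
    w, v = 0.0, math.sqrt(2.0 * math.pi)  # v stands for w'

    def rhs(w, v):
        # First-order system: w' = v, v' = w * v**2.
        return v, w * v * v

    for _ in range(steps):
        k1w, k1v = rhs(w, v)
        k2w, k2v = rhs(w + 0.5 * h * k1w, v + 0.5 * h * k1v)
        k3w, k3v = rhs(w + 0.5 * h * k2w, v + 0.5 * h * k2v)
        k4w, k4v = rhs(w + h * k3w, v + h * k3v)
        w += h * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return w


w_975 = normal_quantile_ode(0.975)  # roughly 1.96
```

The value can be compared with an independent implementation such as `statistics.NormalDist().inv_cdf(0.975)` as a sanity check.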

=Student's ''t''-distribution=

This has historically been one of the more intractable cases, as the presence of a parameter, ν, the degrees of freedom, makes the use of rational and other approximations awkward. Simple formulas exist when {{math|1=ν = 1, 2, 4}}, and the problem may be reduced to the solution of a polynomial when ν is even. In other cases the quantile functions may be developed as power series.{{cite journal|author=Shaw, W.T. |year=2006|title=Sampling Student's T distribution – Use of the inverse cumulative distribution function.|journal=Journal of Computational Finance|volume=9|issue=4|pages=37–73|doi=10.21314/JCF.2006.150 }} The simple cases are as follows:

;ν = 1 (Cauchy distribution) : $Q(p) = \tan (\pi(p-1/2)) \!$

;ν = 2 : $Q(p) = 2(p-1/2)\sqrt{\frac{2}{\alpha}}\!$

;ν = 4 : $Q(p) = \operatorname{sign}(p-1/2)\,2\,\sqrt{q-1}\!$

where

$q = \frac{\cos \left( \frac{1}{3} \arccos \left( \sqrt{\alpha} \, \right) \right)}{\sqrt{\alpha}}\!$

and

$\alpha = 4p(1-p).\!$

In the above the "sign" function is +1 for positive arguments, −1 for negative arguments and zero at zero. It should not be confused with the trigonometric sine function.
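The three closed forms can be transcribed directly into code. The following sketch is illustrative: the function name `student_t_quantile` is an assumption, and the two clamping steps are added only to guard against floating-point rounding near {{math|1=p = 1/2}}.

```python
import math


def student_t_quantile(p, nu):
    """Closed-form Student t quantile for nu in {1, 2, 4} and 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    if nu == 1:
        # Cauchy distribution.
        return math.tan(math.pi * (p - 0.5))
    alpha = 4.0 * p * (1.0 - p)
    if nu == 2:
        return 2.0 * (p - 0.5) * math.sqrt(2.0 / alpha)
    if nu == 4:
        s = min(math.sqrt(alpha), 1.0)  # clamp so acos stays in its domain
        q = math.cos(math.acos(s) / 3.0) / s
        sign = (p > 0.5) - (p < 0.5)  # +1, -1 or 0
        return sign * 2.0 * math.sqrt(max(q - 1.0, 0.0))
    raise ValueError("closed form implemented only for nu = 1, 2, 4")


t4 = student_t_quantile(0.975, 4)  # about 2.776
```

For {{math|1=ν = 2}} and {{math|1=p = 3/4}} the formula reduces to $\sqrt{2/3}$, which gives a simple hand check of the transcription.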

Quantile mixtures

Analogously to the mixtures of densities, distributions can be defined as quantile mixtures

$Q(p) = \sum_{i=1}^m a_i Q_i(p),$

where $Q_i(p)$ , $i = 1,\ldots,m$ are quantile functions and {{nowrap| $a_i$ ,}} $i=1,\ldots,m$ are the model parameters. The parameters $a_i$ must be selected so that $Q(p)$ is a quantile function.
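One sufficient (but not necessary) way to guarantee that $Q(p)$ is a quantile function is to take all $a_i \ge 0$, since a non-negative combination of non-decreasing functions is non-decreasing. The sketch below uses that restriction; the helper name `quantile_mixture`, the weights, and the choice of a standard normal and a unit exponential component are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def quantile_mixture(weights, quantile_fns):
    """Return Q(p) = sum_i a_i * Q_i(p) for non-negative weights a_i."""
    if any(a < 0 for a in weights):
        raise ValueError("this sketch only handles non-negative weights")

    def Q(p):
        return sum(a * q(p) for a, q in zip(weights, quantile_fns))

    return Q


normal_q = NormalDist().inv_cdf  # standard normal quantile


def exp_q(p):
    """Quantile of the unit-rate exponential distribution."""
    return -math.log1p(-p)


Q_mix = quantile_mixture([0.7, 0.3], [normal_q, exp_q])
median = Q_mix(0.5)  # 0.7 * 0 + 0.3 * ln 2
```

Unlike a mixture of densities, the weights here need not sum to one; they scale and combine the component quantiles directly.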

Two four-parametric quantile mixtures, the normal-polynomial quantile mixture and the Cauchy-polynomial quantile mixture, are presented by Karvanen.{{cite journal|last = Karvanen | first = J. |year=2006|title=Estimation of quantile mixtures via L-moments and trimmed L-moments.|journal=Computational Statistics & Data Analysis|volume=51|issue=2|pages=947–956|doi=10.1016/j.csda.2005.09.014}}

Non-linear differential equations for quantile functions

The non-linear ordinary differential equation given for the normal distribution is a special case of that available for any quantile function whose second derivative exists. In general, a quantile function {{math|Q(p)}} satisfies

$\frac{d^2 Q}{d p^2} = H(Q) \left(\frac{d Q}{d p}\right)^2$

augmented by suitable boundary conditions, where

$H(x) = -\frac{f'(x)}{f(x)} = -\frac{d}{d x} \ln f(x)$

and {{math|f(x)}} is the probability density function. The forms of this equation, and its classical analysis by series and asymptotic solutions, for the cases of the normal, Student, gamma and beta distributions have been elucidated by Steinbrecher and Shaw (2008). Such solutions provide accurate benchmarks and, in the case of the Student distribution, suitable series for live Monte Carlo use.
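A brief sketch of where the equation comes from, assuming that {{math|f}} is differentiable and strictly positive at {{math|Q(p)}}: differentiating the identity $F(Q(p)) = p$ once with respect to {{mvar|p}} gives

$f(Q)\,\frac{dQ}{dp} = 1, \qquad \text{so} \qquad \frac{dQ}{dp} = \frac{1}{f(Q)},$

and differentiating again gives

$\frac{d^2 Q}{dp^2} = -\frac{f'(Q)}{f(Q)^2}\,\frac{dQ}{dp} = -\frac{f'(Q)}{f(Q)} \left(\frac{dQ}{dp}\right)^2 = H(Q) \left(\frac{dQ}{dp}\right)^2.$

For the standard normal density, $\ln f(x) = -x^2/2 + \text{const}$, so $H(x) = x$ and the equation reduces to $w'' = w\,(w')^2$, the form given for the normal quantile.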