Generalized Pareto distribution

{{Short description|Family of probability distributions often used to model tails or extreme values}}

{{About|a particular family of continuous distributions referred to as the generalized Pareto distribution|the hierarchy of generalized Pareto distributions|Pareto distribution}}

{{More citations needed|date=March 2012}}

{{Probability distribution

| name =Generalized Pareto distribution

| type =density

| pdf_image = File:Gpdpdf.svg

| pdf_caption = GPD distribution functions for \mu=0 and different values of \sigma and \xi

| cdf_image =File:Gpdcdf.svg

| parameters =

\mu \in (-\infty,\infty) \, location (real)

\sigma \in (0,\infty) \, scale (real)

\xi\in (-\infty,\infty) \, shape (real)

| support =x \geqslant \mu\,\;(\xi \geqslant 0)

\mu \leqslant x \leqslant \mu-\sigma/\xi\,\;(\xi < 0)

| pdf =\frac{1}{\sigma}(1 + \xi z )^{-(1/\xi +1)}

where z=\frac{x-\mu}{\sigma}

| cdf =1-(1+\xi z)^{-1/\xi} \,

| mean =\mu + \frac{\sigma}{1-\xi}\, \; (\xi < 1)

| median =\mu + \frac{\sigma( 2^{\xi} -1)}{\xi}

| entropy =\log(\sigma) + \xi + 1

| mode =\mu

| skewness =\frac{2(1+\xi)\sqrt{1-2\xi}}{(1-3\xi)}\,\;(\xi<1/3)

| kurtosis =\frac{3(1-2\xi)(2\xi^2+\xi+3)}{(1-3\xi)(1-4\xi)}-3\,\;(\xi<1/4)

| mgf =e^{\theta\mu}\,\sum_{j=0}^\infty \left[\frac{(\theta\sigma)^j}{\prod_{k=0}^j(1-k\xi)}\right], \;(k\xi<1)

| cf =e^{it\mu}\,\sum_{j=0}^\infty \left[\frac{(it\sigma)^j}{\prod_{k=0}^j(1-k\xi)}\right], \;(k\xi<1)

| variance =\frac{\sigma^2}{(1-\xi)^2(1-2\xi)}\, \; (\xi < 1/2)

| moments =\xi = \frac{1}{2}\left(1 - \frac{(E[X] - \mu)^2}{V[X]}\right)
\sigma = (E[X] - \mu)(1 - \xi)

| ES =\begin{cases}\mu + \sigma\left[ \frac{(1-p)^{-\xi} }{1-\xi} + \frac{(1-p)^{-\xi} -1 }{\xi} \right]&,\xi \neq 0\\\mu + \sigma[1- \ln(1-p) ]&,\xi =0\end{cases}{{cite journal |last1=Norton |first1=Matthew |last2=Khokhlov |first2=Valentyn |last3=Uryasev |first3=Stan |year=2019 |title=Calculating CVaR and bPOE for common probability distributions with application to portfolio optimization and density estimation |journal=Annals of Operations Research |volume=299 |issue=1–2 |pages=1281–1315 |publisher=Springer |doi=10.1007/s10479-019-03373-1 |arxiv=1811.11301 |s2cid=254231768 |url=http://uryasev.ams.stonybrook.edu/wp-content/uploads/2019/10/Norton2019_CVaR_bPOE.pdf |access-date=2023-02-27 |archive-date=2023-03-31 |archive-url=https://web.archive.org/web/20230331230821/http://uryasev.ams.stonybrook.edu/wp-content/uploads/2019/10/Norton2019_CVaR_bPOE.pdf |url-status=dead }}

| bPOE =\begin{cases}\frac{ \left(1+\frac{\xi(x-\mu)}{\sigma}\right)^{- \frac{1}{\xi} } }{(1-\xi)^{ \frac{1}{\xi} } } &,\xi \neq 0\\\ e^{ 1 - \left( \frac{x-\mu}{\sigma} \right) }&,\xi =0\end{cases}

}}

In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location \mu, scale \sigma, and shape \xi.{{Cite book |title=An Introduction to Statistical Modeling of Extreme Values |last=Coles |first=Stuart |publisher=Springer |page=75 |url=https://books.google.com/books?id=2nugUEaKqFEC |isbn=9781852334598 |date=2001-12-12}}{{Cite journal | last1 = Dargahi-Noubary | first1 = G. R. | title = On tail estimation: An improved method | doi = 10.1007/BF00894450 | journal = Mathematical Geology | volume = 21 | issue = 8 | pages = 829–842 | year = 1989 | bibcode = 1989MatGe..21..829D | s2cid = 122710961 }} Sometimes it is specified by only scale and shape{{Cite journal | last1 = Hosking | first1 = J. R. M. | last2 = Wallis | first2 = J. R. | title = Parameter and Quantile Estimation for the Generalized Pareto Distribution | journal = Technometrics | volume = 29 | issue = 3 | pages = 339–349 | doi = 10.2307/1269343 | year = 1987 | jstor = 1269343 }} and sometimes only by its shape parameter. Some references give the shape parameter as \kappa = - \xi \,.{{Cite book |title=Statistical Extremes and Applications |editor-last=de Oliveira |editor-first=J. Tiago |publisher=Kluwer |last=Davison |first=A. C. |chapter=Modelling Excesses over High Thresholds, with an Application |page=462 |chapter-url=https://books.google.com/books?id=6M03_6rm8-oC&pg=PA462 |isbn=9789027718044 |date=1984-09-30}}

Definition

The standard cumulative distribution function (cdf) of the GPD is defined by{{Cite book |last1=Embrechts |first1=Paul |last2=Klüppelberg |first2=Claudia|author2-link= Claudia Klüppelberg |last3=Mikosch |first3=Thomas |title=Modelling extremal events for insurance and finance |page=162 |url=https://books.google.com/books?id=BXOI2pICfJUC |isbn=9783540609315 |date=1997-01-01|publisher=Springer }}

: F_{\xi}(z) = \begin{cases}

1 - \left(1 + \xi z\right)^{-1/\xi} & \text{for }\xi \neq 0, \\

1 - e^{-z} & \text{for }\xi = 0.

\end{cases}

where the support is z \geq 0 for \xi \geq 0 and 0 \leq z \leq - 1 /\xi for \xi < 0. The corresponding probability density function (pdf) is

: f_{\xi}(z) = \begin{cases}

(1 + \xi z)^{-\frac{\xi +1}{\xi }} & \text{for }\xi \neq 0, \\

e^{-z} & \text{for }\xi = 0.

\end{cases}

Characterization

The related location-scale family of distributions is obtained by replacing the argument z by \frac{x-\mu}{\sigma} and adjusting the support accordingly.

The cumulative distribution function of X \sim GPD(\mu, \sigma, \xi) (\mu\in\mathbb R, \sigma>0, and \xi\in\mathbb R) is

: F_{(\mu,\sigma,\xi)}(x) = \begin{cases}

1 - \left(1+ \frac{\xi(x-\mu)}{\sigma}\right)^{-1/\xi} & \text{for }\xi \neq 0, \\

1 - \exp \left(-\frac{x-\mu}{\sigma}\right) & \text{for }\xi = 0,

\end{cases}

where the support of X is x \geqslant \mu when \xi \geqslant 0 \,, and \mu \leqslant x \leqslant \mu - \sigma /\xi when \xi < 0.

The probability density function (pdf) of X \sim GPD(\mu, \sigma, \xi) is

: f_{(\mu,\sigma,\xi)}(x) = \frac{1}{\sigma}\left(1 + \frac{\xi (x-\mu)}{\sigma}\right)^{\left(-\frac{1}{\xi} - 1\right)},

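again, for x \geqslant \mu when \xi \geqslant 0, and \mu \leqslant x \leqslant \mu - \sigma /\xi when \xi < 0. For \xi = 0 the expression is understood as its limit, \frac{1}{\sigma}\exp\left(-\frac{x-\mu}{\sigma}\right).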
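As a minimal sketch of the cdf and pdf given above, the following Python functions evaluate both with the \xi = 0 case and the bounded support for \xi < 0 handled explicitly. The function names gpd_cdf and gpd_pdf are illustrative choices, not taken from any library, and no attempt is made to stabilise the computation for very small |\xi|.

```python
import math

def gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of GPD(mu, sigma, xi): 0 below the support, 1 above it."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0:  # only possible for xi < 0, i.e. x >= mu - sigma/xi
        return 1.0
    return 1.0 - t ** (-1.0 / xi)

def gpd_pdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """PDF of GPD(mu, sigma, xi): 0 outside the support."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z < 0:
        return 0.0
    if xi == 0:
        return math.exp(-z) / sigma
    t = 1.0 + xi * z
    if t <= 0:  # at or beyond the upper endpoint when xi < 0
        return 0.0
    return t ** (-1.0 / xi - 1.0) / sigma
```

For example, gpd_cdf(1.0, 0.0, 1.0, 1.0) evaluates 1 - (1 + 1)^{-1}, and gpd_cdf(0.5, 0.0, 1.0, -1.0) evaluates the uniform cdf on [0, 1] at 0.5.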

The pdf is a solution of the following differential equation: {{Citation needed|date=December 2019}}

:\left\{\begin{array}{l}

f'(x) (-\mu \xi +\sigma+\xi x)+(\xi+1) f(x)=0, \\

f(0)=\frac{\left(1-\frac{\mu \xi}{\sigma}\right)^{-\frac{1}{\xi }-1}}{\sigma}

\end{array}\right\}
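The differential equation can be checked directly from the pdf f_{(\mu,\sigma,\xi)} for \xi \neq 0; the following is a short verification rather than a sourced derivation. Differentiating the density gives

: f'(x) = -\frac{1+\xi}{\sigma^2}\left(1 + \frac{\xi (x-\mu)}{\sigma}\right)^{-\frac{1}{\xi} - 2},

and since -\mu \xi + \sigma + \xi x = \sigma\left(1 + \frac{\xi (x-\mu)}{\sigma}\right),

: f'(x)\,(-\mu \xi + \sigma + \xi x) = -\frac{1+\xi}{\sigma}\left(1 + \frac{\xi (x-\mu)}{\sigma}\right)^{-\frac{1}{\xi} - 1} = -(\xi+1)\, f(x).

The stated value of f(0) is the density evaluated at x = 0, which is meaningful only when 0 lies in the support.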

Special cases

  • If the shape \xi and location \mu are both zero, the GPD is equivalent to the exponential distribution.
  • With shape \xi = -1 and location \mu = 0, the GPD is equivalent to the continuous uniform distribution U(0, \sigma). Castillo, Enrique, and Ali S. Hadi. "Fitting the generalized Pareto distribution to data." Journal of the American Statistical Association 92.440 (1997): 1609-1620.
  • With shape \xi > 0 and location \mu = \sigma/\xi, the GPD is equivalent to the Pareto distribution with scale x_m=\sigma/\xi and shape \alpha=1/\xi.
  • If X \sim GPD (\mu = 0, \sigma, \xi ), then Y = \log (X) \sim exGPD(\sigma, \xi) [https://www.tandfonline.com/doi/abs/10.1080/03610926.2018.1441418]. (exGPD stands for the exponentiated generalized Pareto distribution.)
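  • With shape \xi > 0 and location \mu = 0, the GPD coincides with the Lomax distribution, which is a special case of the Burr (type XII) distribution.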
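The uniform and Pareto cases in the list above follow from the cdf F_{(\mu,\sigma,\xi)} by substitution; a brief check: for \xi = -1,

: F_{(\mu,\sigma,-1)}(x) = 1 - \left(1 - \frac{x-\mu}{\sigma}\right) = \frac{x-\mu}{\sigma}, \quad \mu \leqslant x \leqslant \mu + \sigma,

which is the cdf of the uniform distribution on [\mu, \mu+\sigma], and hence U(0, \sigma) when \mu = 0. For \xi > 0 and \mu = \sigma/\xi, we have 1 + \frac{\xi(x-\mu)}{\sigma} = \frac{\xi x}{\sigma}, so

: 1 - F_{(\sigma/\xi,\sigma,\xi)}(x) = \left(\frac{\xi x}{\sigma}\right)^{-1/\xi} = \left(\frac{x_m}{x}\right)^{\alpha}, \quad x \geqslant x_m,

with x_m = \sigma/\xi and \alpha = 1/\xi, which is the Pareto survival function.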

Generating generalized Pareto random variables

= Generating GPD random variables =

If U is uniformly distributed on (0, 1], then

: X = \mu + \frac{\sigma (U^{-\xi}-1)}{\xi} \sim GPD(\mu, \sigma, \xi \neq 0)

and

: X = \mu - \sigma \ln(U) \sim GPD(\mu,\sigma,\xi =0).

Both formulas are obtained by inversion of the cdf, using the fact that 1 - U is uniformly distributed whenever U is.

In the MATLAB Statistics Toolbox, the "gprnd" command generates generalized Pareto random numbers.
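The inversion formulas above can be sketched in a few lines of Python using only the standard library. The function name gpd_sample and the rng argument are illustrative assumptions; rng is any object with a random() method returning values in [0, 1), such as the random module itself or a random.Random instance.

```python
import math
import random

def gpd_sample(mu, sigma, xi, rng=random):
    """Draw one GPD(mu, sigma, xi) variate by inverting the cdf."""
    # rng.random() lies in [0, 1), so u lies in (0, 1] and log(u) is finite.
    u = 1.0 - rng.random()
    if xi == 0:
        return mu - sigma * math.log(u)
    return mu + sigma * (u ** (-xi) - 1.0) / xi

# Example: 10,000 draws with a fixed seed for reproducibility.
rng = random.Random(12345)
draws = [gpd_sample(0.0, 1.0, 0.2, rng) for _ in range(10000)]
```

Setting u = 1/2 in either formula returns the median \mu + \sigma(2^{\xi}-1)/\xi (or \mu + \sigma \ln 2 when \xi = 0), which gives a quick sanity check of an implementation.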

= GPD as an Exponential-Gamma Mixture =

A GPD random variable can also be expressed as an exponential random variable with a gamma-distributed rate parameter. If (with \alpha the shape and \beta the rate of the gamma distribution)

:\ X\ \vert\ \Lambda \sim \operatorname\mathsf{Exp}(\Lambda)\

and

:\ \Lambda \sim \operatorname\mathsf{Gamma}(\alpha,\ \beta)\

then

:\ X \sim \operatorname\mathsf{GPD}(\ \xi = 1/\alpha,\ \sigma = \beta/\alpha\ )\

Note, however, that since the parameters of the gamma distribution must be greater than zero, this representation carries the additional restriction that \ \xi\ must be positive.

In addition to this mixture (or compound) expression, the generalized Pareto distribution can also be expressed as a simple ratio. Concretely, for \ Y \sim \operatorname\mathsf{Exponential}(\ 1\ )\ and \ Z \sim \operatorname\mathsf{Gamma}(1/\xi,\ 1)\ , we have \ \mu + \frac{\ \sigma\ Y\ }{\ \xi\ Z\ } \sim \operatorname\mathsf{GPD}(\mu,\ \sigma,\ \xi) ~. This is a consequence of the mixture after setting \ \beta = \alpha\ and taking into account that the rate parameters of the exponential and gamma distribution are simply inverse multiplicative constants.
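Both representations can be simulated directly; the sketch below uses Python's standard random module and assumes the gamma distribution is parameterised by shape \alpha and rate \beta, as the mixture formulas require. Note that random.gammavariate takes a shape and a scale, so the rate is inverted before the call. The function names are illustrative.

```python
import random

def gpd_via_gamma_mixture(sigma, xi, rng=random):
    """GPD(mu=0, sigma, xi), xi > 0, as an exponential with gamma-distributed rate."""
    if xi <= 0:
        raise ValueError("the mixture representation requires xi > 0")
    alpha = 1.0 / xi                 # gamma shape, from xi = 1/alpha
    beta = alpha * sigma             # gamma rate, from sigma = beta/alpha
    lam = rng.gammavariate(alpha, 1.0 / beta)  # gammavariate(shape, scale)
    return rng.expovariate(lam)      # expovariate takes the rate

def gpd_via_ratio(mu, sigma, xi, rng=random):
    """GPD(mu, sigma, xi), xi > 0, as mu + sigma * Y / (xi * Z)."""
    if xi <= 0:
        raise ValueError("the ratio representation requires xi > 0")
    y = rng.expovariate(1.0)             # Y ~ Exponential(1)
    z = rng.gammavariate(1.0 / xi, 1.0)  # Z ~ Gamma(1/xi, 1)
    return mu + sigma * y / (xi * z)
```

A simple check is to compare the empirical fraction of draws below some x with the cdf value 1 - (1 + \xi x/\sigma)^{-1/\xi}; for \sigma = 2, \xi = 0.5 and x = 1 this value is 0.36.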

Exponentiated generalized Pareto distribution

= The exponentiated generalized Pareto distribution (exGPD) =

File:ExGPDpdf.png

If X \sim GPD (\mu = 0, \sigma, \xi ), then Y = \log (X) is distributed according to the [https://www.tandfonline.com/doi/abs/10.1080/03610926.2018.1441418 exponentiated generalized Pareto distribution], denoted by Y \sim exGPD (\sigma, \xi ).

The probability density function (pdf) of Y \sim exGPD (\sigma, \xi )\,\, (\sigma >0) is

: g_{(\sigma, \xi)}(y) = \begin{cases} \frac{e^y}{\sigma}\bigg( 1 + \frac{\xi e^y}{\sigma} \bigg)^{-1/\xi -1} & \text{for } \xi \neq 0, \\

\frac{1}{\sigma}e^{y - e^{y}/\sigma} & \text{for } \xi = 0 ,\end{cases}

where the support is -\infty < y < \infty for \xi \geq 0 , and -\infty < y \leq \log(-\sigma/\xi) for \xi < 0 .

For all \xi, \log \sigma acts as the location parameter. See the right panel for the pdf when the shape \xi is positive.

The exGPD has finite moments of all orders for all \sigma>0 and -\infty< \xi < \infty .

[[File:Var exGPD.png|thumb|The variance of the exGPD(\sigma,\xi) as a function of \xi. Note that the variance only depends on \xi. The red dotted line represents the variance evaluated at \xi=0, that is, \psi'(1) = \pi^2/6.]]

The moment-generating function of Y \sim exGPD(\sigma,\xi) is

: M_Y(s) = E[e^{sY}] = \begin{cases} -\frac{1}{\xi}\bigg(-\frac{\sigma}{\xi}\bigg)^{s} B(s+1, -1/\xi) & \text{for } s \in (-1, \infty), \xi < 0 , \\

\frac{1}{\xi}\bigg(\frac{\sigma}{\xi}\bigg)^{s} B(s+1, 1/\xi - s) & \text{for } s \in (-1, 1/\xi), \xi > 0 , \\

\sigma^{s} \Gamma(1+s) & \text{for } s \in (-1, \infty), \xi = 0, \end{cases}

where B(a,b) and \Gamma (a) denote the beta function and gamma function, respectively.

The expected value of Y \sim exGPD (\sigma, \xi ) depends on both the scale \sigma and the shape \xi, with \xi entering through the digamma function \psi:

: E[Y] = \begin{cases} \log\ \bigg(-\frac{\sigma}{\xi} \bigg)+ \psi(1) - \psi(-1/\xi+1) & \text{for }\xi < 0 , \\

\log\ \bigg(\frac{\sigma}{\xi} \bigg)+ \psi(1) - \psi(1/\xi) & \text{for }\xi > 0 , \\

\log \sigma + \psi(1) & \text{for }\xi = 0. \end{cases}

Note that for any fixed value of \xi \in (-\infty,\infty) , \log\ \sigma acts as the location parameter of the exponentiated generalized Pareto distribution.

The variance of Y \sim exGPD (\sigma, \xi ) depends on the shape parameter \xi only through the polygamma function of order 1 (also called the trigamma function):

: \operatorname{Var}[Y] = \begin{cases} \psi'(1) - \psi'(-1/\xi +1) & \text{for }\xi < 0 , \\

\psi'(1) + \psi'(1/\xi) & \text{for }\xi > 0 , \\

\psi'(1) & \text{for }\xi = 0. \end{cases}

See the right panel for the variance as a function of \xi. Note that \psi'(1) = \pi^2/6 \approx 1.644934 .

Note that the roles of the scale parameter \sigma and the shape parameter \xi under Y \sim exGPD(\sigma, \xi) are separately interpretable: \sigma affects only the location \log \sigma, while the variance depends on \xi alone. This may allow more robust and efficient estimation of \xi than working with X \sim GPD(\sigma, \xi) directly [https://www.tandfonline.com/doi/abs/10.1080/03610926.2018.1441418]. Under X \sim GPD(\mu=0,\sigma, \xi), by contrast, the roles of the two parameters are entangled (at least up to the second central moment); see the formula for the variance Var(X), in which both parameters appear.
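The mean and variance formulas for the exGPD can be evaluated numerically as in the following sketch. To stay within the Python standard library, it uses small hand-written approximations of the digamma and trigamma functions (upward recurrence followed by a truncated asymptotic series); these helpers are an assumption of this example, and in practice library routines such as scipy.special.digamma and scipy.special.polygamma would normally be used instead.

```python
import math

def digamma(x):
    """Approximate psi(x) for x > 0 (recurrence, then asymptotic series)."""
    acc = 0.0
    while x < 6.0:          # psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def trigamma(x):
    """Approximate psi'(x) for x > 0 (recurrence, then asymptotic series)."""
    acc = 0.0
    while x < 6.0:          # psi'(x) = psi'(x + 1) + 1/x^2
        acc += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return acc + inv + 0.5 * inv2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))

def exgpd_mean(sigma, xi):
    """E[Y] for Y ~ exGPD(sigma, xi)."""
    if xi < 0:
        return math.log(-sigma / xi) + digamma(1.0) - digamma(1.0 - 1.0 / xi)
    if xi > 0:
        return math.log(sigma / xi) + digamma(1.0) - digamma(1.0 / xi)
    return math.log(sigma) + digamma(1.0)

def exgpd_var(xi):
    """Var[Y] for Y ~ exGPD(sigma, xi); it does not depend on sigma."""
    if xi < 0:
        return trigamma(1.0) - trigamma(1.0 - 1.0 / xi)
    if xi > 0:
        return trigamma(1.0) + trigamma(1.0 / xi)
    return trigamma(1.0)
```

Two easy checks: for \xi = -1 the GPD is uniform on [0, \sigma], so the variance of \log X should equal 1, and for \xi = 0 the variance should equal \pi^2/6.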

Hill's estimator

Assume that X_{1:n} = (X_1, \cdots, X_n) are n observations (need not be i.i.d.) from an unknown heavy-tailed distribution F such that its tail distribution is regularly varying with the tail-index 1/\xi (hence, the corresponding shape parameter is \xi ). To be specific, the tail distribution is described as

: \bar{F}(x) = 1 - F(x) = L(x) \cdot x^{-1/\xi}, \,\,\,\,\,\text{for some }\xi>0,\,\,\text{where } L \text{ is a slowly varying function.}

Estimating the shape parameter \xi is of particular interest in extreme value theory, especially when \xi is positive (the so-called heavy-tailed case).

Let F_u be the conditional excess distribution function of F over a threshold u, that is, F_u(y) = P(X - u \leq y \mid X > u). The Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974) states that for a large class of underlying distribution functions F, and large u, F_u is well approximated by the generalized Pareto distribution (GPD). This result motivates peaks-over-threshold (POT) methods for estimating \xi, in which the GPD plays the key role.

A well-known estimator based on the POT methodology is Hill's estimator, which is formulated as follows. For 1\leq i \leq n , write X_{(i)} for the i-th largest value of X_1, \cdots, X_n . With this notation, Hill's estimator (see page 190 of Embrechts et al. [https://books.google.com/books?id=o-clBQAAQBAJ&dq=modeeling+extreme+events+for+insurance&pg=PA1]) based on the k upper order statistics is defined as

: \widehat{\xi}_{k}^{\text{Hill}} = \widehat{\xi}_{k}^{\text{Hill}}(X_{1:n}) = \frac{1}{k-1} \sum_{j=1}^{k-1} \log \bigg(\frac{X_{(j)}}{X_{(k)}} \bigg), \,\,\,\,\,\,\,\, \text{for } 2 \leq k \leq n.

In practice, the Hill estimator is used as follows. First, calculate the estimator \widehat{\xi}_{k}^{\text{Hill}} at each integer k \in \{ 2, \cdots, n\}, and plot the ordered pairs \{(k,\widehat{\xi}_{k}^{\text{Hill}})\}_{k=2}^{n}. Then select a range of k over which the Hill estimates \{\widehat{\xi}_{k}^{\text{Hill}}\}_{k=2}^{n} are roughly constant: these stable values are regarded as reasonable estimates of the shape parameter \xi. If X_1, \cdots, X_n are i.i.d., then Hill's estimator is a consistent estimator of the shape parameter \xi [https://www.jstor.org/stable/1427870].
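The computation of the Hill estimates for every k can be sketched as follows, using only the Python standard library. The function name hill_estimates is an illustrative choice, and the demonstration data are synthetic: exact Pareto draws with \xi = 0.5, generated by inversion, which is an assumption made purely for the example.

```python
import math
import random

def hill_estimates(data):
    """Return {k: Hill estimate} for k = 2..n, using the k upper order statistics."""
    xs = sorted(data, reverse=True)      # xs[0] is X_(1), the largest value
    if len(xs) < 2:
        raise ValueError("need at least two observations")
    if xs[-1] <= 0:
        raise ValueError("Hill's estimator needs strictly positive observations")
    logs = [math.log(x) for x in xs]
    estimates = {}
    running = 0.0                        # sum of log X_(j) for j = 1..k-1
    for k in range(2, len(xs) + 1):
        running += logs[k - 2]
        # mean of log X_(j), j < k, minus log X_(k)
        estimates[k] = running / (k - 1) - logs[k - 1]
    return estimates

# Synthetic example: Pareto tail with xi = 0.5 (survival function x^(-2), x >= 1).
rng = random.Random(0)
sample = [(1.0 - rng.random()) ** -0.5 for _ in range(5000)]
estimates = hill_estimates(sample)
for k in (50, 200, 1000):
    print(k, round(estimates[k], 3))
```

Plotting the pairs (k, estimates[k]) gives the Hill plot described above; for exact Pareto data the estimates fluctuate around \xi at every k, whereas for real data only a limited range of k is typically stable.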

Note that the Hill estimator \widehat{\xi}_{k}^{\text{Hill}} makes use of the log-transformation of the observations X_{1:n} = (X_1, \cdots, X_n) . (Pickands' estimator \widehat{\xi}_{k}^{\text{Pickands}} also employs the log-transformation, but in a slightly different way [https://www.jstor.org/stable/2242785].)


References

{{Reflist|refs}}

Further reading

  • {{Cite journal|last=Pickands|first=James|title=Statistical inference using extreme order statistics|journal=Annals of Statistics|volume=3|issue=1|year=1975|pages=119–131|doi=10.1214/aos/1176343003|doi-access=free|url=https://projecteuclid.org/journals/annals-of-statistics/volume-3/issue-1/Statistical-Inference-Using-Extreme-Order-Statistics/10.1214/aos/1176343003.pdf}}
  • {{Cite journal |last1=Balkema |first1=A. |title=Residual life time at great age |journal=Annals of Probability |volume=2 |year=1974 |pages=792–804 |doi=10.1214/aop/1176996548 |first2=Laurens |last2=De Haan |issue=5 |author-link2=Laurens de Haan |doi-access=free }}
  • {{Cite journal |last1=Lee|first1=Seyoon |title = Exponentiated generalized Pareto distribution: Properties and applications towards extreme value theory |journal=Communications in Statistics - Theory and Methods|year=2018|pages=1–25 |doi=10.1080/03610926.2018.1441418 |first2=J.H.K. |last2=Kim|volume=48 |issue=8 |arxiv=1708.01686 |s2cid=88514574 }}
  • {{cite book|title=Continuous Univariate Distributions Volume 1, second edition|author1=N. L. Johnson |author2=S. Kotz |author3=N. Balakrishnan |publisher=Wiley|location=New York|year=1994|isbn=978-0-471-58495-7}} Chapter 20, Section 12: Generalized Pareto Distributions.
  • {{cite book|editor= Duangkamon Chotikapanich|year=2011|title=Modeling Distributions and Lorenz Curves|publisher=Springer|location=New York|author=Barry C. Arnold|chapter=Chapter 7: Pareto and Generalized Pareto Distributions|chapter-url=https://books.google.com/books?id=fUJZZLj1kbwC&pg=PA119|isbn= 9780387727967}}
  • {{cite book|author1=Arnold, B. C. |author2=Laguna, L.|year=1977|title= On generalized Pareto distributions with applications to income data|location= Ames, Iowa| publisher=Iowa State University, Department of Economics}}