truncated distribution
{{Short description|Conditional distribution in statistics}}
{{Probability distribution
|name =Truncated Distribution
|type =density
|pdf_image = tnormPDF.png
|pdf_caption = Probability density function for the truncated normal distribution for different sets of parameters. In all cases, a = −10 and b = 10. For the black: μ = −8, σ = 2; blue: μ = 0, σ = 2; red: μ = 9, σ = 10; orange: μ = 0, σ = 10.
|support =
|pdf =
|cdf =
|mean =
|median =
|mode =
|variance =
|skewness =
|kurtosis =
|entropy =
|mgf =
|char =
}}
In statistics, a truncated distribution is a conditional distribution that results from restricting the domain of some other probability distribution. Truncated distributions arise in practical statistics in cases where the ability to record, or even to know about, occurrences is limited to values which lie above or below a given threshold or within a specified range. For example, if the dates of birth of children in a school are examined, these would typically be subject to truncation relative to those of all children in the area given that the school accepts only children in a given age range on a specific date. There would be no information about how many children in the locality had dates of birth before or after the school's cutoff dates if only a direct approach to the school were used to obtain information.
Where sampling retains the knowledge that items fall outside the required range, without recording their actual values, this is known as censoring, as opposed to the truncation described here.<ref>Dodge, Y. (2003) The Oxford Dictionary of Statistical Terms. OUP. {{ISBN|0-19-920613-9}}</ref>
Definition
The following discussion<ref>{{cite book |last1=Kendall |first1=Maurice G. |last2=Stuart |first2=Alan |title=The Advanced Theory of Statistics Volume 2: Inference and Relationship |location=London |publisher=Charles Griffin and Company Ltd |edition=2nd |year=1967 }}, Section (32.17)</ref> is in terms of a random variable having a continuous distribution, although the same ideas apply to discrete distributions. Similarly, the discussion assumes that truncation is to a semi-open interval y ∈ (a,b], but other possibilities can be handled straightforwardly.
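Suppose we have a random variable, <math>X</math>, that is distributed according to some probability density function, <math>f(x)</math>, with cumulative distribution function <math>F(x)</math>, both of which have infinite support. Suppose we wish to know the probability density of the random variable after restricting the support to lie between two constants, so that the support is <math>(a,b]</math>. That is to say, suppose we wish to know how <math>X</math> is distributed given <math>a < X \leq b</math>.
: <math>f(x|a < X \leq b) = \frac{g(x)}{F(b)-F(a)}</math>
where <math>g(x) = f(x)</math> for all <math>a < x \leq b</math> and <math>g(x) = 0</math> everywhere else. The denominator is constant with respect to <math>x</math>.
Notice that in fact <math>f(x|a < X \leq b)</math> is a density:
: <math>\int_{a}^{b} f(x|a < X \leq b) \, dx = \frac{1}{F(b)-F(a)} \int_{a}^{b} g(x) \, dx = 1</math>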
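The definition above can be checked numerically. The following is a minimal sketch, assuming a standard normal parent distribution and the illustrative bounds a = −1 and b = 2; the helper names (<code>truncated_pdf</code>, <code>integrate</code>) are hypothetical and use only the Python standard library.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density f(x) of the untruncated normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function F(x) of the untruncated normal."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def truncated_pdf(x, a, b, pdf, cdf):
    """Density of X given a < X <= b, i.e. g(x) / (F(b) - F(a))."""
    if not (a < x <= b):
        return 0.0  # g(x) = 0 outside (a, b]
    return pdf(x) / (cdf(b) - cdf(a))

def integrate(func, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

a, b = -1.0, 2.0  # illustrative truncation bounds (an assumption)
total = integrate(lambda x: truncated_pdf(x, a, b, normal_pdf, normal_cdf), a, b)
print(total)  # should be close to 1, since the truncated density is normalised
```

Dividing by F(b) − F(a) is what restores the total probability to 1 after the mass outside (a, b] has been discarded.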
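Truncated distributions need not have parts removed from both the top and the bottom. A truncated distribution where just the bottom of the distribution has been removed is as follows:
: <math>f(x|X>y) = \frac{g(x)}{1-F(y)}</math>
where <math>g(x) = f(x)</math> for all <math>y < x</math> and <math>g(x) = 0</math> everywhere else, and <math>F(x)</math> is the cumulative distribution function of the untruncated variable.
A truncated distribution where the top of the distribution has been removed is as follows:
: <math>f(x|X \leq y) = \frac{g(x)}{F(y)}</math>
where <math>g(x) = f(x)</math> for all <math>x \leq y</math> and <math>g(x) = 0</math> everywhere else, and <math>F(x)</math> is the cumulative distribution function of the untruncated variable.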
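As a worked illustration of one-sided truncation (an example chosen here, not taken from the cited sources), assume the untruncated variable is exponential with rate <math>\lambda > 0</math>, so that its density is <math>\lambda e^{-\lambda x}</math> and its cumulative distribution function is <math>1 - e^{-\lambda x}</math> for <math>x \geq 0</math>. Removing the bottom of the distribution at <math>y \geq 0</math> gives
: <math>f(x|X>y) = \frac{\lambda e^{-\lambda x}}{e^{-\lambda y}} = \lambda e^{-\lambda (x-y)}, \quad x > y,</math>
which is the original density shifted by <math>y</math> (the memoryless property). Removing the top of the distribution at <math>y > 0</math> gives
: <math>f(x|X \leq y) = \frac{\lambda e^{-\lambda x}}{1 - e^{-\lambda y}}, \quad 0 \leq x \leq y.</math>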
Expectation of truncated random variable
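Suppose we wish to find the expected value of a random variable distributed according to the density <math>f(x)</math>, with cumulative distribution function <math>F(x)</math>, given that the random variable, <math>X</math>, is greater than some known value <math>y</math>. The expectation of the truncated random variable is thus:
: <math>E(X|X>y) = \frac{\int_y^\infty x g(x) \, dx}{1 - F(y)}</math>
where again <math>g(x) = f(x)</math> for all <math>x > y</math> and <math>g(x) = 0</math> everywhere else.
Letting <math>a</math> and <math>b</math> be the lower and upper limits respectively of the support of the original density function <math>f</math> (which we assume is continuous), properties of <math>E(u(X)|X>y)</math>, where <math>u</math> is some continuous function with a continuous derivative, include:
# <math>\lim_{y \to a} E(u(X)|X>y) = E(u(X))</math>
# <math>\lim_{y \to b} E(u(X)|X>y) = u(b)</math>
# <math>\frac{\partial}{\partial y}[E(u(X)|X>y)] = \frac{f(y)}{1-F(y)}[E(u(X)|X>y) - u(y)]</math>
#: and <math>\frac{\partial}{\partial y}[E(u(X)|X<y)] = \frac{f(y)}{F(y)}[-E(u(X)|X<y) + u(y)]</math>
# <math>\lim_{y \to a}\frac{\partial}{\partial y}[E(u(X)|X>y)] = f(a)[E(u(X)) - u(a)]</math>
# <math>\lim_{y \to b}\frac{\partial}{\partial y}[E(u(X)|X>y)] = \frac{1}{2}u'(b)</math>
Provided that the limits exist, that is: <math>\lim_{y \to c} u'(y) = u'(c)</math>, <math>\lim_{y \to c} u(y) = u(c)</math> and <math>\lim_{y \to c} f(y) = f(c)</math>, where <math>c</math> represents either <math>a</math> or <math>b</math>.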
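The truncated expectation and its derivative with respect to the truncation point can be checked numerically. The sketch below assumes an exponential parent distribution with rate 0.5 and takes u(x) = x; for that choice the memoryless property gives E(X | X > y) = y + 1/λ, so the derivative should be 1. The function name <code>truncated_mean_above</code> and the finite integration cutoff are illustrative assumptions.

```python
import math

def truncated_mean_above(y, pdf, cdf, upper, n=100_000):
    """Approximate E(X | X > y) = (integral from y to infinity of x f(x) dx) / (1 - F(y)).

    The infinite upper limit is replaced by the finite cutoff `upper`
    (an assumption: the tail beyond it must be negligible), and the
    integral is evaluated with the midpoint rule.
    """
    h = (upper - y) / n
    integral = 0.0
    for i in range(n):
        x = y + (i + 0.5) * h
        integral += x * pdf(x)
    return h * integral / (1.0 - cdf(y))

lam = 0.5  # rate of the assumed exponential parent distribution
pdf = lambda x: lam * math.exp(-lam * x)
cdf = lambda x: 1.0 - math.exp(-lam * x)

y = 3.0
upper = 83.0  # tail mass beyond this point is of order exp(-41.5)
mean_above = truncated_mean_above(y, pdf, cdf, upper)
closed_form = y + 1.0 / lam  # memoryless property of the exponential

# Derivative with respect to y: central finite difference versus
# f(y) / (1 - F(y)) * (E(X | X > y) - y), taking u(x) = x.
eps = 1e-3
numeric_slope = (truncated_mean_above(y + eps, pdf, cdf, upper)
                 - truncated_mean_above(y - eps, pdf, cdf, upper)) / (2.0 * eps)
formula_slope = pdf(y) / (1.0 - cdf(y)) * (mean_above - y)
print(mean_above, closed_form, numeric_slope, formula_slope)
```

The factor f(y)/(1 − F(y)) is the hazard rate at y, which is constant and equal to λ for the exponential distribution.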
Examples
The truncated normal distribution is an important example.<ref>Johnson, N.L., Kotz, S., Balakrishnan, N. (1994) Continuous Univariate Distributions, Volume 1, Wiley. {{ISBN|0-471-58495-9}} (Section 10.1)</ref> The literature has also considered the left-truncated normal distribution,<ref>{{Cite journal |last1=del Castillo |first1=Joan |date=March 1994 |title=The singly truncated normal distribution: A non-steep exponential family |url=https://www.ism.ac.jp/editsec/aism/pdf/046_1_0057.pdf|journal= Annals of the Institute of Statistical Mathematics |volume=46 |issue=1 |pages=57–66 |doi=10.1007/BF00773592 }}</ref> the left-truncated Weibull distribution<ref>{{Cite journal |last1=Wingo |first1=Dallas R. |date=December 1989 |title=The left-truncated Weibull distribution: theory and computation |url=https://link.springer.com/article/10.1007/BF02924307|journal= Statistical Papers |volume=30 |pages=39–48 |doi=10.1007/BF02924307}}</ref><ref>{{Cite journal |last1=Kizilersu |first1=Ayse |last2=Kreer |first2=Markus |last3=Thomas |first3=Anthony W. |date= June 2016 |title=Goodness-of-fit Testing for Left-truncated Two-parameter Weibull Distributions with Known Truncation point|url=https://www.ajs.or.at/index.php/ajs/article/view/doi%3A10.17713ajs.v45i3.106|journal= Austrian Journal of Statistics |volume=45 |issue=3 |pages=15–42 |doi=10.17713/ajs.v45i3.106|hdl=2440/113666 |hdl-access=free }}</ref> and the left-truncated log-logistic distribution.<ref>{{Cite journal |last1=Kreer |first1=Markus|last2=Kizilersu |first2=Ayse |last3=Guscott|first3=Jake |last4=Schmitz |first4= Lukas Christopher|last5=Thomas |first5=Anthony W. |date= September 2024 |title=Maximum likelihood estimation for left-truncated log-logistic distributions with a given truncation point | url=https://link.springer.com/article/10.1007/s00362-024-01603-8 | journal= Statistical Papers |volume=65 |pages=5409–5445 |doi=10.1007/s00362-024-01603-8 |doi-access=free|arxiv=2210.15155 }}</ref>
The Tobit model employs truncated distributions.
Other examples include the binomial distribution truncated at x = 0 and the Poisson distribution truncated at x = 0 (the zero-truncated binomial and Poisson distributions).
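For a discrete case, truncation at zero works the same way as in the continuous case: the probability of the excluded value is removed and the remaining probabilities are rescaled. The following is a minimal sketch for the Poisson distribution, assuming an illustrative rate of 2; the function names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Probability mass function of the untruncated Poisson distribution."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zero_truncated_poisson_pmf(k, lam):
    """P(X = k | X > 0): the Poisson pmf rescaled by 1 - P(X = 0)."""
    if k < 1:
        return 0.0  # zero is excluded from the support
    return poisson_pmf(k, lam) / (1.0 - poisson_pmf(0, lam))

lam = 2.0  # illustrative rate (an assumption)
ks = range(1, 60)  # terms beyond k = 59 are negligible for this rate
total = sum(zero_truncated_poisson_pmf(k, lam) for k in ks)
mean = sum(k * zero_truncated_poisson_pmf(k, lam) for k in ks)
print(total, mean, lam / (1.0 - math.exp(-lam)))
```

The mean of the truncated variable is λ / (1 − e<sup>−λ</sup>), which exceeds λ because the zero outcomes have been removed.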
Random truncation
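Suppose we have the following set up: a truncation value, <math>t</math>, is selected at random from a density, <math>g(t)</math>, but this value is not observed. Then a value, <math>x</math>, is selected at random from the truncated distribution, <math>f(x|t)</math>. Suppose we observe <math>x</math> and wish to update our belief about the density of <math>t</math> given the observation.
First, by definition:
: <math>f(x)=\int_{x}^{\infty} f(x|t)g(t) \, dt</math>, and
: <math>F(a)=\int_{-\infty}^a \left[\int_{x}^{\infty} f(x|t)g(t) \, dt \right] dx .</math>
Notice that <math>t</math> must be greater than <math>x</math>; hence, when we integrate over <math>t</math>, we set a lower bound of <math>x</math>. The functions <math>f(x)</math> and <math>F(x)</math> are the unconditional density and unconditional cumulative distribution function, respectively.
By Bayes' rule,
: <math>g(t|x)= \frac{f(x|t)g(t)}{f(x)} ,</math>
which expands to
: <math>g(t|x) = \frac{f(x|t)g(t)}{\int_{x}^{\infty} f(x|t)g(t) \, dt} .</math>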
= Two uniform distributions (example) =
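Suppose we know that t is uniformly distributed on [0,T] and x|t is uniformly distributed on [0,t]. Let g(t) and f(x|t) be the densities that describe t and x respectively, so that g(t) = 1/T and f(x|t) = 1/t. Suppose we observe a value of x and wish to know the distribution of t given that value of x.
: <math>g(t|x) =\frac{f(x|t)g(t)}{f(x)} = \frac{1}{t(\ln(T) - \ln(x))} \quad \text{for all } x < t \leq T .</math>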
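The closed form for this example can be compared against a direct evaluation of Bayes' rule. The sketch below assumes the illustrative values T = 10, x = 2 and t = 5 and evaluates the marginal density of x by a midpoint rule; the function names are hypothetical.

```python
import math

def posterior_closed_form(t, x, T):
    """g(t | x) = 1 / (t (ln T - ln x)) for x < t <= T, and 0 otherwise."""
    if not (0.0 < x < t <= T):
        return 0.0
    return 1.0 / (t * (math.log(T) - math.log(x)))

def posterior_bayes(t, x, T, n=20_000):
    """g(t | x) = f(x|t) g(t) / f(x), with f(x) integrated numerically.

    Here g(s) = 1/T and f(x|s) = 1/s for s >= x, so the marginal is the
    integral of 1/(s T) over s from x to T (midpoint rule).
    """
    if not (0.0 < x < t <= T):
        return 0.0
    h = (T - x) / n
    marginal = h * sum(1.0 / ((x + (i + 0.5) * h) * T) for i in range(n))
    return (1.0 / t) * (1.0 / T) / marginal

T, x, t = 10.0, 2.0, 5.0  # illustrative values (assumptions)
closed = posterior_closed_form(t, x, T)
numeric = posterior_bayes(t, x, T)
print(closed, numeric)
```

The upper limit of the integral is T rather than infinity because g(t) vanishes for t > T.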