non-uniform random variate generation

{{Short description|Generating pseudo-random numbers that follow a probability distribution}}

Non-uniform random variate generation or pseudo-random number sampling is the numerical practice of generating pseudo-random numbers (PRN) that follow a given probability distribution.

Methods are typically based on the availability of a uniformly distributed PRN generator. Computational algorithms are then used to manipulate a single random variate, X, or often several such variates, into a new random variate Y such that these values have the required distribution.

The first methods were developed for Monte-Carlo simulations in the Manhattan project,{{Citation needed|date=June 2011}} published by John von Neumann in the early 1950s.{{cite book |chapter=Various Techniques Used in Connection with Random Digits |first=John |last=Von Neumann |authorlink=John von Neumann |year=1951 |editor1=Householder, A. S. |editor2=Forsythe, G. E. |editor3=Germond, H. H. |title=Monte Carlo Methods |series=National Bureau of Standards Applied Mathematics Series |volume=12 |pages=36–38 |publisher=US Government Printing Office |chapter-url=https://mcnp.lanl.gov/pdf_files/nbs_vonneumann.pdf |quote=Any one who considers arithmetical methods of producing random digits is of course, in a state of sin.}} Also online is a [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf low-quality scan of the original publication].

Finite discrete distributions

For a discrete probability distribution with a finite number n of indices at which the probability mass function f takes non-zero values, the basic sampling algorithm is straightforward. The interval [0, 1) is divided in n intervals [0, f(1)), [f(1), f(1) + f(2)), ... The width of interval i equals the probability f(i).

One draws a uniformly distributed pseudo-random number X, and searches for the index i of the corresponding interval. The so determined i will have the distribution f(i).

Formalizing this idea becomes easier by using the cumulative distribution function

: $F(i)=\sum_{j=1}^i f(j).$

It is convenient to set F(0) = 0. The n intervals are then simply [F(0), F(1)), [F(1), F(2)), ..., [F(n − 1), F(n)). The main computational task is then to determine i for which F(i − 1) ≤ X < F(i).

This can be done by different algorithms:

Linear search, computational time linear in n.
Binary search, computational time goes with log n.
Indexed search,Ripley (1987) {{Page needed|date=June 2011}} also called the cutpoint method.Fishman (1996) {{Page needed|date=June 2011}}
Alias method, computational time is constant, using some pre-computed tables.
There are other methods that cost constant time.Fishman (1996) {{Page needed|date=June 2011}}

Continuous distributions

Generic methods for generating independent samples:

Rejection sampling for arbitrary density functions
Inverse transform sampling for distributions whose CDF is known
Ratio of uniforms, combining a change of variables and rejection sampling
Slice sampling
Ziggurat algorithm, for monotonically decreasing density functions as well as symmetric unimodal distributions
Convolution random number generator, not a sampling method in itself: it describes the use of arithmetics on top of one or more existing sampling methods to generate more involved distributions.

Generic methods for generating correlated samples (often necessary for unusually-shaped or high-dimensional distributions):

Markov chain Monte Carlo, the general principle
Metropolis–Hastings algorithm
Gibbs sampling
Slice sampling
Reversible-jump Markov chain Monte Carlo, when the number of dimensions is not fixed (e.g. when estimating a mixture model and simultaneously estimating the number of mixture components)
Particle filters, when the observed data is connected in a Markov chain and should be processed sequentially

For generating a normal distribution:

For generating a Poisson distribution:

See Poisson distribution#Generating Poisson-distributed random variables

Software libraries

class="wikitable sortable sticky-header-multi" style="width:100%; text-align:left;" \|+ Random distributions provided by software libraries
Library ! Beta ! Binomial ! Cauchy ! Chi-squared ! Dirichlet ! Exponential ! F ! Gamma ! Geometric ! Gumbel ! Hypergeometric ! Laplace ! Logistic ! Log-normal ! Logarithmic ! Multinomial ! Multivariate hypergeometric ! Multivariate normal ! Negative binomial ! Noncentral chi-squared ! Noncentral F ! Normal ! Pareto ! Poisson ! Power ! Rayleigh ! Students's t ! Triangular ! von Mises ! Wald ! Zeta
NumPy \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}}
GNU Scientific Library{{cite web \| title=Random Number Distributions - GSL 2.7 documentation \| website=The GNU Operating System and the Free Software Movement \| url=https://www.gnu.org/software/gsl/doc/html/randist.html \| access-date=2022-08-18}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{No}} \| {{Yes}} \| {{Yes}} \| {{No}} \| {{No}} \| {{Yes}} \| {{Yes}} \| {{Yes}} \| {{dunno}} \| {{Yes}} \| {{Yes}} \| {{No}} \| {{No}} \| {{No}} \| {{No}}

class="wikitable sortable sticky-header-multi" style="width:100%; text-align:left;"

|+ Random distributions provided by software libraries

Library

! Beta

! Cauchy

! F

! Gamma

! Gumbel

! Laplace

! Multivariate hypergeometric

! Multivariate normal

! Negative binomial

! Noncentral chi-squared

! Noncentral F

! Normal

! Pareto

! Poisson

! Power

! Wald

! Zeta

NumPy

| {{Yes}}

GNU Scientific Library{{cite web | title=Random Number Distributions - GSL 2.7 documentation | website=The GNU Operating System and the Free Software Movement | url=https://www.gnu.org/software/gsl/doc/html/randist.html | access-date=2022-08-18}}

| {{Yes}}

| {{No}}

| {{Yes}}

| {{No}}

| {{Yes}}

| {{dunno}}

| {{Yes}}

| {{No}}

Footnotes

Literature

Devroye, L. (1986) [https://www.proquest.com/openview/84d724fc612fbfb1b7c6b7976198e5ff/1?pq-origsite=gscholar&cbl=30748 Non-Uniform Random Variate Generation]. New York: Springer
Fishman, G.S. (1996) [https://books.google.com/books?id=dogHCAAAQBAJ Monte Carlo. Concepts, Algorithms, and Applications]. New York: Springer
Hörmann, W.; J Leydold, G Derflinger (2004,2011) [https://books.google.com/books?id=tiTyCAAAQBAJ Automatic Nonuniform Random Variate Generation]. Berlin: Springer.
Knuth, D.E. (1997) The Art of Computer Programming, Vol. 2 Seminumerical Algorithms, Chapter 3.4.1 (3rd edition).
Ripley, B.D. (1987) Stochastic Simulation. Wiley.

Category:Pseudorandom number generators

Category:Non-uniform random numbers

non-uniform random variate generation

Finite discrete distributions

Continuous distributions

Software libraries

See also

Footnotes

Literature