Control variates

{{Short description|Technique for increasing the precision of estimates in Monte Carlo experiments}}

The control variates method is a variance reduction technique used in Monte Carlo methods. It exploits information about the errors in estimates of known quantities to reduce the error of an estimate of an unknown quantity.{{cite journal|last1=Lemieux|first1=C.|title=Control Variates|journal=Wiley StatsRef: Statistics Reference Online|date=2017|pages=1–8|doi=10.1002/9781118445112.stat07947|isbn=9781118445112}}{{cite book|last=Glasserman|first=P.|year=2004|title=Monte Carlo Methods in Financial Engineering|location=New York|publisher=Springer|isbn=0-387-00451-3|page=185}}{{cite journal|last1=Botev|first1=Z.|last2=Ridder|first2=A.|title=Variance Reduction|journal=Wiley StatsRef: Statistics Reference Online|date=2017|pages=1–6|doi=10.1002/9781118445112.stat07975|isbn=9781118445112|hdl=1959.4/unsworks_50616|hdl-access=free}}

Underlying principle

Let the unknown parameter of interest be \mu, and assume we have a statistic m such that the expected value of m is \mu: \mathbb{E}\left[m\right]=\mu, i.e. m is an unbiased estimator for \mu. Suppose we calculate another statistic t such that \mathbb{E}\left[t\right]=\tau is a known value. Then

:m^\star = m + c\left(t-\tau\right) \,

is also an unbiased estimator for \mu for any choice of the coefficient c.

The variance of the resulting estimator m^{\star} is

:\textrm{Var}\left(m^{\star}\right)=\textrm{Var}\left(m\right) + c^2\,\textrm{Var}\left(t\right) + 2c\,\textrm{Cov}\left(m,t\right).

Differentiating the above expression with respect to c and setting the derivative 2c\,\textrm{Var}\left(t\right) + 2\,\textrm{Cov}\left(m,t\right) equal to zero shows that the choice of coefficient

:c^\star = - \frac{\textrm{Cov}\left(m,t\right)}{\textrm{Var}\left(t\right)}

minimizes the variance of m^{\star}. (Note that c^\star is the negative of the slope coefficient obtained from the least squares regression of m on t.) With this choice,

:\begin{align}

\textrm{Var}\left(m^{\star}\right) & =\textrm{Var}\left(m\right) - \frac{\left[\textrm{Cov}\left(m,t\right)\right]^2}{\textrm{Var}\left(t\right)} \\

& = \left(1-\rho_{m,t}^2\right)\textrm{Var}\left(m\right)

\end{align}

where

:\rho_{m,t}=\textrm{Corr}\left(m,t\right) \,

is the correlation coefficient of m and t. The greater the value of \vert\rho_{m,t}\vert, the greater the variance reduction achieved.

In the case that \textrm{Cov}\left(m,t\right), \textrm{Var}\left(t\right), and/or \rho_{m,t}\; are unknown, they can be estimated across the Monte Carlo replicates. Estimating c^\star this way is equivalent to solving a least squares problem (regressing the replicates of m on those of t); hence this technique is also known as regression sampling.
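A minimal Python sketch of this regression-sampling step follows (illustrative only; the function name and arguments are hypothetical and not taken from the cited sources):

<syntaxhighlight lang="python">
import numpy as np

def control_variate_estimate(m_samples, t_samples, tau):
    """Regression-sampling estimate of mu from paired replicates of m and t.

    m_samples : replicates whose mean estimates the unknown quantity mu
    t_samples : replicates of the control variate, with known mean tau
    """
    m_samples = np.asarray(m_samples, dtype=float)
    t_samples = np.asarray(t_samples, dtype=float)

    # Estimate c* = -Cov(m, t) / Var(t) from the sample moments;
    # this is the (sign-flipped) slope of the regression of m on t.
    cov_mt = np.cov(m_samples, t_samples, ddof=1)[0, 1]
    c_star = -cov_mt / np.var(t_samples, ddof=1)

    # Control-variate estimator: mean(m) + c* (mean(t) - tau).
    return np.mean(m_samples) + c_star * (np.mean(t_samples) - tau)
</syntaxhighlight>

Estimating c^\star from the same replicates used to form the estimate introduces a small bias of order 1/n, which is usually negligible compared with the variance reduction achieved.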

When the expectation of the control variable, \mathbb{E}\left[t\right]=\tau, is not known analytically, it is still possible to increase the precision in estimating \mu (for a given fixed simulation budget), provided that two conditions are met: 1) evaluating t is significantly cheaper than computing m; 2) the magnitude of the correlation coefficient \vert\rho_{m,t}\vert is close to unity.

Example

We would like to estimate

:I = \int_0^1 \frac{1}{1+x} \, \mathrm{d}x

using Monte Carlo integration. This integral is the expected value of f(U), where

:f(U) = \frac{1}{1+U}

and U follows the uniform distribution on [0, 1].

Using a sample of size n, denote the points in the sample as u_1, \ldots, u_n. Then the estimate is given by

:I \approx \frac{1}{n} \sum_i f(u_i).

Now we introduce g(U) = 1+U as a control variate with a known expected value \mathbb{E}\left[g\left(U\right)\right]=\int_0^1 (1+x) \, \mathrm{d}x=\tfrac{3}{2} and combine the two into a new estimate

:I \approx \frac{1}{n} \sum_i f(u_i)+c\left(\frac{1}{n}\sum_i g(u_i) -\tfrac{3}{2}\right).

Using n=1500 realizations and an estimated optimal coefficient c^\star \approx 0.4773, we obtain the following results:

{| class="wikitable"
|-
|
| align="right" | Estimate
| align="right" | Variance
|-
| Classical estimate
| align="right" | 0.69475
| align="right" | 0.01947
|-
| Control variates
| align="right" | 0.69295
| align="right" | 0.00060
|}

The variance was significantly reduced after using the control variates technique. (The exact result is I=\ln 2 \approx 0.69314718.)
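A short Python sketch reproducing this example (illustrative only; the sample size matches the text, but the random seed is arbitrary and the exact numerical output will differ from the table above):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, for reproducibility
n = 1500
u = rng.uniform(0.0, 1.0, size=n)       # U ~ Uniform[0, 1]

f = 1.0 / (1.0 + u)                     # integrand samples; E[f(U)] = I
g = 1.0 + u                             # control variate; E[g(U)] = 3/2

# Classical Monte Carlo estimate of the integral.
plain = f.mean()

# Estimated optimal coefficient c* = -Cov(f, g) / Var(g).
c_star = -np.cov(f, g, ddof=1)[0, 1] / np.var(g, ddof=1)

# Control-variate estimate.
cv = f.mean() + c_star * (g.mean() - 1.5)

print(plain, cv, np.log(2))             # both estimates approach ln 2
</syntaxhighlight>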

See also


Notes

References

* Ross, Sheldon M. (2002). Simulation, 3rd edition. {{ISBN|978-0-12-598053-1}}
* Law, Averill M.; Kelton, W. David (2000). Simulation Modeling and Analysis, 3rd edition. {{ISBN|0-07-116537-1}}
* Meyn, S. P. (2007). Control Techniques for Complex Networks. Cambridge University Press. {{ISBN|978-0-521-88441-9}}. [https://web.archive.org/web/20100619011046/https://netfiles.uiuc.edu/meyn/www/spm_files/CTCN/CTCN.html Downloadable draft] (Section 11.4: Control variates and shadow functions)

Category:Monte Carlo methods

Category:Statistical randomness

Category:Computational statistics

Category:Variance reduction