Control variates

{{Short description|Technique for increasing the precision of estimates in Monte Carlo experiments}}

The control variates method is a variance reduction technique used in Monte Carlo methods. It exploits information about the errors in estimates of known quantities to reduce the error of an estimate of an unknown quantity.{{cite journal|last1=Lemieux|first1=C.|title=Control Variates|journal=Wiley StatsRef: Statistics Reference Online|date=2017|pages=1–8|doi=10.1002/9781118445112.stat07947|isbn=9781118445112}}{{cite book|last=Glasserman|first=P.|year=2004|title=Monte Carlo Methods in Financial Engineering|location=New York|publisher=Springer|isbn=0-387-00451-3|page=185}}{{cite journal|last1=Botev|first1=Z.|last2=Ridder|first2=A.|title=Variance Reduction|journal=Wiley StatsRef: Statistics Reference Online|date=2017|pages=1–6|doi=10.1002/9781118445112.stat07975|isbn=9781118445112|hdl=1959.4/unsworks_50616|hdl-access=free}}

Underlying principle

Let the unknown parameter of interest be \mu, and assume we have a statistic m such that the expected value of m is \mu: \mathbb{E}\left[m\right]=\mu, i.e. m is an unbiased estimator for \mu. Suppose we calculate another statistic t such that \mathbb{E}\left[t\right]=\tau is a known value. Then

:m^\star = m + c\left(t-\tau\right) \,

is also an unbiased estimator for \mu for any choice of the coefficient c.

The variance of the resulting estimator m^{\star} is

:\textrm{Var}\left(m^{\star}\right)=\textrm{Var}\left(m\right) + c^2\,\textrm{Var}\left(t\right) + 2c\,\textrm{Cov}\left(m,t\right).

Differentiating the above expression with respect to c and setting the derivative 2c\,\textrm{Var}\left(t\right) + 2\,\textrm{Cov}\left(m,t\right) equal to zero shows that the choice of coefficient

:c^\star = - \frac{\textrm{Cov}\left(m,t\right)}{\textrm{Var}\left(t\right)}

minimizes the variance of m^{\star}. (Note that c^\star is the negative of the slope coefficient obtained from the least squares regression of m on t.) With this choice,

:\begin{align}

\textrm{Var}\left(m^{\star}\right) & =\textrm{Var}\left(m\right) - \frac{\left[\textrm{Cov}\left(m,t\right)\right]^2}{\textrm{Var}\left(t\right)} \\

& = \left(1-\rho_{m,t}^2\right)\textrm{Var}\left(m\right)

\end{align}

where

:\rho_{m,t}=\textrm{Corr}\left(m,t\right) \,

is the correlation coefficient of m and t. The greater the value of \vert\rho_{m,t}\vert, the greater the variance reduction achieved.

In the case that \textrm{Cov}\left(m,t\right), \textrm{Var}\left(t\right), and/or \rho_{m,t}\; are unknown, they can be estimated across the Monte Carlo replicates. Estimating c^\star this way is equivalent to solving a least squares problem (regressing the replicates of m on those of t); hence this technique is also known as regression sampling.
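A minimal Python sketch of this regression-sampling step follows (illustrative only; the function name and arguments are hypothetical and not taken from the cited sources):

<syntaxhighlight lang="python">
import numpy as np

def control_variate_estimate(m_samples, t_samples, tau):
    """Regression-sampling estimate of mu from paired replicates of m and t.

    m_samples : replicates whose mean estimates the unknown quantity mu
    t_samples : replicates of the control variate, with known mean tau
    """
    m_samples = np.asarray(m_samples, dtype=float)
    t_samples = np.asarray(t_samples, dtype=float)

    # Estimate c* = -Cov(m, t) / Var(t) from the sample moments;
    # this is the (sign-flipped) slope of the regression of m on t.
    cov_mt = np.cov(m_samples, t_samples, ddof=1)[0, 1]
    c_star = -cov_mt / np.var(t_samples, ddof=1)

    # Control-variate estimator: mean(m) + c* (mean(t) - tau).
    return np.mean(m_samples) + c_star * (np.mean(t_samples) - tau)
</syntaxhighlight>

Estimating c^\star from the same replicates used to form the estimate introduces a small bias of order 1/n, which is usually negligible compared with the variance reduction achieved.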

When the expectation of the control variable, \mathbb{E}\left[t\right]=\tau, is not known analytically, it is still possible to increase the precision in estimating \mu (for a given fixed simulation budget), provided that two conditions are met: 1) evaluating t is significantly cheaper than computing m; 2) the magnitude of the correlation coefficient \vert\rho_{m,t}\vert is close to unity.

Example

We would like to estimate

:I = \int_0^1 \frac{1}{1+x} \, \mathrm{d}x

using Monte Carlo integration. This integral is the expected value of f(U), where

:f(U) = \frac{1}{1+U}

and U follows the uniform distribution on [0, 1].

Using a sample of size n, denote the points in the sample as u_1, \ldots, u_n. Then the estimate is given by

:I \approx \frac{1}{n} \sum_i f(u_i).

Now we introduce g(U) = 1+U as a control variate with a known expected value \mathbb{E}\left[g\left(U\right)\right]=\int_0^1 (1+x) \, \mathrm{d}x=\tfrac{3}{2} and combine the two into a new estimate

:I \approx \frac{1}{n} \sum_i f(u_i)+c\left(\frac{1}{n}\sum_i g(u_i) -\tfrac{3}{2}\right).

Using n=1500 realizations and an estimated optimal coefficient c^\star \approx 0.4773, we obtain the following results:

{| class="wikitable"
|-
|
| align="right" | Estimate
| align="right" | Variance
|-
| Classical estimate
| align="right" | 0.69475
| align="right" | 0.01947
|-
| Control variates
| align="right" | 0.69295
| align="right" | 0.00060
|}

The variance was significantly reduced after using the control variates technique. (The exact result is I=\ln 2 \approx 0.69314718.)
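A short Python sketch reproducing this example (illustrative only; the sample size matches the text, but the random seed is arbitrary and the exact numerical output will differ from the table above):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, for reproducibility
n = 1500
u = rng.uniform(0.0, 1.0, size=n)       # U ~ Uniform[0, 1]

f = 1.0 / (1.0 + u)                     # integrand samples; E[f(U)] = I
g = 1.0 + u                             # control variate; E[g(U)] = 3/2

# Classical Monte Carlo estimate of the integral.
plain = f.mean()

# Estimated optimal coefficient c* = -Cov(f, g) / Var(g).
c_star = -np.cov(f, g, ddof=1)[0, 1] / np.var(g, ddof=1)

# Control-variate estimate.
cv = f.mean() + c_star * (g.mean() - 1.5)

print(plain, cv, np.log(2))             # both estimates approach ln 2
</syntaxhighlight>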

See also


Notes

References

* Ross, Sheldon M. (2002). Simulation, 3rd edition. {{ISBN|978-0-12-598053-1}}
* Law, Averill M.; Kelton, W. David (2000). Simulation Modeling and Analysis, 3rd edition. {{ISBN|0-07-116537-1}}
* Meyn, S. P. (2007). Control Techniques for Complex Networks. Cambridge University Press. {{ISBN|978-0-521-88441-9}}. [https://web.archive.org/web/20100619011046/https://netfiles.uiuc.edu/meyn/www/spm_files/CTCN/CTCN.html Downloadable draft] (Section 11.4: Control variates and shadow functions)

Category:Monte Carlo methods

Category:Statistical randomness

Category:Computational statistics

Category:Variance reduction