Normal-inverse-Wishart distribution

{{Short description|Multivariate parameter family of continuous probability distributions}}

{{Probability distribution |

name =normal-inverse-Wishart|

type =density|

pdf_image =|

cdf_image =|

notation =(\boldsymbol\mu,\boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu)|

parameters =\boldsymbol\mu_0\in\mathbb{R}^D\, location (vector of real)
\lambda > 0\, (real)
\boldsymbol\Psi \in\mathbb{R}^{D\times D} inverse scale matrix (pos. def.)
\nu > D-1\, (real)|

support =\boldsymbol\mu\in\mathbb{R}^D ; \boldsymbol\Sigma \in\mathbb{R}^{D\times D} covariance matrix (pos. def.)|

pdf =f(\boldsymbol\mu,\boldsymbol\Sigma|\boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu) = \mathcal{N}(\boldsymbol\mu|\boldsymbol\mu_0,\tfrac{1}{\lambda}\boldsymbol\Sigma)\ \mathcal{W}^{-1}(\boldsymbol\Sigma|\boldsymbol\Psi,\nu)|

cdf =|

mean =|

median =|

mode =|

variance =|

skewness =|

kurtosis =|

entropy =|

mgf =|

char =|

}}

In probability theory and statistics, the normal-inverse-Wishart distribution (or Gaussian-inverse-Wishart distribution) is a multivariate four-parameter family of continuous probability distributions. It is the conjugate prior of a multivariate normal distribution with an unknown mean and covariance matrix (the inverse of the precision matrix). Murphy, Kevin P. (2007). "Conjugate Bayesian analysis of the Gaussian distribution." [http://www.cs.ubc.ca/~murphyk/Papers/bayesGauss.pdf]

Definition

Suppose

: \boldsymbol\mu|\boldsymbol\mu_0,\lambda,\boldsymbol\Sigma \sim \mathcal{N}\left(\boldsymbol\mu\Big|\boldsymbol\mu_0,\frac{1}{\lambda}\boldsymbol\Sigma\right)

has a multivariate normal distribution with mean \boldsymbol\mu_0 and covariance matrix \tfrac{1}{\lambda}\boldsymbol\Sigma, where

:\boldsymbol\Sigma|\boldsymbol\Psi,\nu \sim \mathcal{W}^{-1}(\boldsymbol\Sigma|\boldsymbol\Psi,\nu)

has an inverse Wishart distribution. Then (\boldsymbol\mu,\boldsymbol\Sigma) has a normal-inverse-Wishart distribution, denoted as

: (\boldsymbol\mu,\boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu) .

Characterization

=Probability density function=

: f(\boldsymbol\mu,\boldsymbol\Sigma|\boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu) = \mathcal{N}\left(\boldsymbol\mu\Big|\boldsymbol\mu_0,\frac{1}{\lambda}\boldsymbol\Sigma\right) \mathcal{W}^{-1}(\boldsymbol\Sigma|\boldsymbol\Psi,\nu)

Written out in full, the PDF is as follows (Simon J.D. Prince, June 2012. [http://www.computervisionmodels.com/ Computer Vision: Models, Learning, and Inference]. Cambridge University Press. 3.8: "Normal inverse Wishart distribution"):

: f(\boldsymbol\mu,\boldsymbol\Sigma|\boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu) = \frac{\lambda^{D/2}|\boldsymbol\Psi|^{\nu/2}|\boldsymbol\Sigma|^{-\frac{\nu+D+2}{2}}}{(2\pi)^{D/2}2^{\frac{\nu D}{2}}\Gamma_D(\frac{\nu}{2})}\exp\left\{-\frac{1}{2}\mathrm{Tr}(\boldsymbol\Psi\boldsymbol\Sigma^{-1})-\frac{\lambda}{2}(\boldsymbol\mu-\boldsymbol\mu_0)^T\boldsymbol\Sigma^{-1}(\boldsymbol\mu-\boldsymbol\mu_0)\right\}

Here \Gamma_D(\cdot) is the multivariate gamma function and \mathrm{Tr}(\cdot) is the trace of its matrix argument.
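As an illustration, the closed-form density can be evaluated on the log scale and compared with the factorised form (normal density times inverse Wishart density). The following is a minimal sketch, assuming NumPy and SciPy are available; the function name niw_logpdf and the numerical values are hypothetical and chosen only for the example.

```python
import numpy as np
from scipy.special import multigammaln
from scipy.stats import invwishart, multivariate_normal


def niw_logpdf(mu, Sigma, mu0, lam, Psi, nu):
    """Log-density of NIW(mu0, lam, Psi, nu) at (mu, Sigma), from the closed form."""
    D = mu0.shape[0]
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_Psi = np.linalg.slogdet(Psi)
    Sigma_inv = np.linalg.inv(Sigma)
    diff = mu - mu0
    # Log of the normalising constant in front of the exponential.
    log_norm = (
        0.5 * D * np.log(lam)
        + 0.5 * nu * logdet_Psi
        - 0.5 * D * np.log(2.0 * np.pi)
        - 0.5 * nu * D * np.log(2.0)
        - multigammaln(0.5 * nu, D)
    )
    return (
        log_norm
        - 0.5 * (nu + D + 2) * logdet_Sigma
        - 0.5 * np.trace(Psi @ Sigma_inv)
        - 0.5 * lam * (diff @ Sigma_inv @ diff)
    )


# Illustrative (made-up) parameter values and evaluation point.
mu0 = np.array([0.0, 1.0])
lam, nu = 2.0, 5.0
Psi = np.array([[2.0, 0.3], [0.3, 1.0]])
mu = np.array([0.5, 0.5])
Sigma = np.array([[1.0, 0.2], [0.2, 0.8]])

closed_form = niw_logpdf(mu, Sigma, mu0, lam, Psi, nu)

# Factorised form: N(mu | mu0, Sigma / lam) * IW(Sigma | Psi, nu).
factorised = (
    multivariate_normal.logpdf(mu, mean=mu0, cov=Sigma / lam)
    + invwishart.logpdf(Sigma, df=nu, scale=Psi)
)
```

The two values should agree up to floating-point error, since the exponent -\tfrac{\nu+D+2}{2} on |\boldsymbol\Sigma| combines the factor |\boldsymbol\Sigma|^{-1/2} from the normal density with |\boldsymbol\Sigma|^{-(\nu+D+1)/2} from the inverse Wishart density.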

Properties

=Scaling=

=Marginal distributions=

By construction, the marginal distribution over \boldsymbol\Sigma is an inverse Wishart distribution, and the conditional distribution over \boldsymbol\mu given \boldsymbol\Sigma is a multivariate normal distribution. The marginal distribution over \boldsymbol\mu is a multivariate t-distribution.

Posterior distribution of the parameters

Suppose the sampling density is a multivariate normal distribution

:\boldsymbol{y_i}|\boldsymbol\mu,\boldsymbol\Sigma \sim \mathcal{N}_p(\boldsymbol\mu,\boldsymbol\Sigma)

where \boldsymbol{y} is an n\times p matrix and \boldsymbol{y_i} (of length p) is row i of the matrix. Here the dimension p plays the role of D in the definition above.

When the mean and covariance matrix of the sampling distribution are unknown, we can place a normal-inverse-Wishart prior on the mean and covariance parameters jointly:

: (\boldsymbol\mu,\boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu).

The resulting posterior distribution of the mean and covariance matrix is also normal-inverse-Wishart:

: (\boldsymbol\mu,\boldsymbol\Sigma|\boldsymbol y) \sim \mathrm{NIW}(\boldsymbol\mu_n,\lambda_n,\boldsymbol\Psi_n,\nu_n),

where

: \boldsymbol\mu_n = \frac{\lambda\boldsymbol\mu_0 + n \bar{\boldsymbol y}}{\lambda+n}

: \lambda_n = \lambda + n

: \nu_n = \nu + n

: \boldsymbol\Psi_n = \boldsymbol\Psi + \boldsymbol S + \frac{\lambda n}{\lambda+n}(\bar{\boldsymbol y}-\boldsymbol\mu_0)(\bar{\boldsymbol y}-\boldsymbol\mu_0)^T ~~~\mathrm{with}~~ \boldsymbol S = \sum_{i=1}^{n} (\boldsymbol{y_i}-\bar{\boldsymbol y})(\boldsymbol{y_i}-\bar{\boldsymbol y})^T ,

and \bar{\boldsymbol y} = \tfrac{1}{n}\sum_{i=1}^{n}\boldsymbol{y_i} is the sample mean.
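The update equations translate directly into code. The following is a minimal sketch, assuming NumPy and a data matrix with one observation per row; the function name niw_posterior is hypothetical.

```python
import numpy as np


def niw_posterior(Y, mu0, lam, Psi, nu):
    """Return (mu_n, lam_n, Psi_n, nu_n) for an n-by-p data matrix Y."""
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    ybar = Y.mean(axis=0)                # sample mean
    centred = Y - ybar
    S = centred.T @ centred              # scatter matrix about the sample mean
    lam_n = lam + n
    nu_n = nu + n
    mu_n = (lam * mu0 + n * ybar) / lam_n
    d = ybar - mu0
    Psi_n = Psi + S + (lam * n / lam_n) * np.outer(d, d)
    return mu_n, lam_n, Psi_n, nu_n
```

For example, with the two observations (1, 2) and (3, 4), \boldsymbol\mu_0 = \boldsymbol 0, \lambda = 1, \boldsymbol\Psi equal to the identity and \nu = 3, the sketch gives \lambda_n = 3, \nu_n = 5 and \boldsymbol\mu_n = (4/3, 2).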

To sample from the joint posterior of (\boldsymbol\mu,\boldsymbol\Sigma), first draw \boldsymbol\Sigma|\boldsymbol y \sim \mathcal{W}^{-1}(\boldsymbol\Psi_n,\nu_n), then draw \boldsymbol\mu | \boldsymbol{\Sigma,y} \sim \mathcal{N}_p(\boldsymbol\mu_n,\boldsymbol\Sigma/\lambda_n). To draw from the posterior predictive distribution of a new observation, draw \tilde{\boldsymbol y}|\boldsymbol{\mu,\Sigma,y} \sim \mathcal{N}_p(\boldsymbol\mu,\boldsymbol\Sigma), given the already drawn values of \boldsymbol\mu and \boldsymbol\Sigma. Gelman, Andrew, et al. Bayesian data analysis. Vol. 2, p.73. Boca Raton, FL, USA: Chapman & Hall/CRC, 2014.
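The three-stage draw for the posterior predictive distribution can be sketched as follows. This is an illustrative sketch only, assuming NumPy and SciPy; the function name posterior_predictive_draws is hypothetical, and the posterior parameters are taken as given inputs.

```python
import numpy as np
from scipy.stats import invwishart


def posterior_predictive_draws(mu_n, lam_n, Psi_n, nu_n, size, seed=None):
    """Draw `size` new observations by composition: Sigma, then mu, then y."""
    rng = np.random.default_rng(seed)
    draws = np.empty((size, mu_n.shape[0]))
    for k in range(size):
        # Sigma | y ~ inverse Wishart(Psi_n, nu_n)
        Sigma = invwishart.rvs(df=nu_n, scale=Psi_n, random_state=rng)
        # mu | Sigma, y ~ N(mu_n, Sigma / lam_n)
        mu = rng.multivariate_normal(mu_n, Sigma / lam_n)
        # y_new | mu, Sigma ~ N(mu, Sigma)
        draws[k] = rng.multivariate_normal(mu, Sigma)
    return draws
```

Each returned row uses its own freshly drawn pair (\boldsymbol\mu,\boldsymbol\Sigma), so the rows are draws from the predictive distribution with the parameter uncertainty integrated out rather than draws conditional on a single parameter value.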

Generating normal-inverse-Wishart random variates

Generation of random variates is straightforward:

  1. Sample \boldsymbol\Sigma from an inverse Wishart distribution with parameters \boldsymbol\Psi and \nu
  2. Sample \boldsymbol\mu from a multivariate normal distribution with mean \boldsymbol\mu_0 and covariance matrix \tfrac{1}{\lambda}\boldsymbol\Sigma
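The two steps above can be sketched in vectorised form. This is a minimal sketch, assuming NumPy and SciPy; the function name niw_rvs is hypothetical, and step 2 is implemented through a Cholesky factor of each sampled covariance matrix, which is one of several equivalent ways to draw the normal variate.

```python
import numpy as np
from scipy.stats import invwishart


def niw_rvs(mu0, lam, Psi, nu, size=1, seed=None):
    """Draw `size` pairs (mu, Sigma) from NIW(mu0, lam, Psi, nu)."""
    rng = np.random.default_rng(seed)
    D = mu0.shape[0]
    # Step 1: Sigma ~ inverse Wishart(Psi, nu); reshape so size=1 also gives (1, D, D).
    Sigmas = invwishart.rvs(df=nu, scale=Psi, size=size, random_state=rng)
    Sigmas = np.reshape(Sigmas, (size, D, D))
    # Step 2: mu = mu0 + L z / sqrt(lam), where L L^T = Sigma and z ~ N(0, I),
    # so that mu | Sigma ~ N(mu0, Sigma / lam).
    L = np.linalg.cholesky(Sigmas)
    z = rng.standard_normal((size, D))
    mus = mu0 + np.einsum("nij,nj->ni", L, z) / np.sqrt(lam)
    return mus, Sigmas
```

A call such as niw_rvs(np.zeros(2), 2.0, np.eye(2), 5.0, size=1000, seed=0) returns an array of 1000 mean vectors and a matching stack of 1000 covariance matrices.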

Related distributions

Notes

{{reflist}}

References

  • Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. Springer Science+Business Media.
  • Murphy, Kevin P. (2007). "Conjugate Bayesian analysis of the Gaussian distribution." [http://www.cs.ubc.ca/~murphyk/Papers/bayesGauss.pdf]

{{ProbDistributions|multivariate}}

Category:Multivariate continuous distributions

Category:Conjugate prior distributions

Category:Normal distribution