matrix t-distribution

{{Probability distribution|

name =Matrix t|

type =density|

pdf_image =|

cdf_image =|

notation = ${\rm T}_{n,p}(\nu,\mathbf{M},\boldsymbol\Sigma, \boldsymbol\Omega)$ |

parameters =

$\mathbf{M}$ location (real $n\times p$ matrix)

$\boldsymbol\Omega$ scale (positive-definite real $p\times p$ matrix)

$\boldsymbol\Sigma$ scale (positive-definite real $n\times n$ matrix)

$\nu>0$ degrees of freedom (real)|

support = $\mathbf{X} \in\mathbb{R}^{n\times p}$ |

pdf = $\frac{\Gamma_p\left(\frac{\nu+n+p-1}{2}\right)}{(\pi)^\frac{np}{2} \Gamma_p\left(\frac{\nu+p-1}{2}\right)} |\boldsymbol\Omega|^{-\frac{n}{2}} |\boldsymbol\Sigma|^{-\frac{p}{2}}$

: $\times \left|\mathbf{I}_n + \boldsymbol\Sigma^{-1}(\mathbf{X} - \mathbf{M})\boldsymbol\Omega^{-1}(\mathbf{X}-\mathbf{M})^{\rm T}\right|^{-\frac{\nu+n+p-1}{2}}$ |

cdf =No analytic expression|

mean = $\mathbf{M}$ if $\nu > 1$ , else undefined|

mode = $\mathbf{M}$ |

variance = $\mathrm{cov}(\mathrm{vec}(\mathbf{X}))=\frac{\boldsymbol\Sigma \otimes \boldsymbol\Omega}{\nu-2}$ if $\nu > 2$ , else undefined|

kurtosis =|

entropy =|

mgf =|

char =see below|

}}

In statistics, the matrix t-distribution (or matrix variate t-distribution) is the generalization of the multivariate t-distribution from vectors to matrices. Zhu, Shenghuo; Yu, Kai; Gong, Yihong (2007). [https://proceedings.neurips.cc/paper_files/paper/2007/file/061412e4a03c02f9902576ec55ebbe77-Paper.pdf "Predictive Matrix-Variate t Models"]. In J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, NIPS '07: Advances in Neural Information Processing Systems 20, pages 1721–1728. MIT Press, Cambridge, MA, 2008. The notation is changed slightly in this article for consistency with the matrix normal distribution article. {{Cite book|last=Gupta, Arjun K and Nagar, Daya K|title=Matrix variate distributions|publisher=CRC Press|year=1999|pages=Chapter 4}}

The matrix t-distribution shares the same relationship with the multivariate t-distribution that the matrix normal distribution shares with the multivariate normal distribution: If the matrix has only one row, or only one column, the distributions become equivalent to the corresponding (vector-)multivariate distribution. The matrix t-distribution is the compound distribution that results from an infinite mixture of a matrix normal distribution with an inverse Wishart distribution placed over either of its covariance matrices, and the multivariate t-distribution can be generated in a similar way.
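The compound construction can be sketched numerically. The following is a minimal NumPy sketch rather than a reference implementation. The function name `sample_matrix_t` is hypothetical. Two assumptions are not stated in this article: the inverse Wishart prior is placed on the $n\times n$ row covariance with scale $\boldsymbol\Sigma$ and $\nu+n-1$ degrees of freedom (a choice worked out by integrating the matrix normal density against that prior), and $\nu$ is a positive integer so that the Wishart draw can be built from Gaussian vectors.

```python
import numpy as np

def sample_matrix_t(nu, M, Sigma, Omega, size, rng):
    """Draw `size` n x p matrix t variates by compounding (illustrative sketch).

    Assumptions (not from the article): the row covariance S follows an
    inverse Wishart law with scale Sigma and nu + n - 1 degrees of freedom,
    and nu is a positive integer.
    """
    n, p = M.shape
    m = nu + n - 1                        # inverse Wishart degrees of freedom
    # Wishart(Sigma^{-1}, m) draw: W = G^T G, rows of G ~ N(0, Sigma^{-1}).
    L_prec = np.linalg.cholesky(np.linalg.inv(Sigma))
    G = rng.standard_normal((size, m, n)) @ L_prec.T
    W = np.swapaxes(G, 1, 2) @ G
    S = np.linalg.inv(W)                  # S ~ inverse Wishart(Sigma, m)
    # Matrix normal draw given S: X = M + chol(S) Z chol(Omega)^T.
    L_S = np.linalg.cholesky(S)
    L_Omega = np.linalg.cholesky(Omega)
    Z = rng.standard_normal((size, n, p))
    return M + L_S @ Z @ L_Omega.T

rng = np.random.default_rng(0)
M = np.zeros((2, 3))
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega = np.diag([1.0, 2.0, 3.0])
X = sample_matrix_t(10, M, Sigma, Omega, size=50_000, rng=rng)

# Monte Carlo estimate of E[(X - M)(X - M)^T] against Sigma tr(Omega) / (nu - 2).
D = X - M
second_moment = (D @ np.swapaxes(D, 1, 2)).mean(axis=0)
expected = Sigma * np.trace(Omega) / (10 - 2)
```

Under these assumptions, `second_moment` should approach `expected` as the sample size grows, which matches the second-order expectation given in the Properties section.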

In a Bayesian analysis of a multivariate linear regression model based on the matrix normal distribution, the matrix t-distribution is the posterior predictive distribution.

Definition

For a matrix t-distribution, the probability density function at the point $\mathbf{X}$ of an $n\times p$ space is

: $f(\mathbf{X} ; \nu,\mathbf{M},\boldsymbol\Sigma, \boldsymbol\Omega) = K
\times \left|\mathbf{I}_n + \boldsymbol\Sigma^{-1}(\mathbf{X} - \mathbf{M})\boldsymbol\Omega^{-1}(\mathbf{X}-\mathbf{M})^{\rm T}\right|^{-\frac{\nu+n+p-1}{2}},$

where the normalizing constant $K$ is given by

: $K =
\frac{\Gamma_p\left(\frac{\nu+n+p-1}{2}\right)}{(\pi)^\frac{np}{2} \Gamma_p\left(\frac{\nu+p-1}{2}\right)} |\boldsymbol\Omega|^{-\frac{n}{2}} |\boldsymbol\Sigma|^{-\frac{p}{2}}.$

Here $\Gamma_p$ is the multivariate gamma function.
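The density is usually easier to evaluate on the log scale. The sketch below uses NumPy and the standard-library `math` module. The function names `multigammaln` and `matrix_t_logpdf` are illustrative and not taken from any particular library. It assumes $\boldsymbol\Sigma$ is $n\times n$ and $\boldsymbol\Omega$ is $p\times p$, as required for the matrix product in the definition to be conformable.

```python
import math
import numpy as np

def multigammaln(a, d):
    """Log of the multivariate gamma function Gamma_d(a)."""
    return d * (d - 1) / 4 * math.log(math.pi) + sum(
        math.lgamma(a - j / 2) for j in range(d)
    )

def matrix_t_logpdf(X, nu, M, Sigma, Omega):
    """Log-density of the matrix t-distribution at the n x p point X."""
    n, p = X.shape
    D = X - M
    # log K, the normalizing constant from the definition.
    log_k = (
        multigammaln((nu + n + p - 1) / 2, p)
        - n * p / 2 * math.log(math.pi)
        - multigammaln((nu + p - 1) / 2, p)
        - n / 2 * np.linalg.slogdet(Omega)[1]
        - p / 2 * np.linalg.slogdet(Sigma)[1]
    )
    # Sigma^{-1} D Omega^{-1} D^T via linear solves rather than explicit inverses.
    inner = np.linalg.solve(Sigma, D) @ np.linalg.solve(Omega, D.T)
    logdet = np.linalg.slogdet(np.eye(n) + inner)[1]
    return log_k - (nu + n + p - 1) / 2 * logdet

X = np.array([[0.5, -1.0, 0.2], [1.5, 0.3, -0.7]])
M = np.zeros((2, 3))
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega = np.diag([1.0, 2.0, 3.0])
log_density = matrix_t_logpdf(X, 5.0, M, Sigma, Omega)
```

For $n = p = 1$ this expression coincides with the log-density of a univariate Student t-distribution with location $\mathbf{M}$ and scale $\sqrt{\boldsymbol\Sigma\boldsymbol\Omega/\nu}$, which gives a simple sanity check.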

Properties

If $\mathbf{X} \sim \mathcal{T}_{n\times p}(\nu, \mathbf{M}, \mathbf{\Sigma}, \mathbf{\Omega})$ , then we have the following properties:

=Expected values=

The mean, or expected value, exists if $\nu > 1$ and is:

: $E[\mathbf{X}] = \mathbf{M}$

and we have the following second-order expectations, if $\nu > 2$ :

: $E[(\mathbf{X} - \mathbf{M})(\mathbf{X} - \mathbf{M})^{T}]
= \frac{\mathbf{\Sigma}\operatorname{tr}(\mathbf{\Omega})}{\nu-2}$

: $E[(\mathbf{X} - \mathbf{M})^{T} (\mathbf{X} - \mathbf{M})]
= \frac{\mathbf{\Omega}\operatorname{tr}(\mathbf{\Sigma}) }{\nu-2}$

where $\operatorname{tr}$ denotes trace.

More generally, for appropriately dimensioned matrices $\mathbf{A}$ ($p\times p$), $\mathbf{B}$ ($n\times n$) and $\mathbf{C}$ ($p\times n$):

: $\begin{align}
E[(\mathbf{X}- \mathbf{M})\mathbf{A}(\mathbf{X}- \mathbf{M})^{T}]
&= \frac{\mathbf{\Sigma}\operatorname{tr}(\mathbf{A}^T\mathbf{\Omega})}{\nu - 2} \\
E[(\mathbf{X}- \mathbf{M})^T\mathbf{B}(\mathbf{X}- \mathbf{M})]
&= \frac{\mathbf{\Omega}\operatorname{tr}(\mathbf{B}^T \mathbf{\Sigma})}{\nu - 2} \\
E[(\mathbf{X}- \mathbf{M})\mathbf{C}(\mathbf{X}- \mathbf{M})]
&= \frac{\mathbf{\Sigma}\mathbf{C}^T\mathbf{\Omega}}{\nu - 2}
\end{align}$

=Transformation=

Transpose transform:

: $\mathbf{X}^T \sim\mathcal{T}_{p\times n}(\nu, \mathbf{M}^T, \mathbf{\Omega}, \mathbf{\Sigma})$

Linear transform: let A (r-by-n) be of full rank r ≤ n and B (p-by-s) be of full rank s ≤ p; then:

: $\mathbf{AXB}\sim \mathcal{T}_{r\times s}(\nu,\mathbf{AMB}, \mathbf{A\Sigma A}^T, \mathbf{B}^T\mathbf{\Omega B} )$

The characteristic function and various other properties can be derived from the re-parameterized formulation (see below).

Re-parameterized matrix ''t''-distribution

{{Probability distribution|

name =Re-parameterized matrix t|

type =density|

pdf_image =|

cdf_image =|

notation = ${\rm T}_{n,p}(\alpha,\beta,\mathbf{M},\boldsymbol\Sigma, \boldsymbol\Omega)$ |

parameters =

$\mathbf{M}$ location (real $n\times p$ matrix)

$\boldsymbol\Omega$ scale (positive-definite real $p\times p$ matrix)

$\boldsymbol\Sigma$ scale (positive-definite real $n\times n$ matrix)

$\alpha > (p-1)/2$ shape parameter

$\beta > 0$ scale parameter |

support = $\mathbf{X} \in\mathbb{R}^{n\times p}$ |

pdf = $\frac{\Gamma_p(\alpha+n/2)}{(2\pi/\beta)^\frac{np}{2} \Gamma_p(\alpha)} |\boldsymbol\Omega|^{-\frac{n}{2}} |\boldsymbol\Sigma|^{-\frac{p}{2}}$

: $\times \left|\mathbf{I}_n + \frac{\beta}{2}\boldsymbol\Sigma^{-1}(\mathbf{X} - \mathbf{M})\boldsymbol\Omega^{-1}(\mathbf{X}-\mathbf{M})^{\rm T}\right|^{-(\alpha+n/2)}$

$\Gamma_p$ is the multivariate gamma function. |

cdf =No analytic expression|

mean = $\mathbf{M}$ if $\alpha > p/2$ , else undefined|

median =|

mode =|

variance = $\frac{2(\boldsymbol\Sigma \otimes \boldsymbol\Omega)}{\beta(2\alpha-p-1)}$ if $\alpha > (p+1)/2$ , else undefined|

skewness =|

kurtosis =|

entropy =|

mgf =|

char =see below|

}}

An alternative parameterization of the matrix t-distribution uses two parameters $\alpha$ and $\beta$ in place of $\nu$. Iranmanesh, Anis, M. Arashi and S. M. M. Tabatabaey (2010). [http://www.ijmsi.ir/browse.php?a_id=139&slc_lang=en&sid=1&ftxt=1 "On Conditional Applications of Matrix Variate Normal Distribution"]. Iranian Journal of Mathematical Sciences and Informatics, 5:2, pp. 33–43.

This formulation reduces to the standard matrix t-distribution with $\beta=2, \alpha=\frac{\nu+p-1}{2}.$
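The reduction can be verified by direct substitution. The following worked steps use only the formulas given in this article, with $\beta = 2$ and $\alpha = \frac{\nu+p-1}{2}$ inserted into the re-parameterized density and variance:

```latex
\begin{align}
\alpha + \tfrac{n}{2}
  &= \tfrac{\nu+p-1}{2} + \tfrac{n}{2} = \tfrac{\nu+n+p-1}{2}, \\
\left(\tfrac{2\pi}{\beta}\right)^{\frac{np}{2}}
  &= \pi^{\frac{np}{2}}, \\
\tfrac{\beta}{2}\,\boldsymbol\Sigma^{-1}(\mathbf{X}-\mathbf{M})\boldsymbol\Omega^{-1}(\mathbf{X}-\mathbf{M})^{\rm T}
  &= \boldsymbol\Sigma^{-1}(\mathbf{X}-\mathbf{M})\boldsymbol\Omega^{-1}(\mathbf{X}-\mathbf{M})^{\rm T}, \\
\frac{2}{\beta(2\alpha-p-1)}
  &= \frac{1}{\nu-2}.
\end{align}
```

The first three lines recover the exponent, the normalizing constant and the determinant term of the standard density, and the last recovers the variance factor. The existence conditions map in the same way: $\alpha > (p-1)/2$, $\alpha > p/2$ and $\alpha > (p+1)/2$ correspond to $\nu > 0$, $\nu > 1$ and $\nu > 2$ respectively.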

This formulation of the matrix t-distribution can be derived as the compound distribution that results from an infinite mixture of a matrix normal distribution with an inverse multivariate gamma distribution placed over either of its covariance matrices.

=Properties=

If $\mathbf{X} \sim {\rm T}_{n,p}(\alpha,\beta,\mathbf{M},\boldsymbol\Sigma, \boldsymbol\Omega)$ then

: $\mathbf{X}^{\rm T} \sim {\rm T}_{p,n}(\alpha,\beta,\mathbf{M}^{\rm T},\boldsymbol\Omega, \boldsymbol\Sigma).$

The property above comes from Sylvester's determinant theorem:

: $\det\left(\mathbf{I}_n + \frac{\beta}{2}\boldsymbol\Sigma^{-1}(\mathbf{X} - \mathbf{M})\boldsymbol\Omega^{-1}(\mathbf{X}-\mathbf{M})^{\rm T}\right) =$

:: $\det\left(\mathbf{I}_p + \frac{\beta}{2}\boldsymbol\Omega^{-1}(\mathbf{X}^{\rm T} - \mathbf{M}^{\rm T})\boldsymbol\Sigma^{-1}(\mathbf{X}^{\rm T}-\mathbf{M}^{\rm T})^{\rm T}\right) .$
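The determinant identity is easy to confirm numerically. The NumPy sketch below uses arbitrary illustrative values: the helper `random_spd`, the seed, the dimensions and the value of `beta` are all assumptions chosen for the example, and `D` stands in for $\mathbf{X} - \mathbf{M}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, beta = 3, 5, 0.7

def random_spd(k):
    """Random symmetric positive-definite k x k matrix (illustrative helper)."""
    A = rng.standard_normal((k, k))
    return A @ A.T + k * np.eye(k)

Sigma, Omega = random_spd(n), random_spd(p)
D = rng.standard_normal((n, p))           # plays the role of X - M

# n x n determinant from the density of X.
lhs = np.linalg.det(
    np.eye(n) + beta / 2 * np.linalg.inv(Sigma) @ D @ np.linalg.inv(Omega) @ D.T
)
# p x p determinant from the density of X^T.
rhs = np.linalg.det(
    np.eye(p) + beta / 2 * np.linalg.inv(Omega) @ D.T @ np.linalg.inv(Sigma) @ D
)
```

The two values `lhs` and `rhs` agree up to floating-point error even though the matrices involved have different sizes.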

If $\mathbf{X} \sim {\rm T}_{n,p}(\alpha,\beta,\mathbf{M},\boldsymbol\Sigma, \boldsymbol\Omega)$ and $\mathbf{A}(n\times n)$ and $\mathbf{B}(p\times p)$ are nonsingular matrices then

: $\mathbf{AXB} \sim {\rm T}_{n,p}(\alpha,\beta,\mathbf{AMB},\mathbf{A}\boldsymbol\Sigma\mathbf{A}^{\rm T}, \mathbf{B}^{\rm T}\boldsymbol\Omega\mathbf{B})
.$
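One way to see why the transformed variable stays in the same family is that the determinant term of the density is unchanged when $\mathbf{X} - \mathbf{M}$, $\boldsymbol\Sigma$ and $\boldsymbol\Omega$ are transformed together, while the scale determinants only pick up the factors $\det(\mathbf{A})^2$ and $\det(\mathbf{B})^2$. The following NumPy sketch checks this on random inputs. The names `kernel_det` and `random_spd`, the seed and the dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, beta = 3, 4, 1.5

def random_spd(k):
    """Random symmetric positive-definite k x k matrix (illustrative helper)."""
    G = rng.standard_normal((k, k))
    return G @ G.T + k * np.eye(k)

def kernel_det(D, Sigma, Omega):
    """det(I_n + (beta/2) Sigma^{-1} D Omega^{-1} D^T) for an n x p matrix D."""
    inner = np.linalg.solve(Sigma, D) @ np.linalg.solve(Omega, D.T)
    return np.linalg.det(np.eye(D.shape[0]) + beta / 2 * inner)

Sigma, Omega = random_spd(n), random_spd(p)
D = rng.standard_normal((n, p))           # plays the role of X - M
A = rng.standard_normal((n, n))           # nonsingular with probability one
B = rng.standard_normal((p, p))

before = kernel_det(D, Sigma, Omega)
after = kernel_det(A @ D @ B, A @ Sigma @ A.T, B.T @ Omega @ B)

# The scale determinant changes by det(A)^2, matching the Jacobian factor.
scale_ratio = np.linalg.det(A @ Sigma @ A.T) / np.linalg.det(Sigma)
jacobian_sq = np.linalg.det(A) ** 2
```

Here `before` and `after` coincide, and `scale_ratio` equals `jacobian_sq`, up to floating-point error.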

The characteristic function is

: $\phi_T(\mathbf{Z}) = \frac{\exp({\rm tr}(i\mathbf{Z}'\mathbf{M}))|\boldsymbol\Omega|^\alpha}{\Gamma_p(\alpha)(2\beta)^{\alpha p}} |\mathbf{Z}'\boldsymbol\Sigma\mathbf{Z}|^\alpha B_\alpha\left(\frac{1}{2\beta}\mathbf{Z}'\boldsymbol\Sigma\mathbf{Z}\boldsymbol\Omega\right),$

where

: $B_\delta(\mathbf{WZ}) = |\mathbf{W}|^{-\delta} \int_{\mathbf{S}>0} \exp\left({\rm tr}(-\mathbf{SW}-\mathbf{S^{-1}Z})\right)|\mathbf{S}|^{-\delta-\frac12(p+1)}d\mathbf{S},$

and where $B_\delta$ is the type-two Bessel function of Herz of a matrix argument.

Notes

External links

[https://github.com/zweng/rmg A C++ library for random matrix generation]

Category:Random matrices

Category:Multivariate continuous distributions