Scatter matrix

{{Short description|Concept in probability theory}}

: For the notion in quantum mechanics, see scattering matrix.

In multivariate statistics and probability theory, the scatter matrix is a statistic used to estimate the covariance matrix, for instance that of a multivariate normal distribution.

Definition

Given n samples of m-dimensional data, represented as the m-by-n matrix X=[\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_n], the sample mean is

:\overline{\mathbf{x}} = \frac{1}{n}\sum_{j=1}^n \mathbf{x}_j

where \mathbf{x}_j is the j-th column of X.{{Cite web |last=Raghavan |date=2018-08-16 |title=Scatter matrix, Covariance and Correlation Explained |url=https://medium.com/@raghavan99o/scatter-matrix-covariance-and-correlation-explained-14921741ca56 |access-date=2022-12-28 |website=Medium |language=en}}

The scatter matrix is the m-by-m positive semi-definite matrix

:S = \sum_{j=1}^n (\mathbf{x}_j-\overline{\mathbf{x}})(\mathbf{x}_j-\overline{\mathbf{x}})^T = \sum_{j=1}^n (\mathbf{x}_j-\overline{\mathbf{x}})\otimes(\mathbf{x}_j-\overline{\mathbf{x}}) = \left( \sum_{j=1}^n \mathbf{x}_j \mathbf{x}_j^T \right) - n \overline{\mathbf{x}} \overline{\mathbf{x}}^T

where (\cdot)^T denotes matrix transpose,{{Cite web |last=Raghavan |date=2018-08-16 |title=Scatter matrix, Covariance and Correlation Explained |url=https://medium.com/@raghavan99o/scatter-matrix-covariance-and-correlation-explained-14921741ca56 |access-date=2022-12-28 |website=Medium |language=en}} and each product of a column vector with a transposed column vector is an outer product, also written with \otimes. The scatter matrix may be expressed more succinctly as

:S = X\,C_n\,X^T

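where C_n = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^T is the n-by-n centering matrix, with I_n the n-by-n identity matrix and \mathbf{1} the column vector of n ones.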
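The three expressions for S given above can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming the data are stored as an m-by-n array whose columns are the samples; the function names are illustrative and not taken from any library.

```python
import numpy as np

def scatter_matrix(X):
    """Sum of outer products of the mean-centered columns of X (m-by-n)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=1, keepdims=True)  # subtract the sample mean from each column
    return centered @ centered.T

def scatter_matrix_centering(X):
    """Same matrix via S = X C_n X^T, with C_n the n-by-n centering matrix."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    C_n = np.eye(n) - np.ones((n, n)) / n
    return X @ C_n @ X.T

def scatter_matrix_raw_moments(X):
    """Same matrix via sum_j x_j x_j^T minus n times the outer product of the mean."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    mean = X.mean(axis=1, keepdims=True)
    return X @ X.T - n * (mean @ mean.T)

# Illustrative data: n = 50 samples of m = 3 dimensions (assumed values).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))
S = scatter_matrix(X)
```

The explicit centering matrix costs O(n^2) memory, so it is mainly useful for checking the identity; subtracting the mean directly is usually preferable for large n.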

Application

The maximum likelihood estimate, given n samples, for the covariance matrix of a multivariate normal distribution can be expressed as the normalized scatter matrix

:C_{ML}=\frac{1}{n}S.{{cite thesis |last=Liu |first=Zhedong |date=April 2019 |title=Robust Estimation of Scatter Matrix, Random Matrix Theory and an Application to Spectrum Sensing |url=https://repository.kaust.edu.sa/bitstream/handle/10754/652444/Thesis.pdf?sequence=1&isAllowed=y |type=Master of Science |publisher=King Abdullah University of Science and Technology}}
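As an illustration, the normalized scatter matrix can be compared with NumPy's built-in covariance estimate. This is a sketch under the assumption that the samples are the columns of an m-by-n array; `ml_covariance` is a hypothetical helper name, while `np.cov` with `bias=True` normalizes by n rather than n − 1.

```python
import numpy as np

def ml_covariance(X):
    """Scatter matrix of the columns of X divided by the number of samples n."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    centered = X - X.mean(axis=1, keepdims=True)
    S = centered @ centered.T  # scatter matrix
    return S / n

# Illustrative data: n = 200 samples of m = 2 dimensions (assumed values).
rng = np.random.default_rng(1)
X = rng.normal(size=(2, 200))
C_ml = ml_covariance(X)

# np.cov treats rows as variables and columns as observations by default,
# which matches the m-by-n layout used here.
C_np = np.cov(X, bias=True)
```

With the default `bias=False`, `np.cov` divides by n − 1 instead, so it returns S/(n − 1) rather than the maximum likelihood estimate.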

When the columns of X are independently sampled from a multivariate normal distribution, S has a Wishart distribution with n − 1 degrees of freedom.
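One consequence that can be checked by simulation: a Wishart distribution with n − 1 degrees of freedom and scale matrix Σ has mean (n − 1)Σ, so the average of many simulated scatter matrices should approach that value. The sketch below is a rough Monte Carlo check, not a test of the full distribution; the covariance `Sigma`, the sample size and the number of trials are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed true covariance and sizes for the illustration.
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
n = 10          # samples per data set
trials = 20000  # number of simulated data sets

# samples has shape (trials, n, m): each trial is one data set of n samples.
samples = rng.multivariate_normal(np.zeros(2), Sigma, size=(trials, n))

# Center each data set by its own sample mean, then sum the outer products.
centered = samples - samples.mean(axis=1, keepdims=True)
S_all = np.einsum('tni,tnj->tij', centered, centered)

# The Monte Carlo average of S should be close to (n - 1) * Sigma.
mean_S = S_all.mean(axis=0)
expected = (n - 1) * Sigma
```

Note that each data set here stores samples as rows, the transpose of the m-by-n convention used above; the scatter matrix itself is unaffected by that layout choice.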

See also

References

{{reflist}}

{{DEFAULTSORT:Scatter Matrix}}

Category:Covariance and correlation

Category:Matrices (mathematics)

{{Statistics-stub}}

{{matrix-stub}}