Quadratic form (statistics)

In multivariate statistics, if \varepsilon is a vector of n random variables and \Lambda is an n × n symmetric matrix, then the scalar quantity \varepsilon^T\Lambda\varepsilon is known as a quadratic form in \varepsilon.

Expectation

It can be shown that{{cite web|last=Bates|first=Douglas|title=Quadratic Forms of Random Variables|url=http://www.stat.wisc.edu/~st849-1/lectures/Ch02.pdf|work=STAT 849 lectures|access-date=August 21, 2011}}

:\operatorname{E}\left[\varepsilon^T\Lambda\varepsilon\right]=\operatorname{tr}\left[\Lambda \Sigma\right] + \mu^T\Lambda\mu

where \mu and \Sigma are the expected value and variance-covariance matrix of \varepsilon, respectively, and tr denotes the trace of a matrix. This result depends only on the existence of \mu and \Sigma; in particular, normality of \varepsilon is not required.
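The identity can be checked exactly on a small non-Gaussian example. The following is a minimal sketch in Python using only the standard library; the three-point distribution and the matrix `lam` are hypothetical values chosen for illustration, and exact rational arithmetic is used so that both sides can be compared without rounding.

```python
from fractions import Fraction as F

def quad_form(x, lam):
    """Return x^T lam x for a vector x and a square matrix lam."""
    n = len(x)
    return sum(x[i] * lam[i][j] * x[j] for i in range(n) for j in range(n))

# Hypothetical discrete distribution on R^2: (support point, probability).
support = [((F(0), F(1)), F(1, 2)),
           ((F(2), F(-1)), F(1, 4)),
           ((F(-1), F(3)), F(1, 4))]
lam = [[F(2), F(1)],
       [F(1), F(3)]]  # symmetric matrix Lambda (illustrative values)
n = 2

# Mean vector mu and covariance matrix Sigma = E[x x^T] - mu mu^T.
mu = [sum(p * x[i] for x, p in support) for i in range(n)]
sigma = [[sum(p * x[i] * x[j] for x, p in support) - mu[i] * mu[j]
          for j in range(n)] for i in range(n)]

# Left-hand side: expectation of the quadratic form, computed directly.
lhs = sum(p * quad_form(x, lam) for x, p in support)

# Right-hand side: tr(Lambda Sigma) + mu^T Lambda mu.
trace_term = sum(lam[i][j] * sigma[j][i] for i in range(n) for j in range(n))
rhs = trace_term + quad_form(mu, lam)

print(lhs, rhs)  # both sides equal 9
```

Because the distribution here is discrete, the agreement of the two sides illustrates that only the first two moments of \varepsilon enter the formula.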

A book treatment of the topic of quadratic forms in random variables is that of Mathai and Provost.{{cite book | title=Quadratic Forms in Random Variables | publisher=CRC Press |author1=Mathai, A. M. |author2=Provost, Serge B. |name-list-style=amp | year=1992 | page=424 | isbn=978-0824786915}}

= Proof =

Since the quadratic form is a scalar quantity, \varepsilon^T\Lambda\varepsilon = \operatorname{tr}(\varepsilon^T\Lambda\varepsilon).

Next, by the cyclic property of the trace operator,

: \operatorname{E}[\operatorname{tr}(\varepsilon^T\Lambda\varepsilon)] = \operatorname{E}[\operatorname{tr}(\Lambda\varepsilon\varepsilon^T)].

Since the trace is a linear function of the entries of the matrix, it follows from the linearity of the expectation operator that

: \operatorname{E}[\operatorname{tr}(\Lambda\varepsilon\varepsilon^T)] = \operatorname{tr}(\Lambda \operatorname{E}(\varepsilon\varepsilon^T)).

A standard property of variances, \Sigma = \operatorname{E}(\varepsilon\varepsilon^T) - \mu\mu^T, then tells us that this is

: \operatorname{tr}(\Lambda (\Sigma + \mu \mu^T)).

Using the linearity of the trace and applying its cyclic property again, we get

: \operatorname{tr}(\Lambda\Sigma) + \operatorname{tr}(\Lambda \mu \mu^T) = \operatorname{tr}(\Lambda\Sigma) + \operatorname{tr}(\mu^T\Lambda\mu) = \operatorname{tr}(\Lambda\Sigma) + \mu^T\Lambda\mu.

Variance in the Gaussian case

In general, the variance of a quadratic form depends greatly on the distribution of \varepsilon. However, if \varepsilon does follow a multivariate normal distribution, the variance of the quadratic form becomes particularly tractable. Assume for the moment that \Lambda is a symmetric matrix. Then,

:\operatorname{var} \left[\varepsilon^T\Lambda\varepsilon\right] = 2\operatorname{tr}\left[\Lambda \Sigma\Lambda \Sigma\right] + 4\mu^T\Lambda\Sigma\Lambda\mu.{{Cite book |title=Linear models in statistics |last=Rencher |first=Alvin C. |date=2008 |publisher=Wiley-Interscience |last2=Schaalje |first2=G. Bruce. |isbn=9780471754985 |edition=2nd |location=Hoboken, N.J. |oclc=212120778}}

In fact, this can be generalized to find the covariance between two quadratic forms on the same \varepsilon (once again, \Lambda_1 and \Lambda_2 must both be symmetric):

:\operatorname{cov}\left[\varepsilon^T\Lambda_1\varepsilon,\varepsilon^T\Lambda_2\varepsilon\right]=2\operatorname{tr}\left[\Lambda _1\Sigma\Lambda_2 \Sigma\right] + 4\mu^T\Lambda_1\Sigma\Lambda_2\mu.{{cite book |last1=Graybill |first1=Franklin A. |title=Matrices with applications in statistics |publisher=Wadsworth |location=Belmont, Calif. |isbn=0534980384 |page=367 |edition=2.}}

In addition, a quadratic form in a multivariate normal vector follows a generalized chi-squared distribution.
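The variance and covariance formulas of this section translate directly into code. The sketch below is a plain-Python illustration, not a reference implementation; the helper names `matmul`, `trace`, `quad_form_cov` and `quad_form_var` are hypothetical, and the function assumes Gaussian \varepsilon and symmetric matrices without checking either condition. It is exercised on the familiar special case \Lambda = \Sigma = I, where the quadratic form is a (noncentral) chi-squared variable.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

def quad_form_cov(lam1, lam2, sigma, mu):
    """2 tr(lam1 sigma lam2 sigma) + 4 mu^T lam1 sigma lam2 mu.

    Assumes x ~ N(mu, sigma) and that lam1, lam2 are symmetric.
    """
    a = matmul(lam1, sigma)            # lam1 sigma
    b = matmul(lam2, sigma)            # lam2 sigma
    trace_term = 2 * trace(matmul(a, b))
    m = matmul(a, lam2)                # lam1 sigma lam2
    size = len(mu)
    mean_term = 4 * sum(mu[i] * m[i][j] * mu[j]
                        for i in range(size) for j in range(size))
    return trace_term + mean_term

def quad_form_var(lam, sigma, mu):
    """Variance as the covariance of the quadratic form with itself."""
    return quad_form_cov(lam, lam, sigma, mu)

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Zero mean: x^T x is chi-squared with 3 degrees of freedom, variance 2 * 3.
central = quad_form_var(identity, identity, [0, 0, 0])

# Mean (1, 2, 0): variance 2 * 3 + 4 * (1 + 4).
shifted = quad_form_var(identity, identity, [1, 2, 0])

print(central, shifted)  # 6 26
```

Defining the variance through the covariance function keeps the two formulas in one place, since the variance is the special case \Lambda_1 = \Lambda_2.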

= Computing the variance in the non-symmetric case =

The case for general \Lambda can be derived by noting that

:\varepsilon^T\Lambda^T\varepsilon=\varepsilon^T\Lambda\varepsilon

so

:\varepsilon^T\Lambda\varepsilon=\varepsilon^T\left(\Lambda+\Lambda^T\right)\varepsilon/2=\varepsilon^T\tilde{\Lambda}\varepsilon

is a quadratic form in the symmetric matrix \tilde{\Lambda}=\left(\Lambda+\Lambda^T\right)/2, so the mean and variance expressions are the same, provided \Lambda is replaced by \tilde{\Lambda} therein.
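A small example shows why the replacement matters for the variance. The sketch below, in plain Python with hypothetical helper names, takes the non-symmetric matrix with a single off-diagonal entry 2, so that the quadratic form is 2\varepsilon_1\varepsilon_2; for independent standard normal components its variance is 4. It assumes \Sigma = I and \mu = 0, in which case the Gaussian variance formula reduces to 2\operatorname{tr}(\Lambda\Lambda).

```python
def symmetrize(lam):
    """Return (lam + lam^T) / 2."""
    n = len(lam)
    return [[(lam[i][j] + lam[j][i]) / 2 for j in range(n)] for i in range(n)]

def quad_form(x, lam):
    """Return x^T lam x."""
    n = len(x)
    return sum(x[i] * lam[i][j] * x[j] for i in range(n) for j in range(n))

def var_formula_standard_normal(lam):
    """2 tr(lam lam): the Gaussian variance formula with Sigma = I, mu = 0."""
    n = len(lam)
    return 2 * sum(lam[i][k] * lam[k][i] for i in range(n) for k in range(n))

lam = [[0, 2],
       [0, 0]]               # non-symmetric: x^T lam x = 2 * x1 * x2
lam_sym = symmetrize(lam)    # [[0, 1], [1, 0]] as floats

# The value of the quadratic form is unchanged by symmetrization.
x = [3, -5]
same_value = quad_form(x, lam) == quad_form(x, lam_sym)

# The variance formula is only valid for the symmetrized matrix.
naive = var_formula_standard_normal(lam)        # 0, which is wrong
correct = var_formula_standard_normal(lam_sym)  # 4.0, the true variance

print(same_value, naive, correct)  # True 0 4.0
```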

Examples of quadratic forms

In the setting where one has a set of observations y and an operator matrix H, the residual sum of squares can be written as a quadratic form in y:

:\textrm{RSS}=y^T(I-H)^T (I-H)y.

For procedures where the matrix H is symmetric and idempotent, and the errors are Gaussian with covariance matrix \sigma^2I, \textrm{RSS}/\sigma^2 has a chi-squared distribution with k degrees of freedom and noncentrality parameter \lambda, where

:k=\operatorname{tr}\left[(I-H)^T(I-H)\right]

:\lambda=\mu^T(I-H)^T(I-H)\mu/(2\sigma^2)

may be found by matching the mean and variance of a noncentral chi-squared random variable to the expressions given in the first two sections. If Hy estimates \mu with no bias, then the noncentrality \lambda is zero and \textrm{RSS}/\sigma^2 follows a central chi-squared distribution.
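As a concrete illustration, consider the simplest symmetric idempotent choice, H = \mathbf{1}\mathbf{1}^T/n, which replaces every observation by the sample mean (an intercept-only fit). The following sketch in plain Python uses hypothetical data and helper names; it checks that the quadratic form reproduces the usual sum of squared deviations, that k equals n - 1, and that the quadratic form in \mu appearing in the noncentrality vanishes when the mean vector is constant.

```python
from fractions import Fraction as F

def residual_maker(n):
    """Return I - H for the intercept-only matrix H = (1/n) 1 1^T."""
    return [[(F(1) if i == j else F(0)) - F(1, n) for j in range(n)]
            for i in range(n)]

def gram(m):
    """Return m^T m."""
    n = len(m)
    return [[sum(m[k][i] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def quad_form(x, a):
    """Return x^T a x."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))

y = [F(1), F(4), F(7), F(4)]     # hypothetical observations
n = len(y)
a = gram(residual_maker(n))      # (I - H)^T (I - H)

# RSS as a quadratic form, compared with the sum of squared deviations.
rss = quad_form(y, a)
ybar = sum(y) / n
direct = sum((v - ybar) ** 2 for v in y)

# Degrees of freedom k = tr[(I - H)^T (I - H)] = n - 1.
k = sum(a[i][i] for i in range(n))

# With a constant mean vector, H reproduces mu, so the quadratic form
# mu^T (I - H)^T (I - H) mu in the noncentrality is zero.
mu = [F(5)] * n
mu_term = quad_form(mu, a)

print(rss, direct, k, mu_term)  # 18 18 3 0
```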
