Centering matrix
{{Short description|Kind of matrix}}
{{More citations needed|date=August 2024}}
In mathematics and multivariate statistics, the centering matrix<ref>John I. Marden, Analyzing and Modeling Rank Data, Chapman & Hall, 1995, {{ISBN|0-412-99521-2}}, page 59.</ref> is a symmetric and idempotent matrix which, when multiplied with a vector, has the same effect as subtracting the mean of the components of the vector from every component of that vector.
Definition
The centering matrix of size n is defined as the n-by-n matrix
:<math>C_n = I_n - \tfrac{1}{n} J_n</math>
where <math>I_n</math> is the identity matrix of size n and <math>J_n</math> is an n-by-n matrix of all 1's.
For example
:<math>C_1 = \begin{bmatrix} 0 \end{bmatrix}</math>,
:<math>C_2 = \left[ \begin{array}{rr}
1 & 0 \\
0 & 1
\end{array} \right] - \frac{1}{2}\left[ \begin{array}{rr}
1 & 1 \\
1 & 1
\end{array} \right] = \left[ \begin{array}{rr}
\frac{1}{2} & -\frac{1}{2} \\
-\frac{1}{2} & \frac{1}{2}
\end{array} \right]</math>,
:<math>C_3 = \left[ \begin{array}{rrr}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{array} \right] - \frac{1}{3}\left[ \begin{array}{rrr}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1
\end{array} \right]
= \left[ \begin{array}{rrr}
\frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\
-\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\
-\frac{1}{3} & -\frac{1}{3} & \frac{2}{3}
\end{array} \right]</math>.
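The definition can be sketched in a few lines of Python. This is only an illustration, not a recommended implementation: the helper name <code>centering_matrix</code> is hypothetical, and exact fractions from the standard library are an assumed choice made here so that the entries match the worked examples exactly.

```python
from fractions import Fraction


def centering_matrix(n):
    """Return the n-by-n centering matrix (identity minus 1/n times all-ones).

    Hypothetical helper for illustration; entries are exact fractions.
    """
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


C1 = centering_matrix(1)
C2 = centering_matrix(2)
C3 = centering_matrix(3)

# Print the size-3 matrix row by row.
for row in C3:
    print([str(x) for x in row])
# ['2/3', '-1/3', '-1/3']
# ['-1/3', '2/3', '-1/3']
# ['-1/3', '-1/3', '2/3']
```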
Properties
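Given a column-vector, <math>\mathbf{v}</math>, of size n, the centering property of <math>C_n</math> can be expressed as
:<math>C_n \mathbf{v} = \mathbf{v} - \left(\tfrac{1}{n} J_{n,1}^\textrm{T} \mathbf{v}\right) J_{n,1}</math>
where <math>J_{n,1}</math> is a column vector of ones and <math>\tfrac{1}{n} J_{n,1}^\textrm{T} \mathbf{v}</math> is the mean of the components of <math>\mathbf{v}</math>.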
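As a small numerical illustration of the centering property, the sketch below multiplies a sample vector by the centering matrix and compares the result with direct subtraction of the mean. The helper names <code>centering_matrix</code> and <code>matvec</code> and the sample vector are assumptions introduced for this example only.

```python
from fractions import Fraction


def centering_matrix(n):
    """Hypothetical helper: identity minus 1/n times the all-ones matrix."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


def matvec(A, v):
    """Hypothetical helper: matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


v = [Fraction(x) for x in (2, 4, 9)]      # sample vector, mean 5
mean = sum(v) / len(v)

centered = matvec(centering_matrix(len(v)), v)   # multiply by the matrix
direct = [x - mean for x in v]                   # subtract the mean directly

print([str(x) for x in centered])   # ['-3', '-1', '4']
print(centered == direct)           # True
```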
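<math>C_n</math> is symmetric positive semi-definite.
<math>C_n</math> is idempotent, so that <math>C_n^k = C_n</math>, for <math>k = 1, 2, \ldots</math>. Once the mean has been removed, it is zero and removing it again has no effect.
<math>C_n</math> is singular. The effects of applying the transformation <math>C_n \mathbf{v}</math> cannot be reversed.
<math>C_n</math> has the eigenvalue 1 of multiplicity n − 1 and eigenvalue 0 of multiplicity 1.
<math>C_n</math> has a nullspace of dimension 1, along the vector <math>J_{n,1}</math>.
<math>C_n</math> is an orthogonal projection matrix. That is, <math>C_n \mathbf{v}</math> is a projection of <math>\mathbf{v}</math> onto the (n − 1)-dimensional subspace that is orthogonal to the nullspace <math>J_{n,1}</math>. (This is the subspace of all n-vectors whose components sum to zero.)
The trace of <math>C_n</math> is <math>n(n-1)/n = n-1</math>.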
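Several of these properties can be checked numerically for a particular size. The following sketch does so for n = 4 using exact fractions; it is a spot check under stated assumptions rather than a proof, and the helper names (<code>centering_matrix</code>, <code>matmul</code>, <code>matvec</code>, <code>transpose</code>) are hypothetical.

```python
from fractions import Fraction


def centering_matrix(n):
    """Hypothetical helper: identity minus 1/n times the all-ones matrix."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


n = 4
C = centering_matrix(n)

is_symmetric = C == transpose(C)             # symmetry
is_idempotent = matmul(C, C) == C            # applying it twice changes nothing
trace = sum(C[i][i] for i in range(n))       # expected to equal n - 1

ones_image = matvec(C, [1] * n)              # all-ones vector lies in the nullspace
w = [1, -1, 0, 0]                            # components sum to zero
w_is_fixed = matvec(C, w) == w               # so w is an eigenvector for eigenvalue 1

print(is_symmetric, is_idempotent, trace, ones_image == [0] * n, w_is_fixed)
# True True 3 True True
```

Mapping the all-ones vector to zero is also what makes the matrix singular: two vectors that differ only by a constant offset have the same image.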
Application
Although multiplication by the centering matrix is not a computationally efficient way of removing the mean from a vector, it is a convenient analytical tool. It can be used not only to remove the mean of a single vector, but also of multiple vectors stored in the rows or columns of an m-by-n matrix <math>X</math>.
The left multiplication by <math>C_m</math> subtracts a corresponding mean value from each of the n columns, so that each column of the product <math>C_m X</math> has a zero mean. Similarly, the multiplication by <math>C_n</math> on the right subtracts a corresponding mean value from each of the m rows, and each row of the product <math>X C_n</math> has a zero mean.
The multiplication on both sides creates a doubly centred matrix <math>C_m X C_n</math>, whose row and column means are equal to zero.
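The three products can be illustrated on a small 2-by-3 matrix. In the sketch below, the sample matrix and the helper names <code>centering_matrix</code> and <code>matmul</code> are assumptions made for the example; as noted above, this is an analytical illustration and not an efficient way to centre data.

```python
from fractions import Fraction


def centering_matrix(n):
    """Hypothetical helper: identity minus 1/n times the all-ones matrix."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


def matmul(A, B):
    """Product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


X = [[1, 2, 3],
     [4, 5, 9]]                 # m = 2 rows, n = 3 columns
m, n = len(X), len(X[0])

col_centered = matmul(centering_matrix(m), X)   # left product: column means removed
row_centered = matmul(X, centering_matrix(n))   # right product: row means removed
doubly_centered = matmul(col_centered, centering_matrix(n))

col_sums = [sum(col) for col in zip(*col_centered)]   # all zero
row_sums = [sum(row) for row in row_centered]         # all zero

print([[str(x) for x in row] for row in row_centered])
# [['-1', '0', '1'], ['-2', '-1', '3']]
```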
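The centering matrix provides in particular a succinct way to express the scatter matrix, <math>S = (X - \mu J_{n,1}^\mathrm{T})(X - \mu J_{n,1}^\mathrm{T})^\mathrm{T}</math>, of a data sample <math>X</math>, where <math>\mu = \tfrac{1}{n} X J_{n,1}</math> is the sample mean. The centering matrix allows us to express the scatter matrix more compactly as
:<math>S = X C_n (X C_n)^\mathrm{T} = X C_n C_n X^\mathrm{T} = X C_n X^\mathrm{T}.</math>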
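The agreement between the two expressions for the scatter matrix can be checked on a small sample. The sketch below assumes, consistently with the definition of the sample mean above, that the n observations are stored as the columns of the data matrix; the sample data and helper names are hypothetical.

```python
from fractions import Fraction


def centering_matrix(n):
    """Hypothetical helper: identity minus 1/n times the all-ones matrix."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


X = [[1, 2, 3],
     [4, 5, 9]]                  # 2 variables, n = 3 observations (columns)
n = len(X[0])

# Direct form: subtract the sample mean of each variable, then multiply.
mu = [Fraction(sum(row), n) for row in X]
D = [[x - mu_i for x in row] for row, mu_i in zip(X, mu)]
S_direct = matmul(D, transpose(D))

# Compact form: X times the centering matrix times X transposed.
S_compact = matmul(matmul(X, centering_matrix(n)), transpose(X))

print([[str(x) for x in row] for row in S_compact])   # [['2', '5'], ['5', '14']]
print(S_direct == S_compact)                          # True
```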
<math>C_n</math> is the covariance matrix of the multinomial distribution, in the special case where the parameters of that distribution are <math>k = n</math>, and <math>p_1 = p_2 = \cdots = p_n = \tfrac{1}{n}</math>.
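This special case can be verified entry by entry from the standard multinomial covariance, whose diagonal entries are <math>k p_i (1 - p_i)</math> and whose off-diagonal entries are <math>-k p_i p_j</math>. The sketch below is an illustrative check for n = 5; the function names <code>centering_matrix</code> and <code>multinomial_covariance</code> are hypothetical.

```python
from fractions import Fraction


def centering_matrix(n):
    """Hypothetical helper: identity minus 1/n times the all-ones matrix."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


def multinomial_covariance(k, p):
    """Covariance matrix of a multinomial with k trials and probabilities p.

    Entry (i, j) is k * p_i * (delta_ij - p_j).
    """
    return [[k * p_i * (int(i == j) - p_j) for j, p_j in enumerate(p)]
            for i, p_i in enumerate(p)]


n = 5
cov = multinomial_covariance(n, [Fraction(1, n)] * n)   # k = n, all p_i = 1/n
matches = cov == centering_matrix(n)

print(str(cov[0][0]), str(cov[0][1]), matches)   # 4/5 -1/5 True
```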