Tucker decomposition

{{Short description|Tensor decomposition}}

In mathematics, Tucker decomposition decomposes a tensor into a set of matrices and one small core tensor. It is named after Ledyard R. Tucker{{Cite journal

| author = Ledyard R. Tucker

| title = Some mathematical notes on three-mode factor analysis

| journal = Psychometrika

| volume = 31

| issue = 3

| date=September 1966

| doi = 10.1007/BF02289464

| pages = 279–311

| pmid = 5221127

}}

although it goes back to Hitchcock in 1927.{{Cite journal

| author = F. L. Hitchcock

| title = The expression of a tensor or a polyadic as a sum of products

| journal = Journal of Mathematics and Physics

| volume = 6

| pages = 164–189

| year = 1927

| issue = 1–4

| doi = 10.1002/sapm192761164

}}

Initially described as a three-mode extension of factor analysis and principal component analysis, it may be generalized to higher-mode analysis, which is also called higher-order singular value decomposition (HOSVD) or the M-mode SVD. The algorithm to which the literature typically refers when discussing the Tucker decomposition or the HOSVD is the M-mode SVD algorithm introduced by Vasilescu and Terzopoulos,{{Cite conference|author=M. A. O. Vasilescu, D. Terzopoulos|year=2002|title=Multilinear Analysis of Image Ensembles: TensorFaces|conference=Proc. 7th European Conference on Computer Vision (ECCV'02)|location=Copenhagen, Denmark|url=https://www.cs.ucla.edu/~maov/tensorfaces/eccv02_corrected.pdf}} but misattributed to Tucker or De Lathauwer et al.

It may be regarded as a more flexible PARAFAC (parallel factor analysis) model. In PARAFAC the core tensor is restricted to be "diagonal".

In practice, Tucker decomposition is used as a modelling tool. For instance, it is used to model three-way (or higher-way) data by means of relatively small numbers of components for each of the three or more modes, and the components are linked to each other by a three- (or higher-) way core array. The model parameters are estimated in such a way that, given fixed numbers of components, the modelled data optimally resemble the actual data in the least-squares sense. The model gives a summary of the information in the data, in the same way as principal component analysis does for two-way data.

For a 3rd-order tensor T \in F^{n_{1} \times n_{2} \times n_{3}}, where F is either \mathbb{R} or \mathbb{C}, the Tucker decomposition can be written as follows,

T = \mathcal{T} \times_{1} U^{(1)} \times_{2} U^{(2)} \times_{3} U^{(3)}

where \mathcal{T} \in F^{d_{1} \times d_{2} \times d_{3}} is the core tensor, a 3rd-order tensor that contains the 1-mode, 2-mode and 3-mode singular values of T, defined as the Frobenius norms of the 1-mode, 2-mode and 3-mode slices of \mathcal{T} respectively. U^{(1)} \in F^{d_{1} \times n_{1}}, U^{(2)} \in F^{d_{2} \times n_{2}} and U^{(3)} \in F^{d_{3} \times n_{3}} are matrices with orthonormal rows, and are unitary when d_{k} = n_{k}. The k-mode product (k = 1, 2, 3) of \mathcal{T} by U^{(k)} is denoted \mathcal{T} \times_{k} U^{(k)}, with entries

:\begin{align}

(\mathcal{T} \times_{1} U^{(1)})(i_{1}, j_{2}, j_{3}) &= \sum_{j_{1}=1}^{d_{1}} \mathcal{T}(j_{1}, j_{2}, j_{3})U^{(1)}(j_{1}, i_{1}) \\

(\mathcal{T} \times_{2} U^{(2)})(j_{1}, i_{2}, j_{3}) &= \sum_{j_{2}=1}^{d_{2}} \mathcal{T}(j_{1}, j_{2}, j_{3})U^{(2)}(j_{2}, i_{2}) \\

(\mathcal{T} \times_{3} U^{(3)})(j_{1}, j_{2}, i_{3}) &= \sum_{j_{3}=1}^{d_{3}} \mathcal{T}(j_{1}, j_{2}, j_{3})U^{(3)}(j_{3}, i_{3})

\end{align}
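As an illustration, the k-mode product above can be computed in a few lines of NumPy. This is a minimal sketch rather than library code; the function name mode_k_product is ours, and the mode index k is 0-based as usual in Python.

<syntaxhighlight lang="python">
import numpy as np

def mode_k_product(core, U, k):
    """k-mode product as defined above: contracts mode k of core with the
    rows of U, i.e. out[..., i, ...] = sum_j core[..., j, ...] * U[j, i]."""
    # tensordot contracts axis k of core with axis 0 of U and appends
    # the resulting axis (of length U.shape[1]) at the end...
    out = np.tensordot(core, U, axes=(k, 0))
    # ...so move that axis back into position k.
    return np.moveaxis(out, -1, k)

# Example: a 2 x 3 x 4 core and a 1-mode factor with d_1 = 2, n_1 = 5.
core = np.random.rand(2, 3, 4)
U1 = np.random.rand(2, 5)
print(mode_k_product(core, U1, 0).shape)  # (5, 3, 4)
</syntaxhighlight>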

Altogether, the decomposition may also be written more directly as

:T(i_{1}, i_{2}, i_{3}) = \sum_{j_{1}=1}^{d_{1}} \sum_{j_{2}=1}^{d_{2}} \sum_{j_{3}=1}^{d_{3}} \mathcal{T}(j_{1}, j_{2}, j_{3}) \, U^{(1)}(j_{1}, i_{1}) \, U^{(2)}(j_{2}, i_{2}) \, U^{(3)}(j_{3}, i_{3})

Taking d_i = n_i for all i is always sufficient to represent T exactly, but often T can be compressed or efficiently approximated by choosing d_i < n_i. A common choice is d_1 = d_2 = d_3 = \min(n_1, n_2, n_3), which can be effective when the differences between the dimension sizes are large.
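As a sketch of how such a decomposition can be computed, the following NumPy code performs mode-wise SVDs in the spirit of the M-mode SVD/HOSVD algorithm discussed above, reusing mode_k_product from the earlier sketch. It assumes a real tensor, and the names hosvd and tucker_reconstruct are ours. The truncated result is a good but not least-squares-optimal approximation; reaching the optimum would require iterative refinement such as higher-order orthogonal iteration.

<syntaxhighlight lang="python">
import numpy as np

def hosvd(T, ranks):
    """Tucker decomposition of a real 3rd-order tensor via mode-wise SVDs.
    Returns (core, [U1, U2, U3]) with each U_k of shape (d_k, n_k), matching
    the convention in the text (orthonormal rows)."""
    Us = []
    for k, d in enumerate(ranks):
        # Mode-k unfolding: bring axis k to the front, flatten the rest.
        unfolding = np.moveaxis(T, k, 0).reshape(T.shape[k], -1)
        # Keep the d leading left singular vectors, stored as rows.
        U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        Us.append(U[:, :d].T)
    # Core tensor: contract each mode of T with the columns of U_k.
    core = T
    for k, Uk in enumerate(Us):
        core = mode_k_product(core, Uk.T, k)
    return core, Us

def tucker_reconstruct(core, Us):
    """Evaluate core x_1 U1 x_2 U2 x_3 U3, i.e. the triple sum above."""
    T = core
    for k, Uk in enumerate(Us):
        T = mode_k_product(T, Uk, k)
    return T

T = np.random.rand(10, 12, 14)
core, Us = hosvd(T, (10, 12, 14))  # full ranks: exact up to roundoff
print(np.allclose(tucker_reconstruct(core, Us), T))  # True
core2, Us2 = hosvd(T, (5, 6, 7))   # d_i < n_i: compressed approximation
</syntaxhighlight>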

There are two special cases of Tucker decomposition:

* Tucker1: if U^{(2)} and U^{(3)} are identity, then T = \mathcal{T} \times_{1} U^{(1)}, as sketched below.
* Tucker2: if U^{(3)} is identity, then T = \mathcal{T} \times_{1} U^{(1)} \times_{2} U^{(2)}.
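In the Tucker1 case only mode 1 is compressed, which amounts to an SVD (equivalently, PCA) of the mode-1 unfolding of T. A brief sketch, again reusing mode_k_product and with all names ours:

<syntaxhighlight lang="python">
import numpy as np

T = np.random.rand(4, 5, 6)
unfolding = T.reshape(4, -1)              # mode-1 unfolding of T
U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
U1 = U[:, :2].T                           # keep d_1 = 2 components
core = mode_k_product(T, U1.T, 0)         # core tensor, shape (2, 5, 6)
T1 = mode_k_product(core, U1, 0)          # Tucker1 approximation of T
</syntaxhighlight>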

RESCAL decomposition{{Cite conference |last1=Nickel |first1=Maximilian |last2=Tresp |first2=Volker |last3=Kriegel |first3=Hans-Peter |title=A Three-Way Model for Collective Learning on Multi-Relational Data |conference=ICML |date=28 June 2011 |volume=11 |pages=809–816}} can be seen as a special case of Tucker where U^{(3)} is identity and U^{(1)} = U^{(2)}.
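In this special case every frontal slice of T takes the form A^{T} \mathcal{T}(:,:,k) A for a single shared factor A, which is how RESCAL is usually written. A minimal reconstruction sketch (the name rescal_reconstruct is ours; fitting the factors, which RESCAL does by alternating least squares, is omitted):

<syntaxhighlight lang="python">
import numpy as np

def rescal_reconstruct(core, A):
    """Tucker with U3 = identity and U1 = U2 = A (A of shape d x n, as in
    the text): each frontal slice is T[:, :, k] = A.T @ core[:, :, k] @ A.
    Reuses mode_k_product from the first sketch."""
    return mode_k_product(mode_k_product(core, A, 0), A, 1)

core = np.random.rand(3, 3, 7)   # d x d x (number of relations/slices)
A = np.random.rand(3, 5)         # shared entity factor: d = 3, n = 5
T = rescal_reconstruct(core, A)  # shape (5, 5, 7)
print(np.allclose(T[:, :, 0], A.T @ core[:, :, 0] @ A))  # True
</syntaxhighlight>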


References

{{Reflist}}

Category:Dimension reduction

{{statistics-stub}}