Factor regression model

Within statistical factor analysis, the factor regression model,{{cite journal|last=Carvalho|first=Carlos M.|title=High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics|journal=Journal of the American Statistical Association|date=1 December 2008|volume=103|issue=484|pages=1438–1456|doi=10.1198/016214508000000869|pmc=3017385|pmid=21218139}} or hybrid factor model,{{cite journal|last=Meng |first=J. |title=Uncover cooperative gene regulations by microRNAs and transcription factors in glioblastoma using a nonnegative hybrid factor model |journal=International Conference on Acoustics, Speech and Signal Processing |year=2011 |url=http://www.cmsworldwide.com/ICASSP2011/Papers/ViewPapers.asp?PaperNum=4439 |url-status=dead |archiveurl=https://web.archive.org/web/20111123144133/http://www.cmsworldwide.com/ICASSP2011/Papers/ViewPapers.asp?PaperNum=4439 |archivedate=2011-11-23 }} is a special multivariate model with the following form:

: \mathbf{y}_n= \mathbf{A}\mathbf{x}_n+ \mathbf{B}\mathbf{z}_n +\mathbf{c}+\mathbf{e}_n

where,

: \mathbf{y}_n is the n-th G \times 1 (known) observation.

: \mathbf{x}_n is the vector of the L_x (unknown) hidden factors for the n-th sample.

: \mathbf{A} is the (unknown) loading matrix of the hidden factors.

: \mathbf{z}_n is the vector of the L_z (known) design factors for the n-th sample.

: \mathbf{B} is the (unknown) matrix of regression coefficients of the design factors.

: \mathbf{c} is the (unknown) vector of constant terms, or intercepts.

: \mathbf{e}_n is a vector of (unknown) errors, often white Gaussian noise.
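As an illustration of the definitions above, the following minimal sketch draws synthetic data from the model using NumPy. The function name `simulate_factor_regression`, the standard normal draws for the loadings, coefficients and factors, and the `noise_std` parameter are assumptions made for the example; they are not part of the model definition or of the cited papers.

```python
import numpy as np

def simulate_factor_regression(N, G, L_x, L_z, noise_std=0.1, seed=0):
    """Draw N samples from y_n = A x_n + B z_n + c + e_n (illustrative only)."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(G, L_x))            # loading matrix of the hidden factors
    B = rng.normal(size=(G, L_z))            # regression coefficients of the design factors
    c = rng.normal(size=(G, 1))              # intercept vector
    X = rng.normal(size=(L_x, N))            # hidden factors, one column per sample
    Z = rng.normal(size=(L_z, N))            # design factors, one column per sample
    E = noise_std * rng.normal(size=(G, N))  # white Gaussian noise
    Y = A @ X + B @ Z + c + E                # c is broadcast across the N columns
    return Y, X, Z, A, B, c, E

Y, X, Z, A, B, c, E = simulate_factor_regression(N=50, G=6, L_x=2, L_z=3)
```

Each column of `Y` is one G \times 1 observation \mathbf{y}_n. In a real application only `Y` and `Z` would be observed.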

Relationship between factor regression model, factor model and regression model

The factor regression model can be viewed as a combination of the factor analysis model ( \mathbf{y}_n= \mathbf{A}\mathbf{x}_n+ \mathbf{c}+\mathbf{e}_n ) and the regression model ( \mathbf{y}_n= \mathbf{B}\mathbf{z}_n +\mathbf{c}+\mathbf{e}_n ).
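To make the regression special case concrete, the sketch below sets the hidden-factor term to zero, so that the model reduces to \mathbf{y}_n= \mathbf{B}\mathbf{z}_n +\mathbf{c}+\mathbf{e}_n, and then estimates \mathbf{B} and \mathbf{c} by ordinary least squares. Ordinary least squares is chosen here only for illustration and is not the estimation procedure of the cited papers; the dimensions, noise level and variable names are likewise assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
G, L_z, N = 4, 2, 200

B = rng.normal(size=(G, L_z))        # true regression coefficients
c = rng.normal(size=(G, 1))          # true intercept
Z = rng.normal(size=(L_z, N))        # known design factors
E = 0.05 * rng.normal(size=(G, N))   # white Gaussian noise
Y = B @ Z + c + E                    # regression special case (A x_n term absent)

# Append a row of ones so that the intercept is estimated along with B.
Z_aug = np.vstack([Z, np.ones((1, N))])             # shape (L_z + 1, N)

# Solve Z_aug.T @ W.T ~= Y.T in the least-squares sense.
W_T, *_ = np.linalg.lstsq(Z_aug.T, Y.T, rcond=None)
W = W_T.T                                           # shape (G, L_z + 1)
B_hat, c_hat = W[:, :L_z], W[:, [L_z]]
```

When hidden factors are present, this simple fit ignores the \mathbf{A}\mathbf{x}_n term, which is then absorbed into the residuals.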

Alternatively, the model can be viewed as a special kind of factor model, the hybrid factor model

: \begin{align}
\mathbf{y}_n & = \mathbf{A}\mathbf{x}_n+ \mathbf{B}\mathbf{z}_n +\mathbf{c}+\mathbf{e}_n \\
& = \begin{bmatrix} \mathbf{A} & \mathbf{B} \end{bmatrix} \begin{bmatrix} \mathbf{x}_n \\ \mathbf{z}_n \end{bmatrix} +\mathbf{c}+\mathbf{e}_n \\
& = \mathbf{D}\mathbf{f}_n +\mathbf{c}+\mathbf{e}_n
\end{align}

where \mathbf{D}=\begin{bmatrix} \mathbf{A} & \mathbf{B} \end{bmatrix} is the loading matrix of the hybrid factor model and \mathbf{f}_n=\begin{bmatrix} \mathbf{x}_n \\ \mathbf{z}_n \end{bmatrix} is the vector of factors, stacking the unknown hidden factors on top of the known design factors.
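The block-matrix identity behind the hybrid form can be checked numerically. The short sketch below, with arbitrarily chosen dimensions and random values that are assumptions of the example, builds \mathbf{D} by placing \mathbf{A} and \mathbf{B} side by side, builds \mathbf{f}_n by stacking \mathbf{x}_n on top of \mathbf{z}_n, and compares \mathbf{D}\mathbf{f}_n with \mathbf{A}\mathbf{x}_n+\mathbf{B}\mathbf{z}_n.

```python
import numpy as np

rng = np.random.default_rng(2)
G, L_x, L_z = 5, 2, 3

A = rng.normal(size=(G, L_x))      # loadings of the hidden factors
B = rng.normal(size=(G, L_z))      # coefficients of the design factors
x_n = rng.normal(size=(L_x, 1))    # hidden factors of one sample
z_n = rng.normal(size=(L_z, 1))    # design factors of one sample

D = np.hstack([A, B])              # G x (L_x + L_z) hybrid loading matrix
f_n = np.vstack([x_n, z_n])        # (L_x + L_z) x 1 stacked factor vector

# The two forms agree up to floating-point rounding.
same = np.allclose(D @ f_n, A @ x_n + B @ z_n)
```

The first L_x entries of `f_n` are unknown and the remaining L_z are known, which is what distinguishes the hybrid factor model from an ordinary factor model.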

Software

[https://www2.stat.duke.edu/~mw/mwsoftware/BFRM/index.html Open-source software for fitting factor regression models is available].

References