Matrix difference equation

{{short description|Relation of a matrix of variables between two points in time}}

{{further|Linear recurrence with constant coefficients}}

A matrix difference equation is a difference equation in which the value of a vector (or sometimes, a matrix) of variables at one point in time is related to its own value at one or more previous points in time, using matrices.<ref>{{cite book|last1=Cull|first1=Paul|author2-link=Mary Flahive|last2=Flahive|first2=Mary|last3=Robson|first3=Robbie|title=Difference Equations: From Rabbits to Chaos|title-link= Difference Equations: From Rabbits to Chaos |publisher=Springer|date=2005|at=ch. 7|isbn=0-387-23234-6}}</ref><ref>{{cite book|last=Chiang|first=Alpha C.|title=Fundamental Methods of Mathematical Economics|url=https://archive.org/details/fundamentalmetho0000chia_h4v2|url-access=registration|edition=3rd|publisher=McGraw-Hill|date=1984|pages=[https://archive.org/details/fundamentalmetho0000chia_h4v2/page/608 608–612]|isbn=9780070107809 }}</ref> The order of the equation is the maximum time gap between any two indicated values of the variable vector. For example,

:<math>\mathbf x_t = \mathbf{Ax}_{t-1} + \mathbf{Bx}_{t-2}</math>

is a second-order matrix difference equation, in which {{math|x}} is an {{math|n × 1}} vector of variables and {{math|A}} and {{math|B}} are {{math|n × n}} matrices. The equation is homogeneous because it contains no additive constant vector term. The same equation might also be written as

:<math>\mathbf x_{t+2} = \mathbf{Ax}_{t+1} + \mathbf{Bx}_{t}</math>

or as

:<math>\mathbf x_n = \mathbf{Ax}_{n-1} + \mathbf{Bx}_{n-2}</math>

The most commonly encountered matrix difference equations are first-order.

==Nonhomogeneous first-order case and the steady state==

An example of a nonhomogeneous first-order matrix difference equation is

:<math>\mathbf x_t = \mathbf{Ax}_{t-1} + \mathbf{b}</math>

with additive constant vector {{math|b}}. The steady state of this system is a value {{math|x*}} of the vector {{math|x}} from which the system, once there, would not subsequently deviate. {{math|x*}} is found by setting {{math|x<sub>t</sub> {{=}} x<sub>t−1</sub> {{=}} x*}} in the difference equation and solving for {{math|x*}} to obtain

: <math>\mathbf{x}^* = [\mathbf{I}-\mathbf{A}]^{-1}\mathbf{b}</math>

where {{math|I}} is the {{math|n × n}} identity matrix, and where it is assumed that {{math|[I − A]}} is invertible. Then the nonhomogeneous equation can be rewritten in homogeneous form in terms of deviations from the steady state:

: <math>\left[\mathbf{x}_t - \mathbf{x}^*\right] = \mathbf{A}\left[\mathbf{x}_{t-1}-\mathbf{x}^*\right]</math>
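The steady-state formula can be checked numerically. The following NumPy sketch uses illustrative values of {{math|A}} and {{math|b}} (assumptions chosen for the example, not taken from the text) and verifies the fixed-point property:

```python
import numpy as np

# Illustrative A and b (assumed for this sketch, not from the text).
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
b = np.array([1.0, 2.0])

# Steady state x* = [I - A]^{-1} b, assuming I - A is invertible.
x_star = np.linalg.solve(np.eye(2) - A, b)

# Fixed-point check: x* = A x* + b.
assert np.allclose(x_star, A @ x_star + b)
```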

==Stability of the first-order case==

The first-order matrix difference equation {{math|[x<sub>t</sub> − x*] {{=}} A[x<sub>t−1</sub> − x*]}} is stable—that is, {{math|x<sub>t</sub>}} converges asymptotically to the steady state {{math|x*}}—if and only if all eigenvalues of the transition matrix {{math|A}} (whether real or complex) have absolute value less than 1.
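This eigenvalue criterion is straightforward to test numerically. A minimal sketch, with illustrative transition matrices that are assumptions for the example:

```python
import numpy as np

def is_stable(A):
    """Stable iff every eigenvalue of A has modulus < 1."""
    return np.max(np.abs(np.linalg.eigvals(A))) < 1

# Illustrative transition matrices (not from the text).
A_stable = np.array([[0.5, 0.2],
                     [0.1, 0.4]])    # eigenvalues 0.6 and 0.3
A_unstable = np.array([[1.1, 0.0],
                       [0.0, 0.3]])  # eigenvalue 1.1 exceeds 1

print(is_stable(A_stable), is_stable(A_unstable))  # True False
```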

==Solution of the first-order case==

Assume that the equation has been put in the homogeneous form {{math|y<sub>t</sub> {{=}} Ay<sub>t−1</sub>}}. Then we can iterate and substitute repeatedly from the initial condition {{math|y<sub>0</sub>}}, which is the initial value of the vector {{math|y}} and which must be known in order to find the solution:

:<math>\begin{align}
\mathbf y_1 &= \mathbf{Ay}_0 \\
\mathbf y_2 &= \mathbf{Ay}_1=\mathbf A^2 \mathbf y_0 \\
\mathbf y_3 &= \mathbf{Ay}_2=\mathbf A^3 \mathbf y_0
\end{align}</math>

and so forth, so that by mathematical induction the solution in terms of {{mvar|t}} is

:<math>\mathbf y_t=\mathbf A^t \mathbf y_0</math>

Further, if {{math|A}} is diagonalizable, we can rewrite {{math|A}} in terms of its eigenvalues and eigenvectors, giving the solution as

:<math>\mathbf y_t = \mathbf{PD}^{t}\mathbf{P}^{-1} \mathbf y_0,</math>

where {{math|P}} is an {{math|n × n}} matrix whose columns are the eigenvectors of {{math|A}} (assuming the eigenvalues are all distinct) and {{math|D}} is an {{math|n × n}} diagonal matrix whose diagonal elements are the eigenvalues of {{math|A}}. This solution motivates the above stability result: {{math|A<sup>t</sup>}} shrinks to the zero matrix over time if and only if the eigenvalues of {{mvar|A}} are all less than unity in absolute value.
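Both forms of the solution can be compared directly in NumPy; the matrix and initial condition below are illustrative assumptions:

```python
import numpy as np

# Illustrative diagonalizable A and initial condition (assumed).
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
y0 = np.array([1.0, 1.0])
t = 10

# Direct solution y_t = A^t y_0.
y_direct = np.linalg.matrix_power(A, t) @ y0

# Diagonalization A = P D P^{-1} gives A^t = P D^t P^{-1}.
eigvals, P = np.linalg.eig(A)
y_diag = P @ np.diag(eigvals ** t) @ np.linalg.inv(P) @ y0

assert np.allclose(y_direct, y_diag)
```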

==Extracting the dynamics of a single scalar variable from a first-order matrix system==

Starting from the {{math|n}}-dimensional system {{math|y<sub>t</sub> {{=}} Ay<sub>t−1</sub>}}, we can extract the dynamics of one of the state variables, say {{math|y<sub>1</sub>}}. The above solution equation for {{math|y<sub>t</sub>}} shows that the solution for {{math|y<sub>1,t</sub>}} is in terms of the {{mvar|n}} eigenvalues of {{math|A}}. Therefore the equation describing the evolution of {{math|y<sub>1</sub>}} by itself must have a solution involving those same eigenvalues. This description intuitively motivates the equation of evolution of {{math|y<sub>1</sub>}}, which is

: <math>y_{1,t} = a_1 y_{1,t-1} + a_2 y_{1,t-2} + \dots + a_n y_{1,t-n}</math>

where the parameters {{mvar|ai}} are from the characteristic equation of the matrix {{math|A}}:

:<math>\lambda^{n} - a_1 \lambda^{n-1} - a_2 \lambda^{n-2} - \dots - a_n = 0.</math>

Thus each individual scalar variable of an {{mvar|n}}-dimensional first-order linear system evolves according to a univariate {{mvar|n}}th-degree difference equation, which has the same stability property (stable or unstable) as does the matrix difference equation.
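This reduction can be verified numerically: iterate a small system, read off one scalar component, and check that it satisfies the scalar recurrence whose coefficients come from the characteristic polynomial. The matrix and initial condition are illustrative assumptions:

```python
import numpy as np

# Illustrative system (assumed): iterate y_t = A y_{t-1}.
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
y = [np.array([1.0, 1.0])]
for _ in range(10):
    y.append(A @ y[-1])
y1 = [v[0] for v in y]       # the first scalar component over time

# np.poly returns the monic characteristic polynomial
# λ^n + c_1 λ^{n-1} + ... + c_n, so a_i = -c_i in the form above.
a = -np.poly(A)[1:]

# y1 obeys the scalar recurrence y1_t = a_1 y1_{t-1} + a_2 y1_{t-2}.
for s in range(2, len(y1)):
    assert np.isclose(y1[s], a[0] * y1[s-1] + a[1] * y1[s-2])
```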

==Solution and stability of higher-order cases==

Matrix difference equations of higher order—that is, with a time lag longer than one period—can be solved, and their stability analyzed, by converting them into first-order form using a block matrix (matrix of matrices). For example, suppose we have the second-order equation

:<math>\mathbf x_t = \mathbf{Ax}_{t-1} + \mathbf{Bx}_{t-2}</math>

with the variable vector {{math|x}} being {{math|n × 1}} and {{math|A}} and {{math|B}} being {{math|n × n}}. This can be stacked in the form

:<math>\begin{bmatrix}\mathbf{x}_t \\ \mathbf{x}_{t-1} \\ \end{bmatrix} = \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{I} & \mathbf{0} \\ \end{bmatrix} \begin{bmatrix} \mathbf{x}_{t-1} \\ \mathbf{x}_{t-2} \end{bmatrix}</math>

where {{math|I}} is the {{math|n × n}} identity matrix and {{math|0}} is the {{math|n × n}} zero matrix. Then denoting the {{math|2n × 1}} stacked vector of current and once-lagged variables as {{math|z<sub>t</sub>}} and the {{math|2n × 2n}} block matrix as {{math|L}}, we have as before the solution

:<math>\mathbf z_t = \mathbf L^t \mathbf z_0</math>

Also as before, this stacked equation, and thus the original second-order equation, is stable if and only if all eigenvalues of the matrix {{math|L}} are smaller than unity in absolute value.
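The stacking construction can be checked by comparing the iterated block system with direct iteration of the second-order equation. The parameter matrices and initial conditions below are illustrative assumptions:

```python
import numpy as np

# Illustrative n = 2 parameter matrices (assumed for this sketch).
n = 2
A = np.array([[0.3, 0.1],
              [0.0, 0.2]])
B = np.array([[0.1, 0.0],
              [0.05, 0.1]])
x0 = np.array([1.0, 0.0])   # x at time 0
x1 = np.array([0.5, 0.5])   # x at time 1

# Block companion matrix L = [[A, B], [I, 0]].
L = np.block([[A, B],
              [np.eye(n), np.zeros((n, n))]])

# Iterate the stacked first-order system z_t = L z_{t-1}.
z = np.concatenate([x1, x0])   # z_1 = [x_1; x_0]
xs = [x0, x1]
for _ in range(8):
    z = L @ z
    xs.append(z[:n])           # top block of z_t is x_t

# Direct iteration of the second-order equation for comparison.
xd = [x0, x1]
for _ in range(8):
    xd.append(A @ xd[-1] + B @ xd[-2])

assert all(np.allclose(u, v) for u, v in zip(xs, xd))
```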

==Nonlinear matrix difference equations: Riccati equations==

In linear-quadratic-Gaussian control, there arises a nonlinear matrix equation for the reverse evolution of a current-and-future-cost matrix, denoted below as {{math|H}}. This equation is called a discrete dynamic Riccati equation, and it arises when a variable vector evolving according to a linear matrix difference equation is controlled by manipulating an exogenous vector in order to optimize a quadratic cost function. This Riccati equation assumes the following, or a similar, form:

: <math>\mathbf{H}_{t-1} = \mathbf{K} +\mathbf{A}'\mathbf{H}_t\mathbf{A} - \mathbf{A}'\mathbf{H}_t\mathbf{C}\left[\mathbf{C}'\mathbf{H}_t\mathbf{C}+\mathbf{R}\right]^{-1}\mathbf{C}'\mathbf{H}_t\mathbf{A}</math>

where {{math|H}}, {{math|K}}, and {{math|A}} are {{math|n × n}}, {{math|C}} is {{math|n × k}}, {{math|R}} is {{math|k × k}}, {{math|n}} is the number of elements in the vector to be controlled, and {{math|k}} is the number of elements in the control vector. The parameter matrices {{math|A}} and {{math|C}} are from the linear equation, and the parameter matrices {{math|K}} and {{math|R}} are from the quadratic cost function.

In general this equation cannot be solved analytically for {{math|H<sub>t</sub>}} in terms of {{mvar|t}}; rather, the sequence of values for {{math|H<sub>t</sub>}} is found by iterating the Riccati equation. However, it has been shown<ref>{{cite journal|last1=Balvers|first1=Ronald J.|last2=Mitchell|first2=Douglas W.|date=2007|title=Reducing the dimensionality of linear quadratic control problems|journal=Journal of Economic Dynamics and Control|volume=31|issue=1|pages=141–159|doi=10.1016/j.jedc.2005.09.013|s2cid=121354131 |url=https://papers.tinbergen.nl/01043.pdf}}</ref> that this Riccati equation can be solved analytically if {{math|R {{=}} 0}} and {{math|n {{=}} k + 1}}, by reducing it to a scalar rational difference equation; moreover, for any {{mvar|k}} and {{mvar|n}} if the transition matrix {{math|A}} is nonsingular then the Riccati equation can be solved analytically in terms of the eigenvalues of a matrix, although these may need to be found numerically.<ref>{{cite journal|last=Vaughan|first=D. R.|date=1970|title=A nonrecursive algebraic solution for the discrete Riccati equation|journal=IEEE Transactions on Automatic Control|volume=15|issue=5|pages=597–599|doi=10.1109/TAC.1970.1099549}}</ref>

In most contexts the evolution of {{math|H}} backwards through time is stable, meaning that {{math|H}} converges to a particular fixed matrix {{math|H*}} which may be irrational even if all the other matrices are rational. See also {{section link|Stochastic control#Discrete time}}.
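The iterative approach can be sketched numerically. In the example below all parameter matrices are illustrative assumptions (a stable {{math|A}}, positive definite {{math|R}}), chosen so that the backward iteration converges to a fixed point:

```python
import numpy as np

# Illustrative parameters (assumed): n = 2 states, k = 1 control.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
C = np.array([[1.0],
              [0.5]])
K = np.eye(2)                 # state-cost matrix (n x n)
R = np.array([[1.0]])         # control-cost matrix (k x k)

# Iterate the Riccati recursion backwards in time from H = K
# until the sequence settles at a fixed point H*.
H = K.copy()
for _ in range(500):
    M = C.T @ H @ C + R       # k x k, invertible since R > 0
    H_new = K + A.T @ H @ A - A.T @ H @ C @ np.linalg.solve(M, C.T @ H @ A)
    if np.allclose(H_new, H, atol=1e-12):
        break
    H = H_new
# H now approximates the steady-state matrix H*.
```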

A related Riccati equation<ref>{{cite book|last1=Martin|first1=C. F.|last2=Ammar|first2=G.|contribution=The geometry of the matrix Riccati equation and associated eigenvalue method|editor1-last=Bittani|editor2-last=Laub|editor3-last=Willems|date=1991|title=The Riccati Equation|publisher=Springer-Verlag|isbn=978-3-642-63508-3|doi=10.1007/978-3-642-58223-3_5}}</ref> is

:<math>\mathbf{X}_{t+1} = -\left[\mathbf{E}+\mathbf{B}\mathbf{X}_t\right]\left[\mathbf{C}+\mathbf{A}\mathbf{X}_t\right]^{-1}</math>

in which the matrices {{math|X, A, B, C, E}} are all {{math|n × n}}. This equation can be solved explicitly. Suppose <math>\mathbf X_t = \mathbf N_t \mathbf D_t^{-1}</math>, which certainly holds for {{math|t {{=}} 0}} with {{math|N<sub>0</sub> {{=}} X<sub>0</sub>}} and with {{math|D<sub>0</sub> {{=}} I}}. Then using this in the difference equation yields

:<math>\begin{align}
\mathbf{X}_{t+1}&=-\left[\mathbf{E}+\mathbf{BN}_t\mathbf{D}_t^{-1}\right]\mathbf{D}_t\mathbf{D}_t^{-1}\left[\mathbf{C}+\mathbf{AN}_t\mathbf{D}_t^{-1}\right]^{-1}\\
&=-\left[\mathbf{ED}_t+\mathbf{BN}_t\right]\left[\left[\mathbf{C}+\mathbf{AN}_t \mathbf{D}_t^{-1}\right]\mathbf{D}_t\right]^{-1}\\
&=-\left[\mathbf{ED}_t+\mathbf{BN}_t\right]\left[\mathbf{CD}_t+\mathbf{AN}_t\right]^{-1}\\
&=\mathbf{N}_{t+1}\mathbf{D}_{t+1}^{-1}
\end{align}</math>

so by induction the form <math>\mathbf X_t = \mathbf N_t \mathbf D_t^{-1}</math> holds for all {{mvar|t}}. Then the evolution of {{math|N}} and {{math|D}} can be written as

:<math>\begin{bmatrix} \mathbf{N}_{t+1} \\ \mathbf{D}_{t+1} \end{bmatrix} = \begin{bmatrix} -\mathbf{B} & -\mathbf{E} \\ \mathbf{A} & \mathbf{C} \end{bmatrix} \begin{bmatrix} \mathbf{N}_t \\ \mathbf{D}_t \end{bmatrix} \equiv \mathbf{J} \begin{bmatrix}\mathbf{N}_t \\ \mathbf{D}_t \end{bmatrix}</math>

Thus by induction

:<math>\begin{bmatrix} \mathbf{N}_t \\ \mathbf{D}_t \end{bmatrix} = \mathbf{J}^{t} \begin{bmatrix} \mathbf{N}_0 \\ \mathbf{D}_0 \end{bmatrix}</math>
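The linear-fractional solution can be verified numerically by comparing direct iteration of the Riccati recursion with powers of {{math|J}}. The matrices below are illustrative assumptions, chosen so that the required inverses exist along the iteration:

```python
import numpy as np

# Illustrative n = 2 matrices (assumed for this sketch).
A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.3, 0.0], [0.1, 0.2]])
C = np.array([[1.0, 0.0], [0.0, 1.0]])
E = np.array([[0.2, 0.1], [0.0, 0.3]])
X0 = np.array([[0.5, 0.0], [0.0, 0.5]])

# Direct iteration X_{t+1} = -[E + B X_t][C + A X_t]^{-1}.
X = X0.copy()
for _ in range(5):
    X = -(E + B @ X) @ np.linalg.inv(C + A @ X)

# Linear-fractional solution: [N_t; D_t] = J^t [X_0; I].
J = np.block([[-B, -E],
              [A, C]])
ND = np.linalg.matrix_power(J, 5) @ np.vstack([X0, np.eye(2)])
N, D = ND[:2], ND[2:]

assert np.allclose(X, N @ np.linalg.inv(D))
```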

==See also==

==References==
{{reflist}}