Matrix difference equation

{{short description|Relation of a matrix of variables between two points in time}}

{{further|Linear recurrence with constant coefficients}}

A matrix difference equation is a difference equation in which the value of a vector (or sometimes, a matrix) of variables at one point in time is related to its own value at one or more previous points in time, using matrices.<ref>{{cite book|last1=Cull|first1=Paul|author2-link=Mary Flahive|last2=Flahive|first2=Mary|last3=Robson|first3=Robbie|title=Difference Equations: From Rabbits to Chaos|title-link= Difference Equations: From Rabbits to Chaos |publisher=Springer|date=2005|at=ch. 7|isbn=0-387-23234-6}}</ref><ref>{{cite book|last=Chiang|first=Alpha C.|title=Fundamental Methods of Mathematical Economics|url=https://archive.org/details/fundamentalmetho0000chia_h4v2|url-access=registration|edition=3rd|publisher=McGraw-Hill|date=1984|pages=[https://archive.org/details/fundamentalmetho0000chia_h4v2/page/608 608–612]|isbn=9780070107809 }}</ref> The order of the equation is the maximum time gap between any two indicated values of the variable vector. For example,

:<math>\mathbf x_t = \mathbf{Ax}_{t-1} + \mathbf{Bx}_{t-2}</math>

is a second-order matrix difference equation, in which {{math|x}} is an {{math|n × 1}} vector of variables and {{math|A}} and {{math|B}} are {{math|n × n}} matrices. The equation is homogeneous because it contains no additive constant vector term. The same equation might also be written as

:<math>\mathbf x_{t+2} = \mathbf{Ax}_{t+1} + \mathbf{Bx}_{t}</math>

or as

:<math>\mathbf x_n = \mathbf{Ax}_{n-1} + \mathbf{Bx}_{n-2}</math>

The most commonly encountered matrix difference equations are first-order.

==Nonhomogeneous first-order case and the steady state==

An example of a nonhomogeneous first-order matrix difference equation is

:<math>\mathbf x_t = \mathbf{Ax}_{t-1} + \mathbf{b}</math>

with additive constant vector {{math|b}}. The steady state of this system is a value {{math|x*}} of the vector {{math|x}} from which the system, once there, would not subsequently deviate. {{math|x*}} is found by setting {{math|x<sub>t</sub> {{=}} x<sub>t−1</sub> {{=}} x*}} in the difference equation and solving for {{math|x*}} to obtain

: <math>\mathbf{x}^* = [\mathbf{I}-\mathbf{A}]^{-1}\mathbf{b}</math>

where {{math|I}} is the {{math|n × n}} identity matrix, and where it is assumed that {{math|[I − A]}} is invertible. Then the nonhomogeneous equation can be rewritten in homogeneous form in terms of deviations from the steady state:

: <math>\left[\mathbf{x}_t - \mathbf{x}^*\right] = \mathbf{A}\left[\mathbf{x}_{t-1}-\mathbf{x}^*\right]</math>
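The steady-state formula can be checked numerically. The following NumPy sketch uses illustrative values of {{math|A}} and {{math|b}} (assumptions chosen for the example, not taken from the text) and verifies the fixed-point property:

```python
import numpy as np

# Illustrative A and b (assumed for this sketch, not from the text).
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
b = np.array([1.0, 2.0])

# Steady state x* = [I - A]^{-1} b, assuming I - A is invertible.
x_star = np.linalg.solve(np.eye(2) - A, b)

# Fixed-point check: x* = A x* + b.
assert np.allclose(x_star, A @ x_star + b)
```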

==Stability of the first-order case==

The first-order matrix difference equation {{math|[x<sub>t</sub> − x*] {{=}} A[x<sub>t−1</sub> − x*]}} is stable—that is, {{math|x<sub>t</sub>}} converges asymptotically to the steady state {{math|x*}}—if and only if all eigenvalues of the transition matrix {{math|A}} (whether real or complex) have absolute value less than 1.
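This eigenvalue criterion is straightforward to test numerically. A minimal sketch, with illustrative transition matrices that are assumptions for the example:

```python
import numpy as np

def is_stable(A):
    """Stable iff every eigenvalue of A has modulus < 1."""
    return np.max(np.abs(np.linalg.eigvals(A))) < 1

# Illustrative transition matrices (not from the text).
A_stable = np.array([[0.5, 0.2],
                     [0.1, 0.4]])    # eigenvalues 0.6 and 0.3
A_unstable = np.array([[1.1, 0.0],
                       [0.0, 0.3]])  # eigenvalue 1.1 exceeds 1

print(is_stable(A_stable), is_stable(A_unstable))  # True False
```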

==Solution of the first-order case==

Assume that the equation has been put in the homogeneous form {{math|y<sub>t</sub> {{=}} Ay<sub>t−1</sub>}}. Then we can iterate and substitute repeatedly from the initial condition {{math|y<sub>0</sub>}}, which is the initial value of the vector {{math|y}} and which must be known in order to find the solution:

:<math>\begin{align}
\mathbf y_1 &= \mathbf{Ay}_0 \\
\mathbf y_2 &= \mathbf{Ay}_1=\mathbf A^2 \mathbf y_0 \\
\mathbf y_3 &= \mathbf{Ay}_2=\mathbf A^3 \mathbf y_0
\end{align}</math>

and so forth, so that by mathematical induction the solution in terms of {{mvar|t}} is

:<math>\mathbf y_t=\mathbf A^t \mathbf y_0</math>

Further, if {{math|A}} is diagonalizable, we can rewrite {{math|A}} in terms of its eigenvalues and eigenvectors, giving the solution as

:<math>\mathbf y_t = \mathbf{PD}^{t}\mathbf{P}^{-1} \mathbf y_0,</math>

where {{math|P}} is an {{math|n × n}} matrix whose columns are the eigenvectors of {{math|A}} (assuming the eigenvalues are all distinct) and {{math|D}} is an {{math|n × n}} diagonal matrix whose diagonal elements are the eigenvalues of {{math|A}}. This solution motivates the above stability result: {{math|A<sup>t</sup>}} shrinks to the zero matrix over time if and only if the eigenvalues of {{mvar|A}} are all less than unity in absolute value.
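Both forms of the solution can be compared directly in NumPy; the matrix and initial condition below are illustrative assumptions:

```python
import numpy as np

# Illustrative diagonalizable A and initial condition (assumed).
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
y0 = np.array([1.0, 1.0])
t = 10

# Direct solution y_t = A^t y_0.
y_direct = np.linalg.matrix_power(A, t) @ y0

# Diagonalization A = P D P^{-1} gives A^t = P D^t P^{-1}.
eigvals, P = np.linalg.eig(A)
y_diag = P @ np.diag(eigvals ** t) @ np.linalg.inv(P) @ y0

assert np.allclose(y_direct, y_diag)
```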

==Extracting the dynamics of a single scalar variable from a first-order matrix system==

Starting from the {{math|n}}-dimensional system {{math|y<sub>t</sub> {{=}} Ay<sub>t−1</sub>}}, we can extract the dynamics of one of the state variables, say {{math|y<sub>1</sub>}}. The above solution equation for {{math|y<sub>t</sub>}} shows that the solution for {{math|y<sub>1,t</sub>}} is in terms of the {{mvar|n}} eigenvalues of {{math|A}}. Therefore the equation describing the evolution of {{math|y<sub>1</sub>}} by itself must have a solution involving those same eigenvalues. This description intuitively motivates the equation of evolution of {{math|y<sub>1</sub>}}, which is

: <math>y_{1,t} = a_1 y_{1,t-1} + a_2 y_{1,t-2} + \dots + a_n y_{1,t-n}</math>

where the parameters {{mvar|ai}} are from the characteristic equation of the matrix {{math|A}}:

:<math>\lambda^{n} - a_1 \lambda^{n-1} - a_2 \lambda^{n-2} - \dots - a_n = 0.</math>

Thus each individual scalar variable of an {{mvar|n}}-dimensional first-order linear system evolves according to a univariate {{mvar|n}}th-degree difference equation, which has the same stability property (stable or unstable) as does the matrix difference equation.
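This reduction can be verified numerically: iterate a small system, read off one scalar component, and check that it satisfies the scalar recurrence whose coefficients come from the characteristic polynomial. The matrix and initial condition are illustrative assumptions:

```python
import numpy as np

# Illustrative system (assumed): iterate y_t = A y_{t-1}.
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
y = [np.array([1.0, 1.0])]
for _ in range(10):
    y.append(A @ y[-1])
y1 = [v[0] for v in y]       # the first scalar component over time

# np.poly returns the monic characteristic polynomial
# λ^n + c_1 λ^{n-1} + ... + c_n, so a_i = -c_i in the form above.
a = -np.poly(A)[1:]

# y1 obeys the scalar recurrence y1_t = a_1 y1_{t-1} + a_2 y1_{t-2}.
for s in range(2, len(y1)):
    assert np.isclose(y1[s], a[0] * y1[s-1] + a[1] * y1[s-2])
```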

==Solution and stability of higher-order cases==

Matrix difference equations of higher order—that is, with a time lag longer than one period—can be solved, and their stability analyzed, by converting them into first-order form using a block matrix (matrix of matrices). For example, suppose we have the second-order equation

:<math>\mathbf x_t = \mathbf{Ax}_{t-1} + \mathbf{Bx}_{t-2}</math>

with the variable vector {{math|x}} being {{math|n × 1}} and {{math|A}} and {{math|B}} being {{math|n × n}}. This can be stacked in the form

:<math>\begin{bmatrix}\mathbf{x}_t \\ \mathbf{x}_{t-1} \\ \end{bmatrix} = \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{I} & \mathbf{0} \\ \end{bmatrix} \begin{bmatrix} \mathbf{x}_{t-1} \\ \mathbf{x}_{t-2} \end{bmatrix}</math>

where {{math|I}} is the {{math|n × n}} identity matrix and {{math|0}} is the {{math|n × n}} zero matrix. Then denoting the {{math|2n × 1}} stacked vector of current and once-lagged variables as {{math|z<sub>t</sub>}} and the {{math|2n × 2n}} block matrix as {{math|L}}, we have as before the solution

:<math>\mathbf z_t = \mathbf L^t \mathbf z_0</math>

Also as before, this stacked equation, and thus the original second-order equation, is stable if and only if all eigenvalues of the matrix {{math|L}} are smaller than unity in absolute value.
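The stacking construction can be checked by comparing the iterated block system with direct iteration of the second-order equation. The parameter matrices and initial conditions below are illustrative assumptions:

```python
import numpy as np

# Illustrative n = 2 parameter matrices (assumed for this sketch).
n = 2
A = np.array([[0.3, 0.1],
              [0.0, 0.2]])
B = np.array([[0.1, 0.0],
              [0.05, 0.1]])
x0 = np.array([1.0, 0.0])   # x at time 0
x1 = np.array([0.5, 0.5])   # x at time 1

# Block companion matrix L = [[A, B], [I, 0]].
L = np.block([[A, B],
              [np.eye(n), np.zeros((n, n))]])

# Iterate the stacked first-order system z_t = L z_{t-1}.
z = np.concatenate([x1, x0])   # z_1 = [x_1; x_0]
xs = [x0, x1]
for _ in range(8):
    z = L @ z
    xs.append(z[:n])           # top block of z_t is x_t

# Direct iteration of the second-order equation for comparison.
xd = [x0, x1]
for _ in range(8):
    xd.append(A @ xd[-1] + B @ xd[-2])

assert all(np.allclose(u, v) for u, v in zip(xs, xd))
```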

==Nonlinear matrix difference equations: Riccati equations==

In linear-quadratic-Gaussian control, there arises a nonlinear matrix equation for the reverse evolution of a current-and-future-cost matrix, denoted below as {{math|H}}. This equation is called a discrete dynamic Riccati equation, and it arises when a variable vector evolving according to a linear matrix difference equation is controlled by manipulating an exogenous vector in order to optimize a quadratic cost function. This Riccati equation assumes the following, or a similar, form:

: <math>\mathbf{H}_{t-1} = \mathbf{K} +\mathbf{A}'\mathbf{H}_t\mathbf{A} - \mathbf{A}'\mathbf{H}_t\mathbf{C}\left[\mathbf{C}'\mathbf{H}_t\mathbf{C}+\mathbf{R}\right]^{-1}\mathbf{C}'\mathbf{H}_t\mathbf{A}</math>

where {{math|H}}, {{math|K}}, and {{math|A}} are {{math|n × n}}, {{math|C}} is {{math|n × k}}, {{math|R}} is {{math|k × k}}, {{math|n}} is the number of elements in the vector to be controlled, and {{math|k}} is the number of elements in the control vector. The parameter matrices {{math|A}} and {{math|C}} are from the linear equation, and the parameter matrices {{math|K}} and {{math|R}} are from the quadratic cost function.

In general this equation cannot be solved analytically for {{math|H<sub>t</sub>}} in terms of {{mvar|t}}; rather, the sequence of values for {{math|H<sub>t</sub>}} is found by iterating the Riccati equation. However, it has been shown<ref>{{cite journal|last1=Balvers|first1=Ronald J.|last2=Mitchell|first2=Douglas W.|date=2007|title=Reducing the dimensionality of linear quadratic control problems|journal=Journal of Economic Dynamics and Control|volume=31|issue=1|pages=141–159|doi=10.1016/j.jedc.2005.09.013|s2cid=121354131 |url=https://papers.tinbergen.nl/01043.pdf}}</ref> that this Riccati equation can be solved analytically if {{math|R {{=}} 0}} and {{math|n {{=}} k + 1}}, by reducing it to a scalar rational difference equation; moreover, for any {{mvar|k}} and {{mvar|n}} if the transition matrix {{math|A}} is nonsingular then the Riccati equation can be solved analytically in terms of the eigenvalues of a matrix, although these may need to be found numerically.<ref>{{cite journal|last=Vaughan|first=D. R.|date=1970|title=A nonrecursive algebraic solution for the discrete Riccati equation|journal=IEEE Transactions on Automatic Control|volume=15|issue=5|pages=597–599|doi=10.1109/TAC.1970.1099549}}</ref>

In most contexts the evolution of {{math|H}} backwards through time is stable, meaning that {{math|H}} converges to a particular fixed matrix {{math|H*}} which may be irrational even if all the other matrices are rational. See also {{section link|Stochastic control#Discrete time}}.
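The iterative approach can be sketched numerically. In the example below all parameter matrices are illustrative assumptions (a stable {{math|A}}, positive definite {{math|R}}), chosen so that the backward iteration converges to a fixed point:

```python
import numpy as np

# Illustrative parameters (assumed): n = 2 states, k = 1 control.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
C = np.array([[1.0],
              [0.5]])
K = np.eye(2)                 # state-cost matrix (n x n)
R = np.array([[1.0]])         # control-cost matrix (k x k)

# Iterate the Riccati recursion backwards in time from H = K
# until the sequence settles at a fixed point H*.
H = K.copy()
for _ in range(500):
    M = C.T @ H @ C + R       # k x k, invertible since R > 0
    H_new = K + A.T @ H @ A - A.T @ H @ C @ np.linalg.solve(M, C.T @ H @ A)
    if np.allclose(H_new, H, atol=1e-12):
        break
    H = H_new
# H now approximates the steady-state matrix H*.
```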

A related Riccati equation<ref>{{cite book|last1=Martin|first1=C. F.|last2=Ammar|first2=G.|contribution=The geometry of the matrix Riccati equation and associated eigenvalue method|editor1-last=Bittani|editor2-last=Laub|editor3-last=Willems|date=1991|title=The Riccati Equation|publisher=Springer-Verlag|isbn=978-3-642-63508-3|doi=10.1007/978-3-642-58223-3_5}}</ref> is

:<math>\mathbf{X}_{t+1} = -\left[\mathbf{E}+\mathbf{B}\mathbf{X}_t\right]\left[\mathbf{C}+\mathbf{A}\mathbf{X}_t\right]^{-1}</math>

in which the matrices {{math|X, A, B, C, E}} are all {{math|n × n}}. This equation can be solved explicitly. Suppose <math>\mathbf X_t = \mathbf N_t \mathbf D_t^{-1}</math>, which certainly holds for {{math|t {{=}} 0}} with {{math|N<sub>0</sub> {{=}} X<sub>0</sub>}} and with {{math|D<sub>0</sub> {{=}} I}}. Then using this in the difference equation yields

:<math>\begin{align}
\mathbf{X}_{t+1}&=-\left[\mathbf{E}+\mathbf{BN}_t\mathbf{D}_t^{-1}\right]\mathbf{D}_t\mathbf{D}_t^{-1}\left[\mathbf{C}+\mathbf{AN}_t\mathbf{D}_t^{-1}\right]^{-1}\\
&=-\left[\mathbf{ED}_t+\mathbf{BN}_t\right]\left[\left[\mathbf{C}+\mathbf{AN}_t \mathbf{D}_t^{-1}\right]\mathbf{D}_t\right]^{-1}\\
&=-\left[\mathbf{ED}_t+\mathbf{BN}_t\right]\left[\mathbf{CD}_t+\mathbf{AN}_t\right]^{-1}\\
&=\mathbf{N}_{t+1}\mathbf{D}_{t+1}^{-1}
\end{align}</math>

so by induction the form <math>\mathbf X_t = \mathbf N_t \mathbf D_t^{-1}</math> holds for all {{mvar|t}}. Then the evolution of {{math|N}} and {{math|D}} can be written as

:<math>\begin{bmatrix} \mathbf{N}_{t+1} \\ \mathbf{D}_{t+1} \end{bmatrix} = \begin{bmatrix} -\mathbf{B} & -\mathbf{E} \\ \mathbf{A} & \mathbf{C} \end{bmatrix} \begin{bmatrix} \mathbf{N}_t \\ \mathbf{D}_t \end{bmatrix} \equiv \mathbf{J} \begin{bmatrix}\mathbf{N}_t \\ \mathbf{D}_t \end{bmatrix}</math>

Thus by induction

:<math>\begin{bmatrix} \mathbf{N}_t \\ \mathbf{D}_t \end{bmatrix} = \mathbf{J}^{t} \begin{bmatrix} \mathbf{N}_0 \\ \mathbf{D}_0 \end{bmatrix}</math>
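The linear-fractional solution can be verified numerically by comparing direct iteration of the Riccati recursion with powers of {{math|J}}. The matrices below are illustrative assumptions, chosen so that the required inverses exist along the iteration:

```python
import numpy as np

# Illustrative n = 2 matrices (assumed for this sketch).
A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.3, 0.0], [0.1, 0.2]])
C = np.array([[1.0, 0.0], [0.0, 1.0]])
E = np.array([[0.2, 0.1], [0.0, 0.3]])
X0 = np.array([[0.5, 0.0], [0.0, 0.5]])

# Direct iteration X_{t+1} = -[E + B X_t][C + A X_t]^{-1}.
X = X0.copy()
for _ in range(5):
    X = -(E + B @ X) @ np.linalg.inv(C + A @ X)

# Linear-fractional solution: [N_t; D_t] = J^t [X_0; I].
J = np.block([[-B, -E],
              [A, C]])
ND = np.linalg.matrix_power(J, 5) @ np.vstack([X0, np.eye(2)])
N, D = ND[:2], ND[2:]

assert np.allclose(X, N @ np.linalg.inv(D))
```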

==See also==

==References==
{{reflist}}