Bartels–Stewart algorithm#The Hessenberg–Schur algorithm

In numerical linear algebra, the Bartels–Stewart algorithm is used to numerically solve the Sylvester matrix equation $AX - XB = C$ . Developed by R.H. Bartels and G.W. Stewart in 1971,{{Cite journal|last1=Bartels|first1=R. H.|last2=Stewart|first2=G. W.|date=1972|title=Solution of the matrix equation AX + XB = C [F4]|journal=Communications of the ACM|volume=15|issue=9|pages=820–826|doi=10.1145/361573.361582|issn=0001-0782|doi-access=free}} it was the first numerically stable method that could be systematically applied to solve such equations. The algorithm works by using the real Schur decompositions of $A$ and $B$ to transform $AX - XB = C$ into a triangular system that can then be solved using forward or backward substitution. In 1979, G. Golub, C. Van Loan and S. Nash introduced an improved version of the algorithm,{{Cite journal|last1=Golub|first1=G.|last2=Nash|first2=S.|last3=Loan|first3=C. Van|date=1979|title=A Hessenberg–Schur method for the problem AX + XB= C|journal=IEEE Transactions on Automatic Control|volume=24|issue=6|pages=909–913|doi=10.1109/TAC.1979.1102170|issn=0018-9286|hdl=1813/7472|hdl-access=free}} known as the Hessenberg–Schur algorithm. It remains a standard approach for solving Sylvester equations when $X$ is of small to moderate size.

The algorithm

Let $X, C \in \mathbb{R}^{m \times n}$ , and assume that the eigenvalues of $A$ are distinct from the eigenvalues of $B$ . Then, the matrix equation $AX - XB = C$ has a unique solution. The Bartels–Stewart algorithm computes $X$ by applying the following steps:

1.Compute the real Schur decompositions

: $R = U^TAU,$

: $S = V^TB^TV.$

The matrices $R$ and $S$ are block-upper triangular matrices, with diagonal blocks of size $1 \times 1$ or $2 \times 2$ .

2. Set $F = U^TCV.$

3. Solve the simplified system $RY - YS^T = F$ , where $Y = U^TXV$ . This can be done using forward substitution on the blocks. Specifically, if $s_{k-1, k} = 0$ , then

: $(R - s_{kk}I)y_k = f_{k} + \sum_{j = k+1}^n s_{kj}y_j,$

where $y_k$ is the $k$ th column of $Y$ . When $s_{k-1, k} \neq 0$ , columns $[ y_{k-1} \mid y_{k}]$ should be concatenated and solved for simultaneously.

4. Set $X = UYV^T.$

= Computational cost =

Using the QR algorithm, the real Schur decompositions in step 1 require approximately $10(m^3 + n^3)$ flops, so that the overall computational cost is $10(m^3 + n^3) + 2.5(mn^2 + nm^2)$ .

= Simplifications and special cases =

In the special case where $B=-A^T$ and $C$ is symmetric, the solution $X$ will also be symmetric. This symmetry can be exploited so that $Y$ is found more efficiently in step 3 of the algorithm.

The Hessenberg–Schur algorithm

The Hessenberg–Schur algorithm replaces the decomposition $R = U^TAU$ in step 1 with the decomposition $H = Q^TAQ$ , where $H$ is an upper-Hessenberg matrix. This leads to a system of the form $HY - YS^T = F$ that can be solved using forward substitution. The advantage of this approach is that $H = Q^TAQ$ can be found using Householder reflections at a cost of $(5/3)m^3$ flops, compared to the $10m^3$ flops required to compute the real Schur decomposition of $A$ .

Software and implementation

The subroutines required for the Hessenberg-Schur variant of the Bartels–Stewart algorithm are implemented in the SLICOT library. These are used in the MATLAB control system toolbox.

Alternative approaches

For large systems, the $\mathcal{O}(m^3 + n^3)$ cost of the Bartels–Stewart algorithm can be prohibitive. When $A$ and $B$ are sparse or structured, so that linear solves and matrix vector multiplies involving them are efficient, iterative algorithms can potentially perform better. These include projection-based methods, which use Krylov subspace iterations, methods based on the alternating direction implicit (ADI) iteration, and hybridizations that involve both projection and ADI.{{Cite journal|last=Simoncini|first=V.|s2cid=17271167|date=2016|title=Computational Methods for Linear Matrix Equations|journal=SIAM Review|language=en-US|volume=58|issue=3|pages=377–441|doi=10.1137/130912839|issn=0036-1445|hdl=11585/586011|hdl-access=free}} Iterative methods can also be used to directly construct low rank approximations to $X$ when solving $AX-XB = C$ .

References

Category:Algorithms

Category:Control theory

Category:Matrices (mathematics)

Category:Numerical linear algebra