We have recently^{[1][2]} been looking at how we can use a special case of Francis's QR transformation^{[3][4]} to reduce a real symmetric matrix \(\mathbf{M}\) to a diagonal matrix \(\boldsymbol{\Lambda}\) by first applying Householder transformations to put it in tridiagonal form and then using shifted Givens rotations to zero out the off diagonal elements.
The columns of the matrix of transformations \(\mathbf{V}\) and the elements on the leading diagonal of \(\boldsymbol{\Lambda}\) are the unit eigenvectors and eigenvalues of \(\mathbf{M}\) respectively and they consequently satisfy
\[
\mathbf{M} \times \mathbf{V} = \mathbf{V} \times \boldsymbol{\Lambda}
\]
and, since the product of \(\mathbf{V}\) and its transpose is the identity matrix
\[
\mathbf{M} = \mathbf{V} \times \boldsymbol{\Lambda} \times \mathbf{V}^\mathrm{T}
\]
which is known as the spectral decomposition of \(\mathbf{M}\).

Last time^{[5]} we saw how we could efficiently apply the Householder transformations in-place, replacing the elements of \(\mathbf{M}\) with those of the matrix of accumulated transformations \(\mathbf{Q}\) and creating a pair of arrays to represent the leading and off diagonal elements of the tridiagonal matrix. This time we shall see how we can similarly improve the implementation of the Givens rotations.
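As a quick sanity check of these identities, the following sketch reconstructs a symmetric matrix from eigenvectors and eigenvalues worked out by hand; the matrices are hard-coded rather than computed, and plain JavaScript arrays stand in for the ak library's matrix types.

```javascript
// A hand-worked check of M = V x Lambda x V^T for M = [[2,1],[1,2]], whose
// eigenvalues are 3 and 1 with unit eigenvectors (1,1)/sqrt(2) and
// (1,-1)/sqrt(2); v holds the eigenvectors in its columns.
var r = Math.sqrt(0.5);
var v = [[r, r], [r, -r]];
var l = [3, 1];                  // the leading diagonal of Lambda
var m = [[0, 0], [0, 0]];
var i, j, k;

for(i=0;i<2;++i) {
 for(j=0;j<2;++j) {
  for(k=0;k<2;++k) m[i][j] += v[i][k]*l[k]*v[j][k];
 }
}
// m is now [[2,1],[1,2]] to within rounding error
```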
Givens Diagonalisation
Recall that Givens rotations are matrices of the form
\[
\mathbf{G} =
\begin{pmatrix}
1 & \cdots & 0 & \cdots & 0 & \cdots & 0 & \cdots & 0\\
\vdots & \ddots & \vdots & & \vdots & & \vdots & & \vdots\\
0 & \cdots & \cos \theta & \cdots & 0 & \cdots & \sin \theta & \cdots & 0\\
\vdots & & \vdots & \ddots & \vdots & & \vdots & & \vdots\\
0 & \cdots & 0 & \cdots & 1 & \cdots & 0 & \cdots & 0\\
\vdots & & \vdots & & \vdots & \ddots & \vdots & & \vdots\\
0 & \cdots & -\sin \theta & \cdots & 0 & \cdots & \cos \theta & \cdots & 0\\
\vdots & & \vdots & & \vdots & & \vdots & \ddots & \vdots\\
0 & \cdots & 0 & \cdots & 0 & \cdots & 0 & \cdots & 1\\
\end{pmatrix}
\]
which have the elements
\[
\begin{align*}
g_{ii} &= 1 && i \neq k \wedge i \neq l\\
g_{ii} &= \cos \theta && i = k \vee i = l\\
g_{ij} &= \sin \theta && i = k \wedge j = l\\
g_{ij} &= -\sin \theta && i = l \wedge j = k\\
g_{ij} &= 0 && \text{otherwise}
\end{align*}
\]
for an angle \(\theta\) and indices \(k\) less than \(l\), where \(\wedge\) means "and" and \(\vee\) means "or". To efficiently transform the off diagonal elements to zero we used an implicit Wilkinson shift^{[6]}, which for the first and second row and column used an angle that satisfied
\[
\begin{align*}
\sin \theta &= -\frac{m_{1,0}}{\sqrt{\left(m_{0,0}-s\right)^2 + m_{1,0}^2}}\\
\cos \theta &= \phantom{-}\frac{m_{0,0}-s}{\sqrt{\left(m_{0,0}-s\right)^2 + m_{1,0}^2}}
\end{align*}
\]
where
\[
\begin{align*}
d &= \frac{m_{n-2,n-2}-m_{n-1,n-1}}{2}\\
s &= m_{n-1,n-1} - \frac{m_{n-1,n-2}^2}{d + \mathrm{sign}(d) \times \sqrt{d^2 + m_{n-1,n-2}^2}}
\end{align*}
\]
and, for subsequent rows and columns, an angle defined by
\[
\begin{align*}
\sin \theta &= -\frac{m_{k+1,k-1}}{\sqrt{m_{k,k-1}^2+m_{k+1,k-1}^2}}\\
\cos \theta &= \phantom{-}\frac{m_{k,k-1}}{\sqrt{m_{k,k-1}^2+m_{k+1,k-1}^2}}
\end{align*}
\]
for \(k\) from one to \(n-2\).

To set the last off diagonal elements of a tridiagonal matrix \(\mathbf{M}\) to zero we iteratively applied such sequences of rotations with
\[
\mathbf{M} \leftarrow \mathbf{G}^\mathrm{T} \times \mathbf{M} \times \mathbf{G}
\]
where the left arrow means "replaced with", as demonstrated by program 1 which prints the accumulated rotations \(\mathbf{Q}\) above the transformed matrix \(\mathbf{M}\), tracking the number of iterations and the last off diagonal element, finally printing the distance between the original and transformed matrices.
Program 1: Givens Rotations



The iterative shifting of the pair of nonzero elements above and below the upper and lower diagonals down toward and off of the bottom right corner of \(\mathbf{M}\) is known as chasing the bulge.
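Before turning to the implementation, it's worth seeing the shift and the first rotation's angle computed for a concrete matrix. The sketch below follows the formulas above for a hand-picked four by four tridiagonal matrix, with the built-in Math.hypot and Math.sign standing in for library functions; note that the sign given to the sine is one common convention, with texts differing on which of the pair carries the minus.

```javascript
// The Wilkinson shift and the first rotation's angle for a 4x4
// tridiagonal matrix, following the formulas above.
var m = [[4, 1, 0, 0],
         [1, 3, 1, 0],
         [0, 1, 2, 1],
         [0, 0, 1, 1]];
var n = 4;

// the shift s, computed from the trailing 2x2 submatrix
var d = (m[n-2][n-2] - m[n-1][n-1])/2;
var s = m[n-1][n-1] - m[n-1][n-2]*m[n-1][n-2]
      / (d + Math.sign(d)*Math.hypot(d, m[n-1][n-2]));

// the angle implied by the shifted first column
var x = m[0][0] - s;
var z = m[1][0];
var h = Math.hypot(x, z);
var sin = -z/h, cos = x/h;
// sin*sin + cos*cos equals one up to rounding
```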
Rotating In-Place
To directly operate upon the elements of the matrices rather than use our ak.matrix operators we should first note that, for a Givens rotation \(\mathbf{G}\) of rows and columns \(k\) and \(l\), the matrix
\[
\mathbf{M}^\prime = \mathbf{G}^\mathrm{T} \times \mathbf{M}
\]
has the elements
\[
m^\prime_{ij} = \sum_{h=0}^{n-1} g^\mathrm{T}_{ih} \times m_{hj}
= \sum_{h=0}^{n-1} g_{hi} \times m_{hj}
=
\begin{cases}
m_{kj} \times \cos\theta - m_{lj} \times \sin\theta & i = k\\
m_{kj} \times \sin\theta + m_{lj} \times \cos\theta & i = l\\
m_{ij} & \text{otherwise}
\end{cases}
\]
and the matrix
\[
\mathbf{M}^{\prime\prime} = \mathbf{M} \times \mathbf{G}
\]
has
\[
m^{\prime\prime}_{ij} = \sum_{h=0}^{n-1} m_{ih} \times g_{hj}
=
\begin{cases}
m_{ik} \times \cos\theta - m_{il} \times \sin\theta & j = k\\
m_{ik} \times \sin\theta + m_{il} \times \cos\theta & j = l\\
m_{ij} & \text{otherwise}
\end{cases}
\]
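These element-wise rules can be checked directly against a brute-force matrix product. The following standalone sketch does so for the left multiplication by \(\mathbf{G}^\mathrm{T}\), filling in a rotation with the sine at row \(k\), column \(l\) and its negation at row \(l\), column \(k\); only rows \(k\) and \(l\) should change.

```javascript
// Verifies that only rows k and l of G^T x M change, and do so according
// to the element-wise rules above, for an arbitrary 4x4 matrix and angle.
var n = 4, k = 1, l = 2, theta = 0.7;
var cos = Math.cos(theta), sin = Math.sin(theta);
var m = [[4,1,2,0],[1,3,1,5],[2,1,2,1],[0,5,1,1]];
var i, j, h;

// the Givens rotation, with sin at (k,l) and -sin at (l,k)
var g = [];
for(i=0;i<n;++i) {
 g[i] = [];
 for(j=0;j<n;++j) g[i][j] = i===j ? 1 : 0;
}
g[k][k] = g[l][l] = cos;
g[k][l] = sin; g[l][k] = -sin;

// the full product, noting that the (i,h)th element of G^T is g[h][i]
var full = [];
for(i=0;i<n;++i) {
 full[i] = [];
 for(j=0;j<n;++j) {
  full[i][j] = 0;
  for(h=0;h<n;++h) full[i][j] += g[h][i]*m[h][j];
 }
}

// the element-wise rules, touching rows k and l only
var rules = m.map(function(row){return row.slice();});
for(j=0;j<n;++j) {
 rules[k][j] = m[k][j]*cos - m[l][j]*sin;
 rules[l][j] = m[k][j]*sin + m[l][j]*cos;
}
// full and rules agree element by element
```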
Now, if we replace the matrices \(\mathbf{M}\) and \(\mathbf{Q}\) with two-dimensional arrays we can update them in-place using the sequence of transformations
\[
\begin{align*}
\mathbf{M} &\leftarrow \mathbf{G}^\mathrm{T} \times \mathbf{M}\\
\mathbf{M} &\leftarrow \mathbf{M} \times \mathbf{G}\\
\mathbf{Q} &\leftarrow \mathbf{Q} \times \mathbf{G}
\end{align*}
\]
instead of the matrix multiplications at each step, which we can implement with
for(cc=0;cc<n;++cc) {
 r0 = m[c][cc]*c0 - m[c+1][cc]*s0;
 r1 = m[c][cc]*s0 + m[c+1][cc]*c0;
 m[c][cc] = r0;
 m[c+1][cc] = r1;
}
for(rr=0;rr<n;++rr) {
 r0 = m[rr][c]*c0 - m[rr][c+1]*s0;
 r1 = m[rr][c]*s0 + m[rr][c+1]*c0;
 m[rr][c] = r0;
 m[rr][c+1] = r1;
}
for(rr=0;rr<n;++rr) {
 r0 = q[rr][c]*c0 - q[rr][c+1]*s0;
 r1 = q[rr][c]*s0 + q[rr][c+1]*c0;
 q[rr][c] = r0;
 q[rr][c+1] = r1;
}
as demonstrated by program 2.
Program 2: In-Place Givens Rotations



Refactoring Away
Now that we're working with arrays rather than matrices we have the freedom to start refactoring the code, removing unnecessary operations. For starters, we know that the matrix \(\mathbf{M}\) is tridiagonal at each step, except for the bulge, and so we're needlessly calculating terms that must equal zero when we multiply it by the Givens rotations. To remove those terms we can simply replace the loops
for(rr=0;rr<n;++rr) {
 r0 = m[rr][c]*c0 - m[rr][c+1]*s0;
 r1 = m[rr][c]*s0 + m[rr][c+1]*c0;
 m[rr][c] = r0;
 m[rr][c+1] = r1;
}
for(rr=0;rr<n;++rr) {
 r0 = q[rr][c]*c0 - q[rr][c+1]*s0;
 r1 = q[rr][c]*s0 + q[rr][c+1]*c0;
 q[rr][c] = r0;
 q[rr][c+1] = r1;
}
with a pair that set bounds on the iteration that only go out as far as the bulge
lb = Math.max(c-2, 0);
ub = Math.min(c+3, n);
for(cc=lb;cc<ub;++cc) {
 r0 = m[c][cc]*c0 - m[c+1][cc]*s0;
 r1 = m[c][cc]*s0 + m[c+1][cc]*c0;
 m[c][cc] = r0;
 m[c+1][cc] = r1;
}
lb = Math.max(c-2, 0);
ub = Math.min(c+3, n);
for(rr=lb;rr<ub;++rr) {
 r0 = m[rr][c]*c0 - m[rr][c+1]*s0;
 r1 = m[rr][c]*s0 + m[rr][c+1]*c0;
 m[rr][c] = r0;
 m[rr][c+1] = r1;
}
as shown by program 3.
Program 3: No Further Than The Bulge



By the rules of matrix multiplication, the elements of
\[
\mathbf{M}^{\prime} = \mathbf{G}^\mathrm{T} \times \mathbf{M} \times \mathbf{G}
\]
are given by
\[
m^{\prime}_{rc}
= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} g^\mathrm{T}_{ri} \times m_{ij} \times g_{jc}
= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{ir} \times g_{jc}
\]
We can consequently calculate the result of a Givens rotation by expanding these sums, ignoring any terms that we know must equal zero as we do so.

During the initial rotation the elements \(g_{i0}\) equal zero for \(i\) greater than one and so the top left element of \(\mathbf{M}^\prime\) is given by
\[
\begin{align*}
m^{\prime}_{00}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i0} \times g_{j0}
= \sum_{i=0}^1 \sum_{j=0}^1 m_{ij} \times g_{i0} \times g_{j0}\\
&= m_{00} \times g_{00} \times g_{00} + m_{01} \times g_{00} \times g_{10}
+ m_{10} \times g_{10} \times g_{00} + m_{11} \times g_{10} \times g_{10}\\
&= m_{00} \times \cos^2\theta - m_{01} \times \cos\theta \times \sin\theta
- m_{10} \times \sin\theta \times \cos\theta + m_{11} \times \sin^2\theta\\
&= m_{00} \times \cos^2\theta + m_{11} \times \sin^2\theta - 2 \times m_{01} \times \cos\theta \times \sin\theta
\end{align*}
\]
Similarly, its first upper diagonal element is
\[
\begin{align*}
m^{\prime}_{01}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i0} \times g_{j1}
= \sum_{i=0}^1 \sum_{j=0}^1 m_{ij} \times g_{i0} \times g_{j1}\\
&= m_{00} \times g_{00} \times g_{01} + m_{01} \times g_{00} \times g_{11}
+ m_{10} \times g_{10} \times g_{01} + m_{11} \times g_{10} \times g_{11}\\
&= m_{00} \times \cos\theta \times \sin\theta + m_{01} \times \cos^2\theta
- m_{01} \times \sin^2\theta - m_{11} \times \sin\theta \times \cos\theta\\
&= \left(m_{00}-m_{11}\right) \times \cos\theta \times \sin\theta + m_{01} \times \left(\cos^2\theta
- \sin^2\theta\right)
\end{align*}
\]
and, by symmetry, so is its first lower diagonal element.

The only nonzero element in the third row and column of \(\mathbf{G}\) is a one on the leading diagonal and so the bulge equals
\[
\begin{align*}
m^{\prime}_{02}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i0} \times g_{j2}
= \sum_{i=1}^1 \sum_{j=2}^2 m_{ij} \times g_{i0} \times g_{j2}\\
&= m_{12} \times g_{10} \times g_{22}\\
&= -m_{12} \times \sin\theta
\end{align*}
\]
The elements \(g_{i1}\) also equal zero for \(i\) greater than one and so the second leading diagonal element comes to
\[
\begin{align*}
m^{\prime}_{11}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i1} \times g_{j1}
= \sum_{i=0}^1 \sum_{j=0}^1 m_{ij} \times g_{i1} \times g_{j1}\\
&= m_{00} \times g_{01} \times g_{01} + m_{01} \times g_{01} \times g_{11}
+ m_{10} \times g_{11} \times g_{01} + m_{11} \times g_{11} \times g_{11}\\
&= m_{00} \times \sin^2\theta + m_{01} \times \sin\theta \times \cos\theta
+ m_{10} \times \cos\theta \times \sin\theta + m_{11} \times \cos^2\theta\\
&= m_{00} \times \sin^2\theta + m_{11} \times \cos^2\theta + 2 \times m_{01} \times \cos\theta \times \sin\theta
\end{align*}
\]
Finally, the second off diagonal element is given by
\[
\begin{align*}
m^{\prime}_{12}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i1} \times g_{j2}
= \sum_{i=1}^1 \sum_{j=2}^2 m_{ij} \times g_{i1} \times g_{j2}\\
&= m_{12} \times g_{11} \times g_{22}\\
&= m_{12} \times \cos\theta
\end{align*}
\]
Noting that these are the only elements that are changed during the initial rotation, we can replace the loops for it with
m00 = m[0][0]; m01 = m[0][1];
m11 = m[1][1]; m12 = m[1][2];
m[0][0] = m00*c0*c0 + m11*s0*s0 - 2*m01*c0*s0;
m[1][1] = m00*s0*s0 + m11*c0*c0 + 2*m01*c0*s0;
m[0][1] = m[1][0] = (m00-m11)*c0*s0 + m01*(c0*c0-s0*s0);
m[0][2] = m[2][0] = -m12*s0;
m[1][2] = m[2][1] = m12*c0;
which program 4 shows yields the same results.
Program 4: Expanding Out The Initial Rotation



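Since the programs themselves are interactive and not reproduced here, a standalone check in the spirit of program 4 is easy to put together: apply the expanded formulas for the initial rotation to a three by three tridiagonal matrix and compare the result with the explicit product \(\mathbf{G}^\mathrm{T} \times \mathbf{M} \times \mathbf{G}\). Note that the angle and matrix are arbitrary choices for illustration, with the rotation's off diagonal signs matching the expansion.

```javascript
// Compares the expanded initial rotation with the explicit two-sided
// product for a 3x3 tridiagonal matrix; c0 and s0 are an arbitrary
// cosine/sine pair rather than the shifted angle.
var c0 = 0.8, s0 = -0.6;
var m = [[5,2,0],[2,4,1],[0,1,3]];
var m00 = m[0][0], m01 = m[0][1], m11 = m[1][1], m12 = m[1][2];
var i, j, h;

// the expanded update
var ex = [[0,0,0],[0,0,0],[0,0,0]];
ex[0][0] = m00*c0*c0 + m11*s0*s0 - 2*m01*c0*s0;
ex[1][1] = m00*s0*s0 + m11*c0*c0 + 2*m01*c0*s0;
ex[0][1] = ex[1][0] = (m00-m11)*c0*s0 + m01*(c0*c0-s0*s0);
ex[0][2] = ex[2][0] = -m12*s0;
ex[1][2] = ex[2][1] = m12*c0;
ex[2][2] = m[2][2];

// the explicit product G^T x M x G, with s0 at (0,1) and -s0 at (1,0)
var g = [[c0,s0,0],[-s0,c0,0],[0,0,1]];
var t = [[0,0,0],[0,0,0],[0,0,0]];
var p = [[0,0,0],[0,0,0],[0,0,0]];
for(i=0;i<3;++i) for(j=0;j<3;++j) for(h=0;h<3;++h) t[i][j] += g[h][i]*m[h][j];
for(i=0;i<3;++i) for(j=0;j<3;++j) for(h=0;h<3;++h) p[i][j] += t[i][h]*g[h][j];
// ex and p agree element by element
```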
Expanding the sums in the second Givens rotation proceeds in much the same way, although this time the only nonzero elements in the first row and column and those after the third of \(\mathbf{G}\) are ones on the leading diagonal and the second and third rows and columns are zero everywhere except in the second and third places.
The first, second and third off diagonal elements of \(\mathbf{M}^\prime\) are therefore given by
\[
\begin{align*}
m^{\prime}_{10}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i1} \times g_{j0}
= \sum_{i=1}^2 \sum_{j=0}^0 m_{ij} \times g_{i1} \times g_{j0}\\
&= m_{10} \times g_{11} \times g_{00} + m_{20} \times g_{21} \times g_{00}\\
&= m_{01} \times \cos\theta - m_{02} \times \sin\theta\\
\\
m^{\prime}_{12}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i1} \times g_{j2}
= \sum_{i=1}^2 \sum_{j=1}^2 m_{ij} \times g_{i1} \times g_{j2}\\
&= m_{11} \times g_{11} \times g_{12} + m_{12} \times g_{11} \times g_{22}
+ m_{21} \times g_{21} \times g_{12} + m_{22} \times g_{21} \times g_{22}\\
&= m_{11} \times \cos\theta \times \sin\theta + m_{12} \times \cos^2\theta
- m_{21} \times \sin^2\theta - m_{22} \times \sin\theta \times \cos\theta\\
&= \left(m_{11}-m_{22}\right) \times \cos\theta \times \sin\theta + m_{12} \times \left(\cos^2\theta - \sin^2\theta\right)\\
\\
m^{\prime}_{23}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i2} \times g_{j3}
= \sum_{i=2}^2 \sum_{j=3}^3 m_{ij} \times g_{i2} \times g_{j3}\\
&= m_{23} \times g_{22} \times g_{33}\\
&= m_{23} \times \cos\theta
\end{align*}
\]
and the second and third on the leading diagonal by
\[
\begin{align*}
m^{\prime}_{11}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i1} \times g_{j1}
= \sum_{i=1}^2 \sum_{j=1}^2 m_{ij} \times g_{i1} \times g_{j1}\\
&= m_{11} \times g_{11} \times g_{11} + m_{12} \times g_{11} \times g_{21}
+ m_{21} \times g_{21} \times g_{11} + m_{22} \times g_{21} \times g_{21}\\
&= m_{11} \times \cos^2\theta - m_{12} \times \cos\theta \times \sin\theta
- m_{21} \times \sin\theta \times \cos\theta + m_{22} \times \sin^2\theta\\
&= m_{11} \times \cos^2\theta + m_{22} \times \sin^2\theta - 2 \times m_{12} \times \cos\theta \times \sin\theta\\
\\
m^{\prime}_{22}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i2} \times g_{j2}
= \sum_{i=1}^2 \sum_{j=1}^2 m_{ij} \times g_{i2} \times g_{j2}\\
&= m_{11} \times g_{12} \times g_{12} + m_{12} \times g_{12} \times g_{22}
+ m_{21} \times g_{22} \times g_{12} + m_{22} \times g_{22} \times g_{22}\\
&= m_{11} \times \sin^2\theta + m_{12} \times \sin\theta \times \cos\theta
+ m_{21} \times \cos\theta \times \sin\theta + m_{22} \times \cos^2\theta\\
&= m_{11} \times \sin^2\theta + m_{22} \times \cos^2\theta + 2 \times m_{12} \times \cos\theta \times \sin\theta
\end{align*}
\]
Finally, the new bulge equals
\[
\begin{align*}
m^{\prime}_{13}
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} m_{ij} \times g_{i1} \times g_{j3}
= \sum_{i=2}^2 \sum_{j=3}^3 m_{ij} \times g_{i1} \times g_{j3}\\
&= m_{23} \times g_{21} \times g_{33}\\
&= -m_{23} \times \sin\theta
\end{align*}
\]
the old one is replaced with a zero and all remaining elements are the same as those in \(\mathbf{M}\).

Since all subsequent rotations take the same form we can use these expansions for them too, provided that we ignore the bulge when it falls out of the bottom of the matrix, which is done by replacing the loops for them with
m01 = m[c-1][c]; m02 = m[c-1][c+1];
m11 = m[c][c]; m12 = m[c][c+1];
m22 = m[c+1][c+1];
m[c][c] = m11*c0*c0 + m22*s0*s0 - 2*m12*c0*s0;
m[c+1][c+1] = m11*s0*s0 + m22*c0*c0 + 2*m12*c0*s0;
m[c][c-1] = m[c-1][c] = m01*c0 - m02*s0;
m[c][c+1] = m[c+1][c] = (m11-m22)*c0*s0 + m12*(c0*c0-s0*s0);
m[c-1][c+1] = m[c+1][c-1] = 0;
if(c<n-2) {
 m23 = m[c+1][c+2];
 m[c+2][c] = m[c][c+2] = -m23*s0;
 m[c+2][c+1] = m[c+1][c+2] = m23*c0;
}
in program 5.
Program 5: Expanding Out Subsequent Rotations



Now we don't actually need to store the bulge in \(\mathbf{M}\) at each step since we know that it will always be replaced with a zero. Since we're already copying it into z before the second and subsequent rotations we might as well just keep it there, which we can do by replacing the implementation of the rotations with
if(c===0) {
 m00 = m[0][0]; m01 = m[0][1];
 m11 = m[1][1]; m12 = m[1][2];
 m[0][0] = m00*c0*c0 + m11*s0*s0 - 2*m01*c0*s0;
 m[1][1] = m00*s0*s0 + m11*c0*c0 + 2*m01*c0*s0;
 m[0][1] = m[1][0] = (m00-m11)*c0*s0 + m01*(c0*c0-s0*s0);
 m[1][2] = m[2][1] = m12*c0;
 z = -m12*s0;
}
else {
 m01 = m[c-1][c];
 m11 = m[c][c];
 m12 = m[c][c+1];
 m22 = m[c+1][c+1];
 m[c][c] = m11*c0*c0 + m22*s0*s0 - 2*m12*c0*s0;
 m[c+1][c+1] = m11*s0*s0 + m22*c0*c0 + 2*m12*c0*s0;
 m[c][c-1] = m[c-1][c] = m01*c0 - z*s0;
 m[c][c+1] = m[c+1][c] = (m11-m22)*c0*s0 + m12*(c0*c0-s0*s0);
 if(c<n-2) {
  m23 = m[c+1][c+2];
  m[c+2][c+1] = m[c+1][c+2] = m23*c0;
  z = -m23*s0;
 }
}
and removing the statement copying the bulge into z
if(c===0) {
 if(Math.abs(m[n-1][n-2])<=e) return false;
 x = m[c][c] - wilkinson(m);
 z = m[c+1][c];
}
else {
 x = m[c][c-1];
}
as done in program 6.
Program 6: Keeping The Bulge Separate



Since we know that \(\mathbf{M}\) is symmetric at every step we can simply update its lower diagonal at each step with
if(c===0) {
 m00 = m[0][0]; m10 = m[1][0];
 m11 = m[1][1]; m21 = m[2][1];
 m[0][0] = m00*c0*c0 + m11*s0*s0 - 2*m10*c0*s0;
 m[1][1] = m00*s0*s0 + m11*c0*c0 + 2*m10*c0*s0;
 m[1][0] = (m00-m11)*c0*s0 + m10*(c0*c0-s0*s0);
 m[2][1] = m21*c0;
 z = -m21*s0;
}
else {
 m10 = m[c][c-1]; m11 = m[c][c];
 m21 = m[c+1][c]; m22 = m[c+1][c+1];
 m[c][c] = m11*c0*c0 + m22*s0*s0 - 2*m21*c0*s0;
 m[c+1][c+1] = m11*s0*s0 + m22*c0*c0 + 2*m21*c0*s0;
 m[c][c-1] = m10*c0 - z*s0;
 m[c+1][c] = (m11-m22)*c0*s0 + m21*(c0*c0-s0*s0);
 if(c<n-2) {
  m23 = m[c+2][c+1];
  m[c+2][c+1] = m23*c0;
  z = -m23*s0;
 }
}
and copy its elements into the upper diagonal at the end of the iteration with
if(c===0 && Math.abs(m[n-1][n-2])<=e) {
 for(cc=0;cc<n-1;++cc) m[cc][cc+1] = m[cc+1][cc];
}
as demonstrated by program 7.
Program 7: Ignoring The Upper Diagonal



We can save ourselves a few multiplication operations by calculating the values of \(\sin^2\theta\), \(\cos^2\theta\), \(\cos\theta \times \sin\theta\) and \(2 \times m_{c+1,c} \times \cos\theta \times \sin\theta\) and saving them in the local variables s0s0, c0c0, c0s0 and mcs with
h = ak.hypot(x, z);
s0 = -z/h; s0s0 = s0*s0;
c0 = x/h; c0c0 = c0*c0;
c0s0 = c0*s0;
if(c===0) {
 m00 = m[0][0]; m10 = m[1][0];
 m11 = m[1][1]; m21 = m[2][1];
 mcs = 2*m10*c0s0;
 m[0][0] = m00*c0c0 + m11*s0s0 - mcs;
 m[1][1] = m00*s0s0 + m11*c0c0 + mcs;
 m[1][0] = (m00-m11)*c0s0 + m10*(c0c0-s0s0);
 m[2][1] = m21*c0;
 z = -m21*s0;
}
else {
 m10 = m[c][c-1]; m11 = m[c][c];
 m21 = m[c+1][c]; m22 = m[c+1][c+1];
 mcs = 2*m21*c0s0;
 m[c][c] = m11*c0c0 + m22*s0s0 - mcs;
 m[c+1][c+1] = m11*s0s0 + m22*c0c0 + mcs;
 m[c][c-1] = m10*c0 - z*s0;
 m[c+1][c] = (m11-m22)*c0s0 + m21*(c0c0-s0s0);
 if(c<n-2) {
  m23 = m[c+2][c+1];
  m[c+2][c+1] = m23*c0;
  z = -m23*s0;
 }
}
as used by program 8.
Program 8: Fewer Multiplication Operations



Whilst this is a reasonably efficient implementation it isn't particularly robust. In particular, if the first row and column of \(\mathbf{M}\) only contain zeros then at the start of the first rotation z, and consequently s0, will equal zero and so z will still be zero at its end. When the second rotation starts x will also be set to zero and therefore s0 and c0 will be NaNs which will then propagate through \(\mathbf{M}\), as demonstrated by program 9.
Program 9: Oopsie



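The mechanics of this failure are easy to reproduce in isolation: once the normalising factor is zero the ratios are NaNs, and since every comparison involving a NaN is false a negated comparison makes a termination test that NaNs also satisfy.

```javascript
// When both x and z are zero the normalising factor h is zero, the sine
// and cosine are NaNs and they propagate through any arithmetic that
// uses them.
var x = 0, z = 0;
var h = Math.hypot(x, z);      // 0
var s0 = -z/h, c0 = x/h;       // both NaN
var m00 = 5*c0*c0 + 3*s0*s0;   // NaN

// comparisons with NaN are always false, so negating one yields a
// test that also succeeds for NaNs
var e = 1e-10;
var halts = !(Math.abs(m00) > e);
```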
Note that the termination criterion has been changed to
if(!(Math.abs(m[n-1][n-2])>e)) return false;
to ensure that the program stops if the last off diagonal element is NaN.
In such cases we can simply ignore the first row and column, working instead on the submatrix formed from the remaining ones, since they are already diagonal and we aren't going to need to transform them. In fact, we should skip past every initial row and column that are zero on their off diagonals for the same reason.
To do so we simply replace the initial column index of zero with one of o, which we search for at the start of the process with
for(o=0;o<n-2 && Math.abs(m[o+1][o])<=e;++o);
although we now also have to take care not to read and write past the end of \(\mathbf{M}\) by changing the first step of each iteration to
if(c===o) {
 m00 = m[c][c]; m10 = m[c+1][c]; m11 = m[c+1][c+1];
 mcs = 2*m10*c0s0;
 m[c][c] = m00*c0c0 + m11*s0s0 - mcs;
 m[c+1][c+1] = m00*s0s0 + m11*c0c0 + mcs;
 m[c+1][c] = (m00-m11)*c0s0 + m10*(c0c0-s0s0);
 if(c<n-2) {
  m21 = m[c+2][c+1];
  m[c+2][c+1] = m21*c0;
  z = -m21*s0;
 }
}
Program 10 shows the results of changing the initial index in this way.
Program 10: Starting At An Offset



In a similar vein, if there's a zero somewhere further along the off diagonal then we should first transform the submatrix in the rows and columns before it and then transform the one in the rows and columns after it, avoiding another source of NaNs. To do this we can replace the size of the matrix n with the size of the submatrix nn that we should be working on by searching for such a zero with
for(nn=o+1;nn<n-1 && Math.abs(m[nn+1][nn])>e;++nn);
++nn;
as done in program 11.
Program 11: Stopping At A Zero Off Diagonal



Note that we still have to update all of the rows of \(\mathbf{Q}\), not just those up to the stopping point.
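The searches for the offset and the stopping point are simple enough to trace by hand. Here's a standalone sketch of them driven by an array of off diagonal elements, with d1[i] standing for the element below the ith element of the leading diagonal; the array's values are an arbitrary example.

```javascript
// Finds the offset o of the first nonzero off diagonal and the bound nn
// of the submatrix ending at the next zero, for a 7x7 tridiagonal matrix
// represented by its off diagonals alone.
var d1 = [0, 0, 2, 1, 0, 3];
var n = d1.length + 1;
var e = 1e-10;
var o, nn;

// skip past leading rows and columns with zero off diagonals
for(o=0;o<n-2 && Math.abs(d1[o])<=e;++o);

// stop at the next zero off diagonal, or the end of the matrix
for(nn=o+1;nn<n-1 && Math.abs(d1[nn])>e;++nn);
++nn;
// o is 2 and nn is 5, so the rotations run over rows two to four
```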
Now, we're currently using the absolute termination criterion
!(Math.abs(m[nn-1][nn-2])>e)
which is unsuitable if the matrix has large elements. A more reasonable approach is to use
!(Math.abs(m[nn-1][nn-2])>e*(1+Math.abs(m[nn-2][nn-2])+Math.abs(m[nn-1][nn-1])))
which will tend to an absolute comparison when the diagonal elements are small and to a relative comparison to their magnitudes when they are large, as used in program 12.
Program 12: A Better Termination Criterion



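To see why this behaves as claimed, compare the bounds that the mixed criterion yields for small and for large diagonal elements; the threshold e and the element values here are arbitrary.

```javascript
// The mixed bound e*(1+|a|+|b|) for adjacent diagonal elements a and b.
var e = 1e-8;

// for order one diagonals it is within a small factor of the absolute
// threshold e itself
var small = e*(1 + Math.abs(0.5) + Math.abs(2));

// for large diagonals it is effectively relative to their magnitudes
var large = e*(1 + Math.abs(1e10) + Math.abs(3e10));

// an off diagonal of one is negligible beside diagonals of order 1e10
var negligible = !(1 > large);
```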
Finally, since we're no longer using any elements of \(\mathbf{M}\) beyond the upper and lower diagonals we might as well operate directly on the arrays d0 and d1 returned by our implementation of the Householder tridiagonalisation algorithm, containing its diagonal and off diagonal elements respectively. The givens function defined in listing 1 does just that, also bringing the Wilkinson shift inline and returning the offset at the end, or NaN if the loop terminates because an element of \(\mathbf{M}\) is NaN, propagating those on the off diagonal of the latter into the associated elements on its diagonal.

Listing 1: The givens Function

function givens(q, d0, d1, o, e, steps) {
 var n = q.length;
 var step = 0;
 var nn, c, r, r0, r1, qr, qr0, qr1, d01, d02, d12, d, x, z, h;
 var s0, c0, s0s0, c0c0, c0s0;
 var m00, m10, m11, m21, m22, m23, mcs;

 while(o<n-2 && Math.abs(d1[o])<=e*(1+Math.abs(d0[o])+Math.abs(d0[o+1]))) ++o;
 c = o;
 for(nn=o+1;nn<n-1 && Math.abs(d1[nn])>e*(1+Math.abs(d0[nn])+Math.abs(d0[nn+1]));++nn);
 ++nn;

 while(c!==o || Math.abs(d1[nn-2])>e*(1+Math.abs(d0[nn-2])+Math.abs(d0[nn-1]))) {
  if(step++===steps) throw new Error('failure to converge in givens');

  if(c===o) {
   d01 = d0[nn-1]; d02 = d0[nn-2]; d12 = d1[nn-2];
   d = (d02-d01)/2;
   x = d>0 ? d0[c] - d01 + d12*d12/(d + ak.hypot(d, d12))
           : d0[c] - d01 + d12*d12/(d - ak.hypot(d, d12));
   z = d1[c];
  }
  else {
   x = d1[c-1];
  }

  h = ak.hypot(x, z);
  s0 = -z/h; s0s0 = s0*s0;
  c0 = x/h; c0c0 = c0*c0;
  c0s0 = c0*s0;

  if(c===o) {
   m00 = d0[c]; m10 = d1[c]; m11 = d0[c+1];
   mcs = 2*m10*c0s0;
   d0[c] = m00*c0c0 + m11*s0s0 - mcs;
   d0[c+1] = m00*s0s0 + m11*c0c0 + mcs;
   d1[c] = (m00-m11)*c0s0 + m10*(c0c0-s0s0);
   if(c<nn-2) {
    m21 = d1[c+1];
    d1[c+1] = m21*c0;
    z = -m21*s0;
   }
  }
  else {
   m10 = d1[c-1]; m11 = d0[c];
   m21 = d1[c]; m22 = d0[c+1];
   mcs = 2*m21*c0s0;
   d0[c] = m11*c0c0 + m22*s0s0 - mcs;
   d0[c+1] = m11*s0s0 + m22*c0c0 + mcs;
   d1[c-1] = m10*c0 - z*s0;
   d1[c] = (m11-m22)*c0s0 + m21*(c0c0-s0s0);
   if(c<nn-2) {
    m23 = d1[c+1];
    d1[c+1] = m23*c0;
    z = -m23*s0;
   }
  }

  for(r=0;r<n;++r) {
   qr = q[r];
   qr0 = qr[c]; qr1 = qr[c+1];
   qr[c] = qr0*c0 - qr1*s0;
   qr[c+1] = qr0*s0 + qr1*c0;
  }

  if(++c===nn-1) c = o;
 }

 if(isNaN(d1[nn-2])) d0[nn-1] = d0[nn-2] = ak.NaN;
 if(isNaN(d0[nn-1]) || isNaN(d0[nn-2])) return ak.NaN;
 return o;
}
Note that it also sets an upper bound on the number of iterations with the steps argument, throwing an error if it is exceeded.

Program 13 applies it to a matrix with a similar form to those in programs 11 and 12, having zeros in the first row and column and an off diagonal zero before the end, printing the accumulated rotations above the arrays d0 and d1.

Program 13: Using The givens Function



To diagonalise a symmetric real matrix \(\mathbf{M}\) we repeatedly call the givens function with the updated offsets until it reaches the bottom right corner of the matrix, as demonstrated by program 14 which also prints the sum of the magnitudes of the elements of d1 at the end.
Program 14: Diagonalising M



Note that the program terminates when the offset is no longer less than n-2 since this can only happen if we encounter a NaN, if every element of d1 was zero before we called the givens function or if its only nonzero element was its last, which the function will have then transformed to zero.

Wilkinson and Reinsch's^{[7]} ALGOL procedure tred2, which is probably best known from its translations into FORTRAN^{[8]} and C^{[9]}, goes a little further than the givens function by removing many of the local variables and eliminating some of the arithmetic operations used to apply the Givens rotations to \(\mathbf{M}\). Since the accumulation of those rotations in \(\mathbf{Q}\) is an order of magnitude more costly, my preference is to keep them as they are and so we shall be using the givens function as it is when we add the spectral decomposition to the ak library in the next post.
References
[1] FAO The Householder, www.thusspakeak.com, 2019.
[2] Spryer Francis, www.thusspakeak.com, 2019.
[3] Francis, J., The QR Transformation - Part 1, The Computer Journal, Vol. 4, No. 3, Oxford University Press, 1961.
[4] Francis, J., The QR Transformation - Part 2, The Computer Journal, Vol. 4, No. 4, Oxford University Press, 1961.
[5] A Well Managed Household, www.thusspakeak.com, 2020.
[6] Golub, G. and Van Loan, C., Matrix Computations (3rd ed.), Johns Hopkins University Press, 1996.
[7] Wilkinson, J. & Reinsch, C., Handbook for Automatic Computation, Volume II: Linear Algebra, Springer, 1971.
[8] Garbow, Burton S., EISPACK — A package of matrix eigensystem routines, Computer Physics Communications, Vol. 7, 1974.
[9] Press, W.H. et al, Numerical Recipes in C (2nd ed.), Cambridge University Press, 1992.