Over the last few months we have seen how we can use a sequence of Householder transformations^{[1]} followed by a sequence of shifted Givens rotations^{[2]} to efficiently find the spectral decomposition of a symmetric real matrix \(\mathbf{M}\), formed from a matrix \(\mathbf{V}\) and a diagonal matrix \(\boldsymbol{\Lambda}\) satisfying
\[
\mathbf{M} \times \mathbf{V} = \mathbf{V} \times \boldsymbol{\Lambda}
\]
implying that the columns of \(\mathbf{V}\) are the unit eigenvectors of \(\mathbf{M}\) and their associated elements on the diagonal of \(\boldsymbol{\Lambda}\) are their eigenvalues so that
\[
\mathbf{V} \times \mathbf{V}^\mathrm{T} = \mathbf{I}
\]
where \(\mathbf{I}\) is the identity matrix, and therefore
\[
\mathbf{M} = \mathbf{V} \times \boldsymbol{\Lambda} \times \mathbf{V}^\mathrm{T}
\]
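As a concrete check of these identities, consider a hypothetical two by two example: \(\mathbf{M} = \begin{pmatrix}2&1\\1&2\end{pmatrix}\) has unit eigenvectors \((1,1)/\sqrt{2}\) and \((1,-1)/\sqrt{2}\) with eigenvalues three and one. The helpers mul and transpose are naive sketches defined here purely for illustration rather than anything from the ak library.

```javascript
// naive matrix product of two-dimensional arrays, for illustration only
function mul(a, b) {
  var n = a.length, m = b[0].length, l = b.length, r, c, k, s, x = [];
  for(r=0;r<n;++r) {
    x.push([]);
    for(c=0;c<m;++c) {
      s = 0;
      for(k=0;k<l;++k) s += a[r][k]*b[k][c];
      x[r].push(s);
    }
  }
  return x;
}

// naive transpose of a two-dimensional array
function transpose(a) {
  var n = a.length, m = a[0].length, r, c, x = [];
  for(c=0;c<m;++c) {
    x.push([]);
    for(r=0;r<n;++r) x[c].push(a[r][c]);
  }
  return x;
}

var root = Math.sqrt(0.5);
var M = [[2,1],[1,2]];
var V = [[root,root],[root,-root]];   // unit eigenvectors as columns
var Lambda = [[3,0],[0,1]];           // associated eigenvalues

var MV = mul(M, V);                       // should equal V*Lambda
var VLambda = mul(V, Lambda);
var R = mul(VLambda, transpose(V));       // should recover M
```

Up to rounding error MV equals VLambda and R recovers M, confirming both identities for this example.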
From a mathematical perspective the combination of Householder transformations and shifted Givens rotations is particularly appealing, converging on the spectral decomposition after relatively few matrix multiplications, but from an implementation perspective using ak.matrix
multiplication operations is less than satisfactory since it wastefully creates new ak.matrix
objects at each step and so in this post we shall see how we can start to do better.
Householder Tridiagonalisation
Recall that we use a sequence of Householder transformations \(\mathbf{H}_i\) to reduce a symmetric real matrix \(\mathbf{M}\) to tridiagonal form, meaning that if we define
\[
\mathbf{Q} = \prod_{i=0}^{n-3} \mathbf{H}_i
\]
where \(\prod\) is the product sign, then
\[
\mathbf{M}^\prime = \mathbf{Q}^\mathrm{T} \times \mathbf{M} \times \mathbf{Q}
\]
is symmetric and has no nonzero elements above the upper diagonal or below the lower diagonal. Each of those transformations represents a reflection of a vector through a plane and is defined by a symmetric real matrix
\[
\mathbf{H}_i = \mathbf{I} - 2 \times \hat{\mathbf{a}} \otimes \hat{\mathbf{a}}
\]
where \(\mathbf{I}\) is the identity matrix whose only nonzero elements are ones on the leading diagonal and \(\otimes\) is the outer product satisfying
\[
\left(\hat{\mathbf{a}} \otimes \hat{\mathbf{a}}\right)_{jk} = \hat{a}_j \times \hat{a}_k
\]
Representing the \(i^\mathrm{th}\) column of \(\mathbf{M}\) after the first \(i\) steps as a vector \(\mathbf{x}\), we showed that the elements of \(\hat{\mathbf{a}}\) must be given by
\[
\hat{a}_j
=
\begin{cases}
0 & j < i+1\\
\\
\dfrac{x_j - s}{\left(2 \times s \times \left(s - x_{i+1}\right)\right)^\frac12} & j = i+1\\
\\
\dfrac{x_j}{\left(2 \times s \times \left(s - x_{i+1}\right)\right)^\frac12} & j > i+1
\end{cases}
\]
where
\[
s =
\begin{cases}
+\left( \sum_{j=i+1}^{n-1} \; x_j^2\right)^\frac12 & x_{i+1} \leqslant 0\\
-\left( \sum_{j=i+1}^{n-1} \; x_j^2\right)^\frac12 & x_{i+1} > 0
\end{cases}
\]
Program 1 demonstrates that this works by writing out the accumulated transformations above the transformed matrix at each step.
Program 1: Householder Transformations



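The case analysis above can be sketched in isolation as follows; reflectionVector is a name chosen for illustration and isn't part of the ak library, and the sketch assumes that the elements of x below row i aren't all zero.

```javascript
// given the current column x and the step i, build the reflection
// vector a-hat according to the cases above
function reflectionVector(x, i) {
  var n = x.length, j, s = 0, d, a = new Array(n);

  // s is the length of the lower part of x, with the opposite sign to x[i+1]
  for(j=i+1;j<n;++j) s += x[j]*x[j];
  s = Math.sqrt(s);
  if(x[i+1]>0) s = -s;

  // the normalising denominator; zero if the lower elements are all zero
  d = Math.sqrt(2*s*(s-x[i+1]));

  for(j=0;j<=i;++j)  a[j] = 0;
  a[i+1] = (x[i+1]-s)/d;
  for(j=i+2;j<n;++j) a[j] = x[j]/d;
  return a;
}
```

Note that the resulting vector always has unit length, as required for the Householder matrix to be a reflection.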
Note that this includes a check that we can recover the original matrix from the accumulated transformations \(\mathbf{Q}\) and the tridiagonal matrix \(\mathbf{M}^\prime\) with
\[
\mathbf{M} = \mathbf{Q} \times \mathbf{M}^\prime \times \mathbf{Q}^\mathrm{T}
\]
by printing the distance between the matrices on the left and right hand sides of the equals sign at the end, defined as the square root of the sum of the squared differences between their elements and calculated using ak.dist
.
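That distance can be sketched for matrices held as two-dimensional arrays with the following hedged approximation of the check; ak.dist itself operates upon ak.matrix objects rather than arrays.

```javascript
// the square root of the sum of the squared differences between the
// elements of two equally sized two-dimensional arrays
function dist(a, b) {
  var s = 0, r, c, d;
  for(r=0;r<a.length;++r) {
    for(c=0;c<a[r].length;++c) {
      d = a[r][c]-b[r][c];
      s += d*d;
    }
  }
  return Math.sqrt(s);
}
```

A distance close to zero indicates that the recovered matrix agrees with the original to within rounding error.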
A Time For Reflection
Recall that the reason why the Householder matrices represent reflections is because, for a vector \(\mathbf{x}\), we have
\[
\left(\mathbf{I} - 2 \times \hat{\mathbf{a}} \otimes \hat{\mathbf{a}}\right) \times \mathbf{x}
= \mathbf{x} - 2 \times \left(\hat{\mathbf{a}} \otimes \hat{\mathbf{a}}\right) \times \mathbf{x}
= \mathbf{x} - 2 \times \hat{\mathbf{a}} \times \left(\hat{\mathbf{a}} \times \mathbf{x}\right)
\]
and that, since the last term represents the length of the projection of \(\mathbf{x}\) onto \(\hat{\mathbf{a}}\), this is the reflection of the former through the plane, or technically the multidimensional hyperplane, that the latter passes through at right angles, or technically orthogonally.

Now, since we know that this is equivalent to the matrix multiplications that we have been using so far, we can transform each column of a matrix that is being premultiplied by a Householder matrix, or equivalently each row of a matrix that is being postmultiplied by one, by treating it as the vector \(\mathbf{x}\). From an implementation perspective this is tremendously convenient since we can calculate \(\hat{\mathbf{a}} \times \mathbf{x}\) up front and then apply the transformation in-place.

Specifically, we can use two-dimensional arrays to represent \(\mathbf{M}\) and \(\mathbf{Q}\) and replace the
ak.matrix
operations in the inner loop of program 1 with
for(cc=0;cc<n;++cc) {
p = 0;
for(rr=0;rr<n;++rr) p += a[rr]*m[rr][cc];
for(rr=0;rr<n;++rr) m[rr][cc] -= 2*a[rr]*p;
}
for(rr=0;rr<n;++rr) {
p = 0;
for(cc=0;cc<n;++cc) p += m[rr][cc]*a[cc];
for(cc=0;cc<n;++cc) m[rr][cc] -= 2*a[cc]*p;
}
for(rr=0;rr<n;++rr) {
p = 0;
for(cc=0;cc<n;++cc) p += q[rr][cc]*a[cc];
for(cc=0;cc<n;++cc) q[rr][cc] -= 2*a[cc]*p;
}
Here the product \(\hat{\mathbf{a}} \times \mathbf{x}\) for the columns and/or rows of \(\mathbf{M}\) and \(\mathbf{Q}\) is being accumulated in
p
, after which we directly store the Householder reflections of them in their arrays. Program 2 demonstrates that this is equivalent to the
ak.matrix
based implementation of program 1 using the same final check of correctness.
Program 2: InPlace Householder Transformations



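The in-place scheme can be illustrated on a single vector with the following minimal sketch, in which reflect is a name chosen for illustration: we accumulate \(p = \hat{\mathbf{a}} \times \mathbf{x}\) first and then subtract \(2 \times \hat{a}_j \times p\) from each element, reflecting x through the hyperplane orthogonal to the unit vector a.

```javascript
// reflect the vector x, in place, through the hyperplane that the
// unit vector a passes through orthogonally
function reflect(x, a) {
  var p = 0, j;
  for(j=0;j<x.length;++j) p += a[j]*x[j];   // p = a.x
  for(j=0;j<x.length;++j) x[j] -= 2*a[j]*p; // x = x - 2*a*(a.x)
  return x;
}
```

Reflecting twice recovers the original vector, as we should expect given that Householder matrices satisfy \(\mathbf{H} \times \mathbf{H} = \mathbf{I}\).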
A Time For Refactoring
Whilst this is certainly an improvement it is by no means optimal and so it is well worth spending some time tuning the implementation. For a start, our replacement of the matrix operations in the inner loop is wasting time updating elements of
m
and q
that we know won't change. Specifically, since every element of a
up to a[c]
equals zero, there's no point in including them in the accumulation of p
and so we start it from c+1
with
for(cc=0;cc<n;++cc) {
p = 0;
for(rr=c+1;rr<n;++rr) p += a[rr]*m[rr][cc];
for(rr=c+1;rr<n;++rr) m[rr][cc] -= 2*a[rr]*p;
}
for(rr=0;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += m[rr][cc]*a[cc];
for(cc=c+1;cc<n;++cc) m[rr][cc] -= 2*a[cc]*p;
}
for(rr=0;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += q[rr][cc]*a[cc];
for(cc=c+1;cc<n;++cc) q[rr][cc] -= 2*a[cc]*p;
}
and remove the code that explicitly sets them to zero
for(r=0;r<c+1;++r) a[r] = 0;
Furthermore, we know that the Householder reflections don't affect elements before row and column
c
in m
or in the first row of q
and so we can remove them from the loops by changing their starting values to
for(cc=c;cc<n;++cc) {
p = 0;
for(rr=c+1;rr<n;++rr) p += a[rr]*m[rr][cc];
for(rr=c+1;rr<n;++rr) m[rr][cc] -= 2*a[rr]*p;
}
for(rr=c;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += m[rr][cc]*a[cc];
for(cc=c+1;cc<n;++cc) m[rr][cc] -= 2*a[cc]*p;
}
for(rr=1;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += q[rr][cc]*a[cc];
for(cc=c+1;cc<n;++cc) q[rr][cc] -= 2*a[cc]*p;
}
as demonstrated by program 3.
Program 3: Updated Inner Loops



We specifically chose the Householder reflections to set elements beyond the upper and lower diagonals to zero and so it's not really worth calculating their values when we can explicitly set them to zero with
p = 0;
for(rr=c+1;rr<n;++rr) p += a[rr]*m[rr][c];
m[c+1][c] -= 2*a[c+1]*p;
for(rr=c+2;rr<n;++rr) m[rr][c] = 0;
and
p = 0;
for(cc=c+1;cc<n;++cc) p += m[c][cc]*a[cc];
m[c][c+1] -= 2*a[c+1]*p;
for(cc=c+2;cc<n;++cc) m[c][cc] = 0;
which just calculate the values in the upper and lower diagonals; program 4 shows that this yields the same results.
Program 4: Known Zeros In The Reflections



Now we also know that
m
is symmetric at every step and so we can merge those two operations into one with
p = 0;
for(cc=c+1;cc<n;++cc) p += m[c][cc]*a[cc];
m[c+1][c] = m[c][c+1] -= 2*a[c+1]*p;
for(cc=c+2;cc<n;++cc) m[cc][c] = m[c][cc] = 0;
as done by program 5.
Program 5: Current Row And Column Symmetry



The symmetry of
m
means that we don't really need the values of its elements both above and below the leading diagonal and so we can save the nonzero elements of a
in the upper part of m
with
p = 0;
for(cc=c+1;cc<n;++cc) p += m[c][cc]*a[cc];
m[c+1][c] -= 2*a[c+1]*p;
for(cc=c+1;cc<n;++cc) m[c][cc] = a[cc];
so that we can update
m
with
for(cc=c+1;cc<n;++cc) {
p = 0;
for(rr=c+1;rr<n;++rr) p += m[c][rr]*m[rr][cc];
for(rr=c+1;rr<n;++rr) m[rr][cc] -= 2*m[c][rr]*p;
}
for(rr=c+1;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += m[rr][cc]*m[c][cc];
for(cc=c+1;cc<n;++cc) m[rr][cc] -= 2*m[c][cc]*p;
}
and use them to calculate
q
after we've finished calculating m
with
for(rr=1;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += q[rr][cc]*m[c][cc];
for(cc=c+1;cc<n;++cc) q[rr][cc] -= 2*m[c][cc]*p;
}
m[c][c+1] = m[c+1][c];
for(cc=c+2;cc<n;++cc) m[cc][c] = m[c][cc] = 0;
removing them from
m
as we go, as shown by program 6.
Program 6: Save Reflections In
m



We no longer need
a
to store the reflections since we can put them directly into the upper triangle of m
with
s = 0;
for(r=c+1;r<n;++r) s += m[r][c]*m[r][c];
s = Math.sqrt(s);
if(m[c+1][c]>0) s = -s;
d = Math.sqrt(2*s*(s-m[c+1][c]));
m[c][c+1] -= s;
for(cc=c+1;cc<n;++cc) m[c][cc] /= d;
changing the calculation of the lower diagonal elements of
m
to
p = 0;
for(cc=c+1;cc<n;++cc) p += m[cc][c]*m[c][cc];
m[c+1][c] -= 2*m[c][c+1]*p;
as demonstrated by program 7.
Program 7: Write Reflections To
m



Now we can calculate the elements of
q
in reverse order by finally iterating c
backwards from n-3
to zero and using
for(rr=c+1;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += q[cc][rr]*m[c][cc];
for(cc=c+1;cc<n;++cc) q[cc][rr] -= 2*m[c][cc]*p;
}
changing the start of the iteration of
rr
because the upper elements of q
equal zero before that. Note that this means that we're applying the Householder transformations from the last to the first and so must switch the rows and columns of
q
to transpose it, swapping their order, which program 8 shows yields the same results.
Program 8: Calculating
q In Reverse



Since we're updating
q
backward we know that the rows and columns that we've yet to calculate are those of an identity matrix, which means that the elements on their leading diagonals equal one and the others equal zero. We can consequently save ourselves a little work by not bothering to accumulate p
for the first row by noting that
p = 0;
for(cc=c+1;cc<n;++cc) p += q[cc][c+1]*m[c][cc];
for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*m[c][cc]*p;
is equivalent to
for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*m[c][cc]*m[c][c+1];
since we could define
p
with
p = q[c+1][c+1]*m[c][c+1]
and so we can use
for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*m[c][cc]*m[c][c+1];
for(rr=c+2;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += q[cc][rr]*m[c][cc];
for(cc=c+1;cc<n;++cc) q[cc][rr] -= 2*m[c][cc]*p;
}
instead, as shown by program 9.
Program 9: Known Zeros In
q



Given that we're calculating the elements of
q
from the bottom right we can use its upper elements to store the Householder transformations provided we clear them with rows of an identity matrix after we're done with them. We start by setting q
to an array of the elements of the matrix with
q = m0.toArray();
and
m
to the elements of a matrix of zeros
var m = ak.matrix(n,n,0).toArray();
We then replace
m
with q
in the calculation of the reflections
s = 0;
for(r=c+1;r<n;++r) s += q[r][c]*q[r][c];
s = Math.sqrt(s);
if(q[c+1][c]>0) s = -s;
d = Math.sqrt(2*s*(s-q[c+1][c]));
q[c][c+1] -= s;
for(cc=c+1;cc<n;++cc) q[c][cc] /= d;
Next we use
q
instead of m
in the inner loops
p = 0;
for(cc=c+1;cc<n;++cc) p += q[cc][c]*q[c][cc];
q[c+1][c] -= 2*q[c][c+1]*p;
for(cc=c+1;cc<n;++cc) {
p = 0;
for(rr=c+1;rr<n;++rr) p += q[c][rr]*q[rr][cc];
for(rr=c+1;rr<n;++rr) q[rr][cc] -= 2*q[c][rr]*p;
}
for(rr=c+1;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += q[rr][cc]*q[c][cc];
for(cc=c+1;cc<n;++cc) q[rr][cc] -= 2*q[c][cc]*p;
}
copying the diagonal and off-diagonal elements into
m
with
m[c][c] = q[c][c];
m[c][c+1] = m[c+1][c] = q[c+1][c];
We need to copy the bottomright two by two submatrix into
m
at the last step and then set the last row of q
to that of an identity matrix which we can do with
if(c===n-3) {
m[n-2][n-2] = q[n-2][n-2]; m[n-1][n-1] = q[n-1][n-1];
m[n-2][n-1] = m[n-1][n-2] = q[n-1][n-2];
for(cc=0;cc<n-1;++cc) q[n-1][cc] = 0;
q[n-1][n-1] = 1;
}
noting that the latter doesn't contain a reflection.
Now the reflections are stored in the row before those that they should be applied to, meaning that we can set the current row to that of an identity matrix before applying them with
for(cc=0;cc<n;++cc) q[c+1][cc] = 0;
q[c+1][c+1] = 1;
for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*q[c][cc]*q[c][c+1];
for(rr=c+2;rr<n;++rr) {
p = 0;
for(cc=c+1;cc<n;++cc) p += q[cc][rr]*q[c][cc];
for(cc=c+1;cc<n;++cc) q[cc][rr] -= 2*q[c][cc]*p;
}
Finally we must fill the first row of
q
with that of an identity matrix using
if(c===0) {
for(cc=1;cc<n;++cc) q[0][cc] = 0;
q[0][0] = 1;
}
which program 10 shows has the same results.
Program 10: Switching
m And q



From a computational complexity perspective we are almost done but before we finish there's a numerical stability issue to address. Specifically, the sum
s = 0;
for(r=c+1;r<n;++r) s += q[r][c]*q[r][c];
is susceptible to both overflow and underflow. Since the numerator and denominator in the formula for \(\hat{a}\) are equally scaled if we multiply the elements of the matrix by a constant, we are free to do so without changing the result. In particular, we can divide them by the sum of the absolute values of the lower elements of a column, given by
z = 0;
for(r=c+1;r<n;++r) z += Math.abs(q[r][c]);
If this sum is greater than zero then we update the off-diagonal elements of
q
with
s = 0;
for(r=c+1;r<n;++r) {
q[c][r] = q[r][c] /= z;
s += q[r][c]*q[r][c];
}
as we accumulate
s
. This means that we need to multiply the off-diagonal elements of q
by z
as we calculate m
with
m[c][c+1] = m[c+1][c] = q[c+1][c]*z;
as demonstrated by program 11.
Program 11: Numerical Considerations



Note that this doesn't apply to the last two by two submatrix since it already contains the correct values.
If
z
equals zero then the elements below the diagonal in the column are already zero and we don't need to do anything, so we simply add one to it before copying the elements to m
to ensure that they're not scaled. If it's NaN then adding one leaves it as it is and so we put NaNs in m
. Knowing that every element of the symmetric
m
above and below the off-diagonals of its final form are zero means that it is far less wasteful to keep them in a pair of one-dimensional arrays rather than a two-dimensional array that is mostly filled with zeros. The
householder
function defined in listing 1 puts all of this together, saving rows of q
in variables whilst looping over them and exploiting its symmetry to use the elements in its rows in place of those in its columns where possible, returning an object containing the matrix of Householder transformations in a two-dimensional array q
, the leading diagonal elements of the transformed matrix in an array d0
and its off-diagonal elements in d1
.

Listing 1: The
householder
Function

function householder(q) {
 var n = q.length;
 var d0 = new Array(n);
 var d1 = new Array(n-1);
 var c, s, r, d, p, rr, cc, z, qc, qrr, q0;

 for(c=0;c<n-2;++c) {
  z = 0;
  qc = q[c];
  for(r=c+1;r<n;++r) z += Math.abs(qc[r]);

  if(z>0) {
   s = 0;
   for(r=c+1;r<n;++r) {
    qc[r] = q[r][c] /= z;
    s += qc[r]*qc[r];
   }
   s = Math.sqrt(s);
   if(qc[c+1]>0) s = -s;
   d = Math.sqrt(2*s*(s-qc[c+1]));
   qc[c+1] -= s;
   for(cc=c+1;cc<n;++cc) qc[cc] /= d;

   p = 0;
   for(cc=c+1;cc<n;++cc) p += q[cc][c]*qc[cc];
   q[c+1][c] -= 2*qc[c+1]*p;

   for(cc=c+1;cc<n;++cc) {
    p = 0;
    for(rr=c+1;rr<n;++rr) p += qc[rr]*q[rr][cc];
    for(rr=c+1;rr<n;++rr) q[rr][cc] -= 2*qc[rr]*p;
   }
   for(rr=c+1;rr<n;++rr) {
    qrr = q[rr];
    p = 0;
    for(cc=c+1;cc<n;++cc) p += qrr[cc]*qc[cc];
    for(cc=c+1;cc<n;++cc) qrr[cc] -= 2*qc[cc]*p;
   }
  }
  else {
   z += 1;
  }
  d0[c] = qc[c];
  d1[c] = q[c+1][c]*z;
 }
 d0[n-2] = q[n-2][n-2];
 d0[n-1] = q[n-1][n-1];
 d1[n-2] = q[n-1][n-2];

 for(cc=0;cc<n-1;++cc) q[n-1][cc] = 0;
 q[n-1][n-1] = 1;

 for(c=n-3;c>=0;--c) {
  for(cc=0;cc<n;++cc) q[c+1][cc] = 0;
  q[c+1][c+1] = 1;
  for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*q[c][cc]*q[c][c+1];
  for(rr=c+2;rr<n;++rr) {
   p = 0;
   for(cc=c+1;cc<n;++cc) p += q[cc][rr]*q[c][cc];
   for(cc=c+1;cc<n;++cc) q[cc][rr] -= 2*q[c][cc]*p;
  }
 }
 q0 = q[0];
 for(cc=1;cc<n;++cc) q0[cc] = 0;
 q0[0] = 1;

 return {q: q, d0: d0, d1: d1};
}
Program 12 demonstrates that the
householder
function behaves as expected, creating a tridiagonal matrix t
from the elements of d0
and d1
.
Program 12: Using The
householder Function



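The reconstruction of t from those arrays can be sketched as follows; tridiagonal is a name chosen for illustration rather than anything defined in program 12 itself.

```javascript
// rebuild the tridiagonal matrix as a two-dimensional array from the
// leading diagonal elements d0 and the off-diagonal elements d1
function tridiagonal(d0, d1) {
  var n = d0.length, t = [], r, c;
  for(r=0;r<n;++r) {
    t.push([]);
    for(c=0;c<n;++c) t[r].push(r===c ? d0[r] : 0);
  }
  // the matrix is symmetric so each off-diagonal element appears twice
  for(r=0;r<n-1;++r) t[r][r+1] = t[r+1][r] = d1[r];
  return t;
}
```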
It's worth noting at this point that it is more or less equivalent to the
tred2
subroutine in EISPACK^{[3]} which I suspect that most people, like me, first came across in its translation to C in Numerical Recipes^{[4]} and which I have used for guidance whilst refactoring. Next time we shall look to similarly improve the implementation of the Givens rotations to yield the spectral decomposition.
References
[1] FAO The Householder, www.thusspakeak.com, 2019.
[2] Spryer Francis, www.thusspakeak.com, 2019.
[3] Garbow, Burton S., EISPACK — A package of matrix eigensystem routines, Computer Physics Communications, Vol. 7, 1974
[4] Press, W.H. et al, Numerical Recipes in C (2nd ed.), Cambridge University Press, 1992.