A Well Managed Household

| 0 Comments

Over the last few months we have seen how we can use a sequence of Householder transformations[1] followed by a sequence of shifted Givens rotations[2] to efficiently find the spectral decomposition of a symmetric real matrix \(\mathbf{M}\), formed from a matrix \(\mathbf{V}\) and a diagonal matrix \(\boldsymbol{\Lambda}\) satisfying
\[ \mathbf{M} \times \mathbf{V} = \mathbf{V} \times \boldsymbol{\Lambda} \]
implying that the columns of \(\mathbf{V}\) are the unit eigenvectors of \(\mathbf{M}\) and their associated elements on the diagonal of \(\boldsymbol{\Lambda}\) are their eigenvalues so that
\[ \mathbf{V} \times \mathbf{V}^\mathrm{T} = \mathbf{I} \]
where \(\mathbf{I}\) is the identity matrix, and therefore
\[ \mathbf{M} = \mathbf{V} \times \boldsymbol{\Lambda} \times \mathbf{V}^\mathrm{T} \]
From a mathematical perspective the combination of Householder transformations and shifted Givens rotations is particularly appealing, converging on the spectral decomposition after relatively few matrix multiplications, but from an implementation perspective using ak.matrix multiplication operations is less than satisfactory since it wastefully creates new ak.matrix objects at each step and so in this post we shall see how we can start to do better.

Householder Tridiagonalisation

Recall that we use a sequence of Householder transformations \(\mathbf{H}_i\) to reduce a symmetric real matrix \(\mathbf{M}\) to tridiagonal form, meaning that if we define
\[ \mathbf{Q} = \prod_{i=0}^{n-3} \mathbf{H}_i \]
where \(\prod\) is the product sign, then
\[ \mathbf{M}^\prime = \mathbf{Q}^\mathrm{T} \times \mathbf{M} \times \mathbf{Q} \]
is symmetric and has no non-zero elements above the upper diagonal or below the lower diagonal. Each of those transformations represented a reflection of a vector through a plane and were defined by a symmetric real matrix
\[ \mathbf{H}_i = \mathbf{I} - 2 \times \hat{\mathbf{a}} \otimes \hat{\mathbf{a}} \]
where \(\mathbf{I}\) is the identity matrix whose only non-zero elements are ones on the leading diagonal and \(\otimes\) is the outer product satisfying
\[ \left(\hat{\mathbf{a}} \otimes \hat{\mathbf{a}}\right)_{jk} = \hat{a}_j \times \hat{a}_k \]
Representing the \(i^\mathrm{th}\) column of \(\mathbf{M}\) after the first \(i\) steps as a vector \(\mathbf{x}\), we showed that the elements of \(\hat{\mathbf{a}}\) must be given by
\[ \hat{a}_j = \begin{cases} 0 & j < i+1\\ \\ \dfrac{x_j - s}{\left(2 \times s \times \left(s - x_{i+1}\right)\right)^\frac12} & j = i+1\\ \\ \dfrac{x_j}{\left(2 \times s \times \left(s - x_{i+1}\right)\right)^\frac12} & j > i+1 \end{cases} \]
where
\[ s = \begin{cases} +\left( \sum_{j=i+1}^{n-1} \; x_j^2\right)^\frac12 & x_{i+1} \leqslant 0\\ -\left( \sum_{j=i+1}^{n-1} \; x_j^2\right)^\frac12 & x_{i+1} > 0 \end{cases} \]
Program 1 demonstrates that this works by writing out the accumulated transformations above the transformed matrix at each step.

Program 1: Householder Transformations

Note that this includes a check that the we can recover the original matrix from the accumulated transformations \(\mathbf{Q}\) and the tridiagonal matrix \(\mathbf{M}^\prime\) with
\[ \mathbf{M} = \mathbf{Q} \times \mathbf{M}^\prime \times \mathbf{Q}^\mathrm{T} \]
by printing the distance between the matrices on the left and right hand sides of the equals sign at the end, defined as the square root of the sum of the squared differences between their elements and calculated using ak.dist.

A Time For Reflection

Recall that the reason why the Householder matrices represent reflections is because, for a vector \(\mathbf{x}\), we have
\[ \left(\mathbf{I} - 2 \times \hat{\mathbf{a}} \otimes \hat{\mathbf{a}}\right) \times \mathbf{x} = \mathbf{x} - 2 \times \left(\hat{\mathbf{a}} \otimes \hat{\mathbf{a}}\right) \times \mathbf{x} = \mathbf{x} - 2 \times \hat{\mathbf{a}} \times \left(\hat{\mathbf{a}} \times \mathbf{x}\right) \]
and that, since the last term represents the length of the projection of \(\mathbf{x}\) onto \(\hat{\mathbf{a}}\), this is the reflection of the former through the plane, or technically the multidimensional hyperplane, that the latter passes through at right angles, or technically orthogonally.
Now, since we know that this is equivalent to the matrix multiplications that we have been using so far, we can transform each column of a matrix that is being pre-multiplied by a Householder matrix, or equivalently each row of a matrix that is being post-multiplied by one, by treating it as the vector \(\mathbf{x}\). From an implementation perspective this is tremendously convenient since we can calculate \(\hat{\mathbf{a}} \times \mathbf{x}\) up front and then apply the transformation in-place.
Specifically, we can use two-dimensional arrays to represent \(\mathbf{M}\) and \(\mathbf{Q}\) and replace the ak.matrix operations in the inner loop of program 1 with
for(cc=0;cc<n;++cc) { p = 0; for(rr=0;rr<n;++rr) p += a[rr]*m[rr][cc]; for(rr=0;rr<n;++rr) m[rr][cc] -= 2*a[rr]*p; } for(rr=0;rr<n;++rr) { p = 0; for(cc=0;cc<n;++cc) p += m[rr][cc]*a[cc]; for(cc=0;cc<n;++cc) m[rr][cc] -= 2*a[cc]*p; } for(rr=0;rr<n;++rr) { p = 0; for(cc=0;cc<n;++cc) p += q[rr][cc]*a[cc]; for(cc=0;cc<n;++cc) q[rr][cc] -= 2*a[cc]*p; }

Here the product \(\hat{\mathbf{a}} \times \mathbf{x}\) for the columns and/or rows of \(\mathbf{M}\) and \(\mathbf{Q}\) are being accumulated in p, after which we directly store the Householder reflections of them in their arrays.

Program 2 demonstrates that this is equivalent to the ak.matrix based implementation of program 1 using the same final check of correctness.

Program 2: In-Place Householder Transformations

A Time For Refactoring

Whilst this is certainly an improvement it is by no means optimal and so it is well worth spending some time tuning the implementation.

For a start, our replacement of the matrix operations in the inner loop is wasting time updating elements of m and q that we know won't change. Specifically, since every element of a up to a[c] equals zero and there's no point in including them in the accumulation of p and so we start it from c+1 with
for(cc=0;cc<n;++cc) { p = 0; for(rr=c+1;rr<n;++rr) p += a[rr]*m[rr][cc]; for(rr=c+1;rr<n;++rr) m[rr][cc] -= 2*a[rr]*p; } for(rr=0;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += m[rr][cc]*a[cc]; for(cc=c+1;cc<n;++cc) m[rr][cc] -= 2*a[cc]*p; } for(rr=0;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += q[rr][cc]*a[cc]; for(cc=c+1;cc<n;++cc) q[rr][cc] -= 2*a[cc]*p; }

and remove the code that explicitly sets them to zero
for(r=0;r<c+1;++r) a[r] = 0;

Furthermore, we know that the Householder reflections don't affect elements before row and column c in m or in the first row of q and so we can remove them from the the loops by changing their starting values to
for(cc=c;cc<n;++cc) { p = 0; for(rr=c+1;rr<n;++rr) p += a[rr]*m[rr][cc]; for(rr=c+1;rr<n;++rr) m[rr][cc] -= 2*a[rr]*p; } for(rr=c;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += m[rr][cc]*a[cc]; for(cc=c+1;cc<n;++cc) m[rr][cc] -= 2*a[cc]*p; } for(rr=1;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += q[rr][cc]*a[cc]; for(cc=c+1;cc<n;++cc) q[rr][cc] -= 2*a[cc]*p; }

as demonstrated by program 3.

Program 3: Updated Inner Loops

We specifically chose the Householder reflections to set elements beyond the upper and lower diagonals to zero and so it's not really worth calculating their values when we can explicitly set them to zero with
p = 0; for(rr=c+1;rr<n;++rr) p += a[rr]*m[rr][c]; m[c+1][c] -= 2*a[c+1]*p; for(rr=c+2;rr<n;++rr) m[rr][c] = 0;

and
p = 0; for(cc=c+1;cc<n;++cc) p += m[c][cc]*a[cc]; m[c][c+1] -= 2*a[c+1]*p; for(cc=c+2;cc<n;++cc) m[c][cc] = 0;

which just calculate the values in the upper and lower diagonals, which program 4 shows yields the same results.

Program 4: Known Zeros In The Reflections

Now we also know that m is symmetric at every step and so we can merge those two operations into one with
p = 0; for(cc=c+1;cc<n;++cc) p += m[c][cc]*a[cc]; m[c+1][c] = m[c][c+1] -= 2*a[c+1]*p; for(cc=c+2;cc<n;++cc) m[cc][c] = m[c][cc] = 0;

as done by program 5.

Program 5: Current Row And Column Symmetry

The symmetry of m means that we don't really need the values of its elements both above and below the leading diagonal and so we can save the non-zero elements of a in the upper part of m with
p = 0; for(cc=c+1;cc<n;++cc) p += m[c][cc]*a[cc]; m[c+1][c] -= 2*a[c+1]*p; for(cc=c+1;cc<n;++cc) m[c][cc] = a[cc];

so that we can update m with
for(cc=c+1;cc<n;++cc) { p = 0; for(rr=c+1;rr<n;++rr) p += m[c][rr]*m[rr][cc]; for(rr=c+1;rr<n;++rr) m[rr][cc] -= 2*m[c][rr]*p; } for(rr=c+1;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += m[rr][cc]*m[c][cc]; for(cc=c+1;cc<n;++cc) m[rr][cc] -= 2*m[c][cc]*p; }

and use them to calculate q after we've finished calculating m with
for(rr=1;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += q[rr][cc]*m[c][cc]; for(cc=c+1;cc<n;++cc) q[rr][cc] -= 2*m[c][cc]*p; } m[c][c+1] = m[c+1][c]; for(cc=c+2;cc<n;++cc) m[cc][c] = m[c][cc] = 0;

removing them from m as we go, as shown by program 6.

Program 6: Save Reflections In m

We no longer need a to store the reflections since we can put them directly into the upper triangle of m with
s = 0; for(r=c+1;r<n;++r) s += m[r][c]*m[r][c]; s = Math.sqrt(s); if(m[c+1][c]>0) s = -s; d = Math.sqrt(2*s*(s-m[c+1][c])); m[c][c+1] -= s; for(cc=c+1;cc<n;++cc) m[c][cc] /= d;

changing the calculation of the lower diagonal elements of m to
p = 0; for(cc=c+1;cc<n;++cc) p += m[cc][c]*m[c][cc]; m[c+1][c] -= 2*m[c][c+1]*p;

as demonstrated by program 7.

Program 7: Write Reflections To m

Now we can calculate the elements of q in reverse order by finally iterating c backwards from n-3 to zero and using
for(rr=c+1;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += q[cc][rr]*m[c][cc]; for(cc=c+1;cc<n;++cc) q[cc][rr] -= 2*m[c][cc]*p; }

changing the start of the iteration of rr because the upper elements of q equal zero before that.
Note that this means that we're applying the Householder transformations from the last to the first and so must switch the rows and columns of q to transpose it, swapping their order, which program 8 shows yields the same results.

Program 8: Calculating q In Reverse

Since we're updating q backward we know that the rows and columns that we've yet to calculate are those of an identity matrix, which means that the elements on their leading diagonals equal one and the others equal zero. We can consequently save ourselves a little work by not bothering to accumulate p for the first row by noting that
p = 0; for(cc=c+1;cc<n;++cc) p += q[cc][c+1]*m[c][cc]; for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*m[c][cc]*p;

is equivalent to
for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*m[c][cc]*m[c][c+1];

since we could define p with
p = q[c+1][c+1]*m[c][c+1]

and so we can use
for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*m[c][cc]*m[c][c+1]; for(rr=c+2;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += q[cc][rr]*m[c][cc]; for(cc=c+1;cc<n;++cc) q[cc][rr] -= 2*m[c][cc]*p; }

instead, as shown by program 9.

Program 9: Known Zeros In q

Given that we're calculating the elements of q from the bottom right we can use its upper elements to store the Householder transformations provided we clear them with rows of an identity matrix after we're done with them. We start by setting q to an array of the elements of the matrix with
q = m0.toArray();

and m to the elements of a matrix of zeros
var m = ak.matrix(n,n,0).toArray();

We then replace m with q in the calculation of the reflections
s = 0; for(r=c+1;r<n;++r) s += q[r][c]*q[r][c]; s = Math.sqrt(s); if(q[c+1][c]>0) s = -s; d = Math.sqrt(2*s*(s-q[c+1][c])); q[c][c+1] -= s; for(cc=c+1;cc<n;++cc) q[c][cc] /= d;

Next we use q instead of m in the inner loops
p = 0; for(cc=c+1;cc<n;++cc) p += q[cc][c]*q[c][cc]; q[c+1][c] -= 2*q[c][c+1]*p; for(cc=c+1;cc<n;++cc) { p = 0; for(rr=c+1;rr<n;++rr) p += q[c][rr]*q[rr][cc]; for(rr=c+1;rr<n;++rr) q[rr][cc] -= 2*q[c][rr]*p; } for(rr=c+1;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += q[rr][cc]*q[c][cc]; for(cc=c+1;cc<n;++cc) q[rr][cc] -= 2*q[c][cc]*p; }

copying the diagonal and off-diagonal elements into m with
m[c][c] = q[c][c]; m[c][c+1] = m[c+1][c] = q[c+1][c];

We need to copy the bottom-right two by two sub-matrix into m at the last step and then set the last row of q to that of an identity matrix which we can do with
if(c===n-3) { m[n-2][n-2] = q[n-2][n-2]; m[n-1][n-1] = q[n-1][n-1]; m[n-2][n-1] = m[n-1][n-2] = q[n-1][n-2]; for(cc=0;cc<n-1;++cc) q[n-1][cc] = 0; q[n-1][n-1] = 1; }

noting that the latter doesn't contain a reflection.
Now the reflections are stored in the row before those that they should be applied to, meaning that we can set the current row to that of an identity matrix before applying them with
for(cc=0;cc<n;++cc) q[c+1][cc] = 0; q[c+1][c+1] = 1; for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*q[c][cc]*q[c][c+1]; for(rr=c+2;rr<n;++rr) { p = 0; for(cc=c+1;cc<n;++cc) p += q[cc][rr]*q[c][cc]; for(cc=c+1;cc<n;++cc) q[cc][rr] -= 2*q[c][cc]*p; }

Finally we must fill the first row of q with that of an identity matrix using
if(c===0) { for(cc=1;cc<n;++cc) q[0][cc] = 0; q[0][0] = 1; }

which program 10 shows has the same results.

Program 10: Switching m And q

From a computational complexity perspective we are almost done but before we finish there's a numerical stability issue to address. Specifically, the sum
s = 0; for(r=c+1;r<n;++r) s += q[r][c]*q[r][c];

is susceptible to both overflow and underflow. Since both the numerator and denominator in the formula for \(\hat{a}\) are equally scaled if we multiply the elements of the matrix by a constant we are free to do so without changing the result. In particular, we can divide them by the sum of the absolute values of the lower elements a column, given by
z = 0; for(r=c+1;r<n;++r) z += Math.abs(q[r][c]);

If this sum is greater than zero then we update the off-diagonal elements of q with
s = 0; for(r=c+1;r<n;++r) { q[c][r] = q[r][c] /= z; s += q[r][c]*q[r][c]; }

as we accumulate s. This means that we need to multiply the off-diagonal elements of q by z as we calculate m with
m[c][c+1] = m[c+1][c] = q[c+1][c]*z;

as demonstrated by program 11.

Program 11: Numerical Considerations

Note that this doesn't apply to the last two by two sub-matrix since it already contains the correct values.
If z equals zero then the row and column are diagonal and we don't need to do anything so simply add one to it before copying the elements to m to ensure that they're not scaled. If it's NaN then adding one leaves it as it is and so we put NaNs in m.

Knowing that every element of the symmetric m above and below the off-diagonals of its final form are zero means that it is far less wasteful to keep them in a pair of one dimensional arrays rather than a two dimensional array that is mostly filled with zeros.
The householder function defined in listing 1 puts all of this together, saving rows of q in variables whilst looping over them and exploiting its symmetry to use the elements in its rows in place of those in its columns where possible, returning an object containing the matrix of Householder transformations in a two dimensional array q, the leading diagonal elements of the transformed matrix in an array d0 and its off-diagonal elements in d1.

Listing 1: The householder Function
function householder(q) {
 var n = q.length;
 var d0 = new Array(n);
 var d1 = new Array(n-1);
 var c, s, r, d, p, rr, cc, z, qc, qrr, q0;

 for(c=0;c<n-2;++c) {
  z = 0;
  qc = q[c];
  for(r=c+1;r<n;++r) z += Math.abs(qc[r]);

  if(z>0) {
   s = 0;
   for(r=c+1;r<n;++r) {
    qc[r] = q[r][c] /= z;
    s += qc[r]*qc[r];
   }
   s = Math.sqrt(s);
   if(qc[c+1]>0) s = -s;
   d = Math.sqrt(2*s*(s-qc[c+1]));
   qc[c+1] -= s;
   for(cc=c+1;cc<n;++cc) qc[cc] /= d;

   p = 0;
   for(cc=c+1;cc<n;++cc) p += q[cc][c]*qc[cc];
   q[c+1][c] -= 2*qc[c+1]*p;

   for(cc=c+1;cc<n;++cc) {
    p = 0;
    for(rr=c+1;rr<n;++rr) p += qc[rr]*q[rr][cc];
    for(rr=c+1;rr<n;++rr) q[rr][cc] -= 2*qc[rr]*p;
   }

   for(rr=c+1;rr<n;++rr) {
    qrr = q[rr];
    p = 0;
    for(cc=c+1;cc<n;++cc) p += qrr[cc]*qc[cc];
    for(cc=c+1;cc<n;++cc) qrr[cc] -= 2*qc[cc]*p;
   }
  }
  else {
   z += 1;
  }

  d0[c] = qc[c];
  d1[c] = q[c+1][c]*z;
 }

 d0[n-2] = q[n-2][n-2];
 d0[n-1] = q[n-1][n-1];
 d1[n-2] = q[n-1][n-2];

 for(cc=0;cc<n-1;++cc) q[n-1][cc] = 0;
 q[n-1][n-1] = 1;

 for(c=n-3;c>=0;--c) {
  for(cc=0;cc<n;++cc) q[c+1][cc] = 0;
  q[c+1][c+1] = 1;

  for(cc=c+1;cc<n;++cc) q[cc][c+1] -= 2*q[c][cc]*q[c][c+1];

  for(rr=c+2;rr<n;++rr) {
   p = 0;
   for(cc=c+1;cc<n;++cc) p += q[cc][rr]*q[c][cc];
   for(cc=c+1;cc<n;++cc) q[cc][rr] -= 2*q[c][cc]*p;
  }
 }
 q0 = q[0];
 for(cc=1;cc<n;++cc) q0[cc] = 0;
 q0[0] = 1;

 return {
  q:  q,
  d0: d0,
  d1: d1
 };
}

Program 12 demonstrates that the householder function behaves as expected, creating a tridiagonal matrix t from the elements of d0 and d1.

Program 12: Using The householder Function

It's worth noting at this point that it is more or less equivalent to the tred2 subroutine in EISPACK[3] which I suspect that most people, like me, first came across in its translation to C in Numerical Recipes[4] and which I have used for guidance whilst refactoring.
Next time we shall look to similarly improve the implementation of the Givens rotations to yield the spectral decomposition.

References

[1] FAO The Householder, www.thusspakeak.com, 2019.

[2] Spryer Francis, www.thusspakeak.com, 2019.

[3] Garbow, Burton S., EISPACK — A package of matrix eigensystem routines, Computer Physics Communications, Vol. 7, 1974

[4] Press, W.H. et al, Numerical Recipes in C (2nd ed.), Cambridge University Press, 1992.

Leave a comment