# You've Got Class, Mr Cholesky!

Last time we took a first look at the Cholesky decomposition[1], being the unique way to represent a positive definite symmetric square matrix $$\mathbf{M}$$ as the product of a lower triangular matrix $$\mathbf{L}$$ having strictly positive elements on its leading diagonal, and by definition zeroes everywhere above it, with its upper triangular transpose $$\mathbf{L}^\mathrm{T}$$, formed by switching the row and column indices of each element so that $$\mathbf{L}^\mathrm{T}_{ij} = \mathbf{L}_{ji}$$
$\mathbf{M} = \mathbf{L} \times \mathbf{L}^\mathrm{T}$
Finally we noted that, like the Jacobi decomposition[2], the Cholesky decomposition is useful for solving simultaneous equations, inverting matrices and so on, but left the implementation details to this post.

Before we get started, however, we should probably note that to prove that such a decomposition always exists we effectively derived an algorithm for finding it with the formulae
\begin{align*} \mathbf{L}_{00} &= \sqrt{\mathbf{M}_{00}}\\ \mathbf{L}_{i0} &= \frac{\mathbf{M}_{i0}}{\mathbf{L}_{00}} && \mathrm{for}\;i>0\\ \mathbf{L}_{jj} &= \sqrt{\mathbf{M}_{jj} - \sum_{k=0}^{j-1} \mathbf{L}_{jk} \times \mathbf{L}_{jk}} && \mathrm{for}\;j>0\\ \mathbf{L}_{ij} &= \frac{\mathbf{M}_{ij} - \sum_{k=0}^{j-1} \mathbf{L}_{ik} \times \mathbf{L}_{jk}}{\mathbf{L}_{jj}} && \mathrm{for}\;j>0, i>j\\ \mathbf{L}_{ij} &= 0 && \mathrm{for}\;j>0, i<j \end{align*}
where $$\sum$$ is the summation sign.

### A Cholesky Decomposition Class

The implementation of the ak.choleskyDecomposition class that we'll use to represent the Cholesky decomposition follows our usual scheme of using closures to keep its state private, as illustrated by listing 1.

Listing 1: ak.choleskyDecomposition
ak.CHOLESKY_DECOMPOSITION_T = 'ak.choleskyDecomposition';

function CholeskyDecomposition(){}
CholeskyDecomposition.prototype = {
TYPE: ak.CHOLESKY_DECOMPOSITION_T,
valueOf: function(){return ak.NaN;}
};

ak.choleskyDecomposition = function(arg) {
var cd    = new CholeskyDecomposition();
var state = {l:undefined};

constructors[ak.type(arg)](state, arg);

cd.l = function() {return state.l;};

cd.toString = function() {return '{l:'+state.l.toString()+'}';};

cd.toExponential = function(d) {return '{l:'+state.l.toExponential(d)+'}';};
cd.toFixed       = function(d) {return '{l:'+state.l.toFixed(d)+'}';};
cd.toPrecision   = function(d) {return '{l:'+state.l.toPrecision(d)+'}';};

return Object.freeze(cd);
};

var constructors = {};

Here we first define the type identifier ak.CHOLESKY_DECOMPOSITION_T and use it as value of the TYPE property of the CholeskyDecomposition class in preparation for our overloading mechanism, whilst explicitly disabling conversion to numbers by giving it a valueOf method that simply returns NaN.
In the body of ak.choleskyDecomposition, we dispatch to the constructors object to initialise the l member of its private state object with the $$\mathbf{L}$$ matrix of the decomposition, adding the method l to the resulting object to provide access to it. Finally, we add a toMatrix method to reconstruct the matrix $$\mathbf{M}$$ in addition to the standard, trivial, string conversion methods.

### ak.choleskyDecomposition Constructors

Given that there is no need for a convergence threshold, ak.choleskyDecomposition has relatively few constructors compared to ak.jacobiDecomposition, as shown in listing 2.

Listing 2: ak.choleskyDecomposition Constructors
constructors[ak.MATRIX_T] = function(state, m) {
if(m.rows()!==m.cols()) {
throw new Error('non-square matrix in ak.choleskyDecomposition');
}
state.l = isLower(m) && !isUpper(m) ? fromLower(m) : fromMatrix(m);
};

constructors[ak.OBJECT_T] = function(state, obj) {
var l = obj.l;

if(ak.nativeType(l)===ak.FUNCTION_T) l = l();
if(ak.type(l)!==ak.MATRIX_T || l.rows()!==l.cols() || !isLower(l)) {
throw new Error('invalid l in ak.choleskyDecomposition');
}
state.l = fromLower(l);
};

constructors[ak.CHOLESKY_DECOMPOSITION_T] = constructors[ak.OBJECT_T];

An important point to note is that we should like to be able to initialise an ak.choleskyDecomposition with either a matrix to be decomposed or a lower triangular matrix representing the single element of the decomposition but, given that they'd both be ak.matrix objects[3], there's no way that we can distinguish between them with our dispatch mechanism.
Instead, we have the helper functions isLower and isUpper, given in listing 3, to check at runtime in the ak.MATRIX_T constructor which type of initialisation is required, with the assumption that a diagonal matrix, being both lower and upper triangular, will not typically be intended to be interpreted as the element of the decomposition.

Listing 3: The isLower And isUpper Functions
function isLower(m) {
var n = m.rows();
var i, j;

for(i=0;i<n;++i) {
for(j=i+1;j<n;++j) {
if(m.at(i, j)!==0) return false;
}
}
return true;
}

function isUpper(m) {
var n = m.rows();
var i, j;

for(i=0;i<n;++i) {
for(j=0;j<i;++j) {
if(m.at(i, j)!==0) return false;
}
}
return true;
}

Here we rely upon the calling constructor to have checked that the matrix is square and simply assume that its number of rows equals its number of columns. That very slight subtlety noted, these functions simply search for a non-zero element with a column index greater than its row index and one with a column index less than its row index to determine if the matrix is not lower or upper triangular respectively.
Note that when constructing from another ak.choleskyDecomposition object, or any object supporting a similar interface, we require that the property l be a lower triangular square matrix, including diagonal matrices this time, and so we use the isLower function as part of the test to ensure that this is in fact the case.

Now in every case where we initialise the state with a lower triangular matrix, we do so by calling the fromLower helper function, given in listing 4.

Listing 4: The fromLower Function
function fromLower(l) {
var n = l.rows();
var i, j, x;

l = l.toArray();

for(i=0;i<n;++i) {
x = l[i][i];
if(x===0) throw new Error('invalid l in ak.choleskyDecomposition');
if(x<0) {
for(j=i;j<n;++j) l[j][i] = -l[j][i];
}
}
return ak.matrix(l);
}

This function ensures that the elements on the leading diagonal are strictly positive by throwing an error if any of them are zero and negate the entire column of any that are negative, exploiting the fact that doing so has no effect upon the product of $$\mathbf{L}$$ and $$\mathbf{L}^\mathrm{T}$$, as we proved last time.

The final case is initialisation with a non lower triangular matrix that is to be decomposed, which we do with the aid of the fromMatrix helper function given in listing 5.

Listing 5: The fromMatrix Function
function fromMatrix(m) {
var n = m.rows();
var i, j, k, mj, mi, s, x;

m = m.toArray();

for(j=0;j<n;++j) {
mj = m[j];
s = mj[j];
for(k=0;k<j;++k) s -= mj[k]*mj[k];
if(!(s>0)) {
throw new Error('non-positive definite matrix in ak.choleskyDecomposition');
}

s = Math.sqrt(s);
mj[j] = s;
for(i=j+1;i<n;++i) {
mi = m[i];
x = 0.5*(mi[j]+mj[i]);
for(k=0;k<j;++k) x -= mi[k]*mj[k];
mi[j] = x/s;
mj[i] = 0;
}
}
return ak.matrix(m);
}

The first thing to note about this function is that the formulae for constructing $$\mathbf{L}$$ from $$\mathbf{M}$$ are applied in-place. We can do this because the only element of $$\mathbf{M}$$ that we need is in the same place as the one that we are currently calculating for $$\mathbf{L}$$ and, provided that we calculate the columns of $$\mathbf{L}$$ from the left to the right and the elements within them from the diagonal to the bottom, all of the elements of $$\mathbf{L}$$ that we need are already in place. A cursory glance should convince you that this is exactly how this function proceeds!

The next thing to note is that we use the average of $$\mathbf{M}_{ij}$$ and $$\mathbf{M}_{ji}$$ for the off-diagonal elements of $$\mathbf{M}$$. The means that we are actually computing the Cholesky decomposition of the average of $$\mathbf{M}$$ and its transpose, just as we did for the Jacobi decomposition, which we do, and did, because this is guaranteed to be symmetric and so we can avoid testing for symmetry and the difficulties in doing so that might arise if rounding errors had accumulated in the matrix.

Finally, we're throwing an error if the terms inside the square roots in the formulae are not greater than zero rather than less than or equal to zero. This means that we're treating NaNs as less than or equal to zero, something that we didn't do when directly initialising the decomposition with lower triangular matrices. We do this here because the square roots of negative numbers and the square root of NaN are all NaN and so this makes the calculation internally consistent. We didn't do it before because if the user wants NaNs on the leading diagonal then that's their own damned business!

Program 1 demonstrates that ak.choleskyDecomposition correctly recovers a lower triangular matrix from its product with its transpose.

Program 1: Constructing The Cholesky Decomposition

### Reconstructing M

To recover the matrix that the Cholesky decomposition represents
$\mathbf{M} = \mathbf{L} \times \mathbf{L}^\mathrm{T}$
we can exploit the fact that $$\mathbf{L}$$ is lower triangular and $$\mathbf{M}$$ is symmetric with
$\mathbf{M}_{ij} = \mathbf{M}_{ji} = \sum_{k=0}^{n-1} \mathbf{L}_{ik} \times \mathbf{L}^\mathrm{T}_{kj} = \sum_{k=0}^{n-1} \mathbf{L}_{ik} \times \mathbf{L}_{jk} = \sum_{k=0}^{j} \mathbf{L}_{ik} \times \mathbf{L}_{jk} \quad \mathrm{for} \; j \leqslant i$
since the elements of $$\mathbf{L}$$ whose column indices are greater than their row indices are, by definition, equal to zero. Reconstructing $$\mathbf{M}$$ in this way requires approximately one sixth of the arithmetic operations that naively multiplying $$\mathbf{L}$$ and its transpose does and we use it in the implementation of the toMatrix method, as shown by listing 6.

Listing 6: The toMatrix Function
function toMatrix(l) {
var n = l.rows();
var m = new Array(n);
var i, j, k, s;

for(i=0;i<n;++i) m[i] = new Array(n);

for(i=0;i<n;++i) {
for(j=0;j<=i;++j) {
s = 0;
for(k=0;k<=j;++k) s += l.at(i, k)*l.at(j, k);
m[i][j] = m[j][i] = s;
}
}
return ak.matrix(m);
}

The full set of overloads for ak.choleskyDecomposition is given in listing 7.

var COMPLEX_T = 'ak.complex';
var COMPLEX_VECTOR_T = 'ak.complexVector';
var COMPLEX_MATRIX_T = 'ak.complexMatrix';

Note that we're using the trick that we first used for ak.complexVector[4] to support overloads for types whose implementations we don't want to implicitly load along with that of ak.choleskyDecomposition, in this case division involving complex numbers, vectors and matrices, by defining new variables containing their type strings so that the overload mechanism can dispatch to appropriate helper functions and relying upon the user to include both the implementation of this class and those of any of the complex types that they wish to use so that they will be defined by the time that those overloads might be called.

The simplest of these overloads is for the determinant, ak.det, which we proved last time is simply the product of the squares of the terms on the leading diagonal of $$\mathrm{L}$$
$|\mathbf{M}| = \prod_{i=0}^{n-1} \mathbf{L}_{ii}^2$
where $$\prod$$ is the product sign.
Listing 8 gives its implementation, which applies the slight optimisation of squaring the product of the leading diagonal elements rather than computing the product of their squares

Listing 8: The Determinant
function det(d) {
var l = d.l();
var n = l.rows();
var x = 1;
var i;

for(i=0;i<n;++i) x *= l.at(i, i);
return x*x;
}

Program 2 compares the calculation of the determinant using the Cholesky decomposition of a matrix with that using the matrix itself.

Program 2: Computing The Determinant

To invert $$\mathbf{M}$$ using the Cholesky decomposition, we first note that
$\mathbf{M}^{-1} = \left(\mathbf{L} \times \mathbf{L}^\mathrm{T}\right)^{-1} = \mathbf{L}^{\mathrm{T}^{-1}} \times \mathbf{L}^{-1} = \mathbf{U}^{-1} \times \mathbf{U}^{\mathrm{T}^{-1}}$
Next, the inverse of an upper triangular matrix $$\mathbf{U}$$ is itself upper triangular with elements defined by
\begin{align*} \mathbf{U}^{-1}_{ii} &= \frac{1}{\mathbf{U}_{ii}}\\ \mathbf{U}^{-1}_{ij} &= -\frac{\sum_{k=i}^{j-1} \mathbf{U}^{-1}_{ik} \times \mathbf{U}_{kj}}{\mathbf{U}_{jj}} \quad \mathrm{for} \; i < j \end{align*}
as proven in derivation 1.

Derivation 1: The Inverses Of Upper Triangular Matrices
First, let us simply assume that the inverse of an upper triangular matrix $$\mathbf{U}$$ is itself upper triangular so that we have
\begin{align*} \left(\mathbf{U}^{-1} \times \mathbf{U}\right)_{ij} &= \sum_{k=0}^{n-1} \mathbf{U}^{-1}_{ik} \times \mathbf{U}_{kj} = \sum_{k=i}^{j} \mathbf{U}^{-1}_{ik} \times \mathbf{U}_{kj} && \mathrm{for} \; i \leqslant j\\ \left(\mathbf{U}^{-1} \times \mathbf{U}\right)_{ij} &= 0 && \mathrm{for} \; i > j \end{align*}
since the elements of upper triangular matrices whose row indices are greater than their column indices are, by definition, zero. Furthermore, the products of matrices and their inverses must be the identity matrix, and so we require that
\begin{align*} \left(\mathbf{U}^{-1} \times \mathbf{U}\right)_{ii} &= 1\\ \left(\mathbf{U}^{-1} \times \mathbf{U}\right)_{ij} &= 0 \quad \mathrm{for} \; i < j \end{align*}
Next, rearranging the formula for the elements of the product of the $$\mathbf{U}^{-1}$$ and $$\mathbf{U}$$ yields
\begin{align*} \mathbf{U}^{-1}_{ii} &= \frac{1}{\mathbf{U}_{ii}}\\ \mathbf{U}^{-1}_{ij} &= -\frac{\sum_{k=i}^{j-1} \mathbf{U}^{-1}_{ik} \times \mathbf{U}_{kj}}{\mathbf{U}_{jj}} \quad \mathrm{for} \; i < j \end{align*}
Finally, the only way that these formulae can break down is if any of the elements on the leading diagonal of $$\mathbf{U}$$ are zero, in which case its determinant, being equal to the product of those elements, must also be equal to zero, meaning that it doesn't have an inverse. Therefore every invertible upper triangle matrix has an upper triangular inverse.

Finally, the elements of the product of an upper triangular matrix $$\mathbf{U}$$ and its transpose are given by
$\left(\mathbf{U} \times \mathbf{U}^\mathrm{T}\right)_{ij} = \sum_{k=0}^{n-1} \mathbf{U}_{ik} \times \mathbf{U}^\mathrm{T}_{kj} = \sum_{k=0}^{n-1} \mathbf{U}_{ik} \times \mathbf{U}_{jk} = \sum_{k=j}^{n-1} \mathbf{U}_{ik} \times \mathbf{U}_{jk} \quad \mathrm{for} \; i \leqslant j$
Now, the product of any matrix with its transpose is symmetric, so we have
$\left(\mathbf{U} \times \mathbf{U}^\mathrm{T}\right)_{ij} = \left(\mathbf{U} \times \mathbf{U}^\mathrm{T}\right)_{ji} = \sum_{k=j}^{n-1} \mathbf{U}_{ik} \times \mathbf{U}_{jk} \quad \mathrm{for} \; i \leqslant j$
Note that we can also apply this formula in-place since for each element of the product we only need elements of $$\mathbf{U}$$ in rows after it and columns not before it and, once again, it should be fairly obvious that this is exactly how the inv function, given in listing 9, proceeds.

Listing 9: The Inverse
function invArray(d) {
var l = d.l();
var n = l.rows();
var m = new Array(n);
var i, j, k, mi, mj, s;

for(i=0;i<n;++i) {
mi = new Array(n);
mi[i] = 1 / l.at(i, i);

for(j=i+1;j<n;++j) {
s = 0;
for(k=i;k<j;++k) s += mi[k] * l.at(j, k);
mi[j] = -s / l.at(j, j);
}
m[i] = mi;
}

for(i=0;i<n;++i) {
mi = m[i];
for(j=i;j<n;++j) {
mj = m[j];

s = 0;
for(k=j;k<n;++k) s += mi[k] * mj[k];
mi[j] = mj[i] = s;
}
}

return m;
}

function inv(d) {
return ak.matrix(invArray(d));
}

Program 3 demonstrates that this does indeed compute the inverse by showing that its product with the original matrix is the identity matrix.

Program 3: Computing The Inverse

Now the reason that we use the invArray helper function to perform the actual calculation is so that we can also use it for dividing a number by a matrix, which is equivalent to multiplying its inverse by that number as is done by the divRD function given in listing 10.

Listing 10: Dividing A Number By A Cholesky Decomposition
function divRD(r, d) {
var m = invArray(d);
var n = m.length;
var i, j, mi;

for(i=0;i<n;++i) {
mi = m[i];
for(j=0;j<n;++j) mi[j] *= r;
}
return ak.matrix(m);
}

Dividing a vector $$\mathbf{c}$$ by a Cholesky decomposition means finding the vector $$\mathbf{x}$$ that satisfies
$\mathbf{L} \times \mathbf{L}^\mathrm{T} \times \mathbf{x} = \mathbf{c}$
As noted last time, we can do this in two steps with
\begin{align*} \mathbf{L} \times \mathbf{y} &= \mathbf{c}\\ \mathbf{L}^\mathrm{T} \times \mathbf{x} &= \mathbf{y} \end{align*}
from which we can deduce the formulae
\begin{align*} \mathbf{y}_0 &= \frac{\mathbf{c}_0}{\mathbf{L}_{00}}\\ \mathbf{y}_i &= \frac{\mathbf{c}_i - \sum_{k=0}^{i-1} \mathbf{L}_{ik} \times \mathbf{y}_k}{\mathbf{L}_{ii}} && \mathrm{for} \; i > 0\\ \mathbf{x}_{n-1} &= \frac{\mathbf{y}_{n-1}}{\mathbf{L}_{(n-1) (n-1)}}\\ \mathbf{x}_i &= \frac{\mathbf{y}_i - \sum_{k=i+1}^{n-1} \mathbf{L}_{ki} \times \mathbf{x}_k}{\mathbf{L}_{ii}} && \mathrm{for} \; i < n-1 \end{align*}
as proven in derivation 2.

Derivation 2: Solving For x
For the first step we have
$\left(\mathbf{L} \times \mathbf{y}\right)_i = \sum_{k=0}^{n-1} \mathbf{L}_{ik} \times \mathbf{y}_k = \sum_{k=0}^{i} \mathbf{L}_{ik} \times \mathbf{y}_k = \mathbf{c}_i$
since the elements of $$\mathbf{L}$$ whose column indices are greater than their row indices are equal to zero, and consequently
\begin{align*} \mathbf{y}_0 &= \frac{\mathbf{c}_0}{\mathbf{L}_{00}}\\ \mathbf{y}_i &= \frac{\mathbf{c}_i - \sum_{k=0}^{i-1} \mathbf{L}_{ik} \times \mathbf{y}_k}{\mathbf{L}_{ii}} \quad \mathrm{for} \; i > 0 \end{align*}
For the second step we have
$\left(\mathbf{L}^\mathrm{T} \times \mathbf{x}\right)_i = \sum_{k=0}^{n-1} \mathbf{L}^\mathrm{T}_{ik} \times \mathbf{x}_k = \sum_{k=0}^{n-1} \mathbf{L}_{ki} \times \mathbf{x}_k = \sum_{k=i}^{n-1} \mathbf{L}_{ki} \times \mathbf{x}_k = \mathbf{y}_i$
for the same reason, and so
\begin{align*} \mathbf{x}_{n-1} &= \frac{\mathbf{y}_{n-1}}{\mathbf{L}_{(n-1)(n-1)}}\\ \mathbf{x}_i &= \frac{\mathbf{y}_i - \sum_{k=i+1}^{n-1} \mathbf{L}_{ki} \times \mathbf{x}_k}{\mathbf{L}_{ii}} \quad \mathrm{for} \; i < n-1 \end{align*}

These formulae can also be applied in-place since the elements of $$\mathbf{y}$$ don't depend upon the elements of either itself or of $$\mathbf{c}$$ that come after them and the elements of $$\mathbf{x}$$ don't depend upon the elements of either itself or of $$\mathbf{y}$$ that come before them, as exploited by the divVD function given in listing 11.

Listing 11: Dividing A Vector By A Cholesky Decomposition
function divVD(v, d) {
var l = d.l();
var n = l.rows();
var i, k, s;

if(v.dims()!==n) {
throw new Error('dimensions mismatch in ak.choleskyDecomposition div');
}
v = v.toArray();

for(i=0;i<n;++i) {
s = 0;
for(k=0;k<i;++k) s += l.at(i, k) * v[k];
v[i] = (v[i]-s) / l.at(i, i);
}

for(i=n-1;i>=0;--i) {
s = 0;
for(k=i+1;k<n;++k) s += l.at(k, i) * v[k];
v[i] = (v[i]-s) / l.at(i, i);
}

return ak.vector(v);
}

Program 4 demonstrates that division of a vector $$\mathbf{v}$$ by the Cholesky decomposition of a matrix $$\mathbf{M}$$ yields a vector $$\mathbf{x}$$ that satisfies
$\mathbf{M} \times \mathbf{x} = \mathbf{v}$
Program 4: Solving Simultaneous Equations

Now the division of a matrix by a Cholesky decomposition proceeds in exactly the same fashion, acting upon each column of the matrix in turn, as is done by the divMD function given in listing 12.

Listing 12: Dividing A Matrix By A Cholesky Decomposition
function divMD(m, d) {
var l = d.l();
var n = l.rows();
var c = m.cols();
var i, j, k, s;

if(m.rows()!==n) {
throw new Error('dimensions mismatch in ak.choleskyDecomposition div');
}
m = m.toArray();

for(j=0;j<c;++j) {
for(i=0;i<n;++i) {
s = 0;
for(k=0;k<i;++k) s += l.at(i, k) * m[k][j];
m[i][j] = (m[i][j]-s) / l.at(i, i);
}

for(i=n-1;i>=0;--i) {
s = 0;
for(k=i+1;k<n;++k) s += l.at(k, i) * m[k][j];
m[i][j] = (m[i][j]-s) / l.at(i, i);
}
}

return ak.matrix(m);
}

As you might expect, the division operators involving complex numbers, vectors and matrices simply require that we use ak overloaded arithmetic operators rather than JavaScript's native operators, as illustrated for division of complex vectors by Cholesky decompositions with the divZD function given in listing 13.

Listing 13: Dividing A Complex Vector By A Cholesky Decomposition
function divZD(z, d) {
var l = d.l();
var n = l.rows();
var i, k, s;

if(z.dims()!==n) {
throw new Error('dimensions mismatch in ak.choleskyDecomposition div');
}
z = z.toArray();

for(i=0;i<n;++i) {
s = 0;
for(k=0;k<i;++k) s = ak.add(s, ak.mul(l.at(i, k), z[k]));
z[i] = ak.div(ak.sub(z[i], s), l.at(i, i));
}

for(i=n-1;i>=0;--i) {
s = 0;
for(k=i+1;k<n;++k) s = ak.add(s, ak.mul(l.at(k, i), z[k]));
z[i] = ak.div(ak.sub(z[i], s), l.at(i, i));
}

return ak.complexVector(z);
}

You can find the implementation of division involving complex numbers and matrices along with the rest of the implementation of the ak.choleskyDecomposition class in CholeskyDecomposition.js.

### References

[1] Are You Definitely Positive, Mr Cholesky?, www.thusspakeak.com, 2015.

[2] Conquering The Eigen, www.thusspakeak.com, 2014.

[3] The Matrix Recoded, www.thusspakeak.com, 2014.

[4] What Are You Pointing At?, www.thusspakeak.com, 2015.