Are You Definitely Positive, Mr Cholesky?


Some time ago, we took a look at the Jacobi decomposition[1], which is used to find the eigensystems of real symmetric matrices, being, for a matrix \(\mathbf{M}\), those vectors \(\mathbf{v}_i\) and numbers \(\lambda_i\) that satisfy
\[ \mathbf{M} \times \mathbf{v}_i = \lambda_i \times \mathbf{v}_i \]
We saw that the eigenvectors of real symmetric matrices are orthogonal, or in other words at right angles to each other, and that we could construct a matrix \(\mathbf{V}\) whose columns are those eigenvectors and another \(\mathbf{\Lambda}\) whose diagonal elements are their associated eigenvalues and are everywhere else zero, from which we could reconstruct the matrix with
\[ \mathbf{M} = \mathbf{V} \times \mathbf{\Lambda} \times \mathbf{V}^\mathrm{T} \]
where the \(\mathrm{T}\) is the transpose operator which exchanges the rows with the columns of the matrix that it is a superscript of.
This representation of a matrix turned out to have some tremendously useful consequences, such as dramatically simplifying both the solution of simultaneous linear equations and the calculation of functions of matrices like the inverse and, crucially in the context of this post, the square root.

Now, before I get to why the latter is so crucial, we shall need a refresher on the Gaussian elimination algorithm that we used to implement the inverse function for our ak.matrix class[2] and that I unforgivably glossed over when showing its implementation for our ak.complexMatrix last time[3].

Eliminate! Eliminate!

Perhaps the easiest way to explain the Gaussian elimination algorithm is to use it to solve a set of simultaneous linear equations such as
\[ \begin{align*} m_{00} \times x_0 + m_{01} \times x_1 + m_{02} \times x_2 &= c_0\quad\quad(1)\\ m_{10} \times x_0 + m_{11} \times x_1 + m_{12} \times x_2 &= c_1\quad\quad(2)\\ m_{20} \times x_0 + m_{21} \times x_1 + m_{22} \times x_2 &= c_2\quad\quad(3) \end{align*} \]
where the \(x_i\) terms are unknown and the remaining terms are known.

If we multiply all of the terms in equation \((1)\) by \(m_{10}\) and those in equation \((2)\) by \(m_{00}\) we get two new equations
\[ \begin{align*} m_{10} \times m_{00} \times x_0 + m_{10} \times m_{01} \times x_1 + m_{10} \times m_{02} \times x_2 &= m_{10} \times c_0\quad\quad(4)\\ m_{00} \times m_{10} \times x_0 + m_{00} \times m_{11} \times x_1 + m_{00} \times m_{12} \times x_2 &= m_{00} \times c_1\quad\quad(5) \end{align*} \]
Subtracting equation \((5)\) from equation \((4)\) yields
\[ \left(m_{10} \times m_{01} - m_{00} \times m_{11}\right) \times x_1 + \left(m_{10} \times m_{02} - m_{00} \times m_{12}\right) \times x_2 = m_{10} \times c_0 - m_{00} \times c_1\quad\quad(6) \]
Repeating this process with equations \((2)\) and \((3)\) yields
\[ \left(m_{20} \times m_{11} - m_{10} \times m_{21}\right) \times x_1 + \left(m_{20} \times m_{12} - m_{10} \times m_{22}\right) \times x_2 = m_{20} \times c_1 - m_{10} \times c_2\quad\quad(7) \]
which, given that we know the values of the terms \(m_{ij}\) and \(c_i\), we might rewrite as
\[ \begin{align*} n_{01} \times x_1 + n_{02} \times x_2 &= d_1\quad\quad(8)\\ n_{11} \times x_1 + n_{12} \times x_2 &= d_2\quad\quad(9) \end{align*} \]
eliminating \(x_0\) from our unknown terms.

Finally, if we subtract the product of equation \((9)\) and \(n_{01}\) from that of equation \((8)\) and \(n_{11}\), we get
\[ \left(n_{11} \times n_{02} - n_{01} \times n_{12}\right) \times x_2 = n_{11} \times d_1 - n_{01} \times d_2\quad\quad(10) \]
which we can rewrite as
\[ o_{02} \times x_2 = e_2\quad\quad(11) \]
eliminating \(x_1\) from our unknown terms too!

Putting equations \((1)\), \((8)\) and \((11)\) together we have
\[ \begin{align*} m_{00} \times x_0 + m_{01} \times x_1 + m_{02} \times x_2 &= c_0\quad\quad(1)\\ n_{01} \times x_1 + n_{02} \times x_2 &= d_1\quad\quad(8)\\ o_{02} \times x_2 &= e_2\quad\quad(11) \end{align*} \]
and equation \((11)\) makes it obvious that
\[ x_2 = \frac{e_2}{o_{02}} \]
Having found the correct value of \(x_2\) we can find that of \(x_1\) by rearranging equation \((8)\)
\[ x_1 = \frac{d_1 - n_{02} \times x_2}{n_{01}} \]
and finally, furnished as we now are with the values of both \(x_1\) and \(x_2\), the value of \(x_0\) is revealed by rearranging equation \((1)\)
\[ x_0 = \frac{c_0 - m_{01} \times x_1 - m_{02} \times x_2}{m_{00}} \]
Note that the terms in equations \((1)\), \((2)\) and \((3)\) were named specifically to reflect the fact that they can be represented as a single equation involving a square matrix \(\mathbf{M}\) and vectors \(\mathbf{x}\) and \(\mathbf{c}\)
\[ \mathbf{M} \times \mathbf{x} = \mathbf{c} \]
To turn the solution of such equations into an algorithm for inverting matrices simply requires solving the series of equations
\[ \mathbf{M} \times \mathbf{x}_i = \mathbf{c}_i \]
where each \(\mathbf{c}_i\) is a vector whose elements are all equal to zero, except for the \(i^\mathrm{th}\) which equals one, or in other words is the \(i^\mathrm{th}\) column of the identity matrix \(\mathbf{I}\), and consequently each resulting \(\mathbf{x}_i\) is the \(i^\mathrm{th}\) column of the inverse of \(\mathbf{M}\), denoted \(\mathbf{M}^{-1}\) and satisfying
\[ \mathbf{M} \times \mathbf{M}^{-1} = \mathbf{M}^{-1} \times \mathbf{M} = \mathbf{I} \]
This is an admittedly rather naive implementation of Gaussian elimination, given that it manipulates the equations in the order in which they were written down rather than in the order that minimises the possibility of numerical errors, but it does at least illustrate the basic idea.
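That naive scheme might be sketched in JavaScript as follows; note that gaussianSolve and gaussianInverse are illustrative names of my own rather than members of the ak library, and that, as noted above, no attempt is made to reorder the equations for numerical stability.

```javascript
// A sketch of naive Gaussian elimination: eliminate each unknown in turn,
// then sweep backwards to recover their values, exactly as in the worked
// example above. m is an array of rows; c is the right hand side vector.
function gaussianSolve(m, c) {
  const n = c.length;
  const a = m.map(row => row.slice()); // working copies so that the
  const b = c.slice();                 // arguments are left untouched

  // forward elimination: zero the elements below the leading diagonal
  for (let j = 0; j < n - 1; ++j) {
    for (let i = j + 1; i < n; ++i) {
      const f = a[i][j] / a[j][j];     // naive: no pivoting
      for (let k = j; k < n; ++k) a[i][k] -= f * a[j][k];
      b[i] -= f * b[j];
    }
  }

  // back substitution: find the last unknown first and sweep backwards,
  // moving known terms over to the right hand side as we go
  const x = new Array(n);
  for (let i = n - 1; i >= 0; --i) {
    let s = b[i];
    for (let k = i + 1; k < n; ++k) s -= a[i][k] * x[k];
    x[i] = s / a[i][i];
  }
  return x;
}

// inverting a matrix by solving M × x_i = c_i for each column c_i of the
// identity matrix, the x_i being the columns of the inverse
function gaussianInverse(m) {
  const n = m.length;
  const inv = m.map(row => row.map(() => 0));
  for (let i = 0; i < n; ++i) {
    const c = new Array(n).fill(0);
    c[i] = 1;
    const x = gaussianSolve(m, c);
    for (let j = 0; j < n; ++j) inv[j][i] = x[j];
  }
  return inv;
}
```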

Always With The Triangles, Your Solution To Everything Is Triangles!

Now obviously solving such matrix-vector equations would be massively simplified if they were already in the form of equations \((1)\), \((8)\) and \((11)\), or in other words if all of their elements below the leading diagonal, being those whose row indices are greater than their column indices, were equal to zero.
Such matrices are known as upper triangular matrices since their non-zero elements take the form of a triangle in their top-right. Similarly, matrices whose elements above the leading diagonal are all equal to zero are known as lower triangular matrices.
For example
\[ \begin{align*} \mathbf{U} &= \begin{pmatrix} a & b & c\\ 0 & d & e\\ 0 & 0 & f \end{pmatrix} \\ \\ \mathbf{L} &= \begin{pmatrix} a & 0 & 0\\ b & d & 0\\ c & e & f \end{pmatrix} \end{align*} \]
shows an upper triangular matrix \(\mathbf{U}\) and a lower triangular matrix \(\mathbf{L}\) that are the transposes of each other.

Derivation 1: The Determinants Of Triangular Matrices
Given an \(n\) by \(n\) matrix \(\mathbf{M}\) and defining \(\mathbf{M}^\prime_i\) to be the \(n-1\) by \(n-1\) matrix that is the result of removing the first row and the \(i^\mathrm{th}\) column from \(\mathbf{M}\), the determinant of \(\mathbf{M}\) is equal to
\[ |\mathbf{M}| = \sum_{i=0}^{n-1} (-1)^i \times \mathbf{M}_{0i} \times |\mathbf{M}^\prime_i| \]
where \(\sum\) is the summation sign.

For a lower triangular matrix, the only non-zero element in the first row is that in the first column and the result of removing the first row and the first column is also a lower triangular matrix.
Since the determinant of a one by one matrix is equal to its only element, recursively applying the above formula to a lower triangular matrix is equivalent to multiplying the elements on its leading diagonal.

Since the determinants of a matrix and its transpose are equal, the same is trivially true for upper triangular matrices.
An important property of such matrices is that their determinants are equal to the product of the elements on their leading diagonals, as shown by derivation 1.
This also means that those elements are the matrix's eigenvalues \(\lambda_i\) since for their associated eigenvectors \(\mathbf{v}_i\) we have
\[ \begin{align*} \mathbf{M} \times \mathbf{v}_i &= \lambda_i \times \mathbf{v}_i\\ \mathbf{M} \times \mathbf{v}_i - \lambda_i \times \mathbf{v}_i &= \mathbf{0}\\ \left(\mathbf{M} - \lambda_i \times \mathbf{I}\right) \times \mathbf{v}_i &= \mathbf{0}\\ \left|\mathbf{M} - \lambda_i \times \mathbf{I}\right| &= 0 \end{align*} \]
If \(\mathbf{M}\) is triangular, then so is \(\mathbf{M} - \lambda_i \times \mathbf{I}\) and we have
\[ \left|\mathbf{M} - \lambda_i \times \mathbf{I}\right| = \prod_j \left(\mathbf{M}_{jj} - \lambda_i\right) \]
where \(\prod\) is the product sign and which is zero if and only if \(\lambda_i\) is equal to one of the elements on the leading diagonal of \(\mathbf{M}\).
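As a quick numerical sanity check of derivation 1, we can expand a determinant recursively along the first row, as its formula does, and confirm that for a triangular matrix the result agrees with the product of the diagonal elements; the det function here is a throwaway of my own and certainly not an efficient way to calculate determinants in general.

```javascript
// The determinant by recursive expansion along the first row, following
// the formula in derivation 1: |M| = sum over i of (-1)^i × M[0][i] × |M'_i|
// where M'_i removes the first row and the i-th column.
function det(m) {
  const n = m.length;
  if (n === 1) return m[0][0];          // base case: a one by one matrix
  let d = 0;
  for (let i = 0; i < n; ++i) {
    if (m[0][i] === 0) continue;        // skip terms multiplied by zero
    const minor = m.slice(1).map(row => row.filter((_, j) => j !== i));
    d += (i % 2 === 0 ? 1 : -1) * m[0][i] * det(minor);
  }
  return d;
}
```

For an upper triangular matrix such as [[2,3,4],[0,5,6],[0,0,7]] this yields 2 × 5 × 7, the product of the leading diagonal, as derivation 1 requires.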

Now, just as we can solve equations of the form
\[ \mathbf{U} \times \mathbf{x} = \mathbf{c} \]
by directly finding the last element of \(\mathbf{x}\) and then sweeping back to the first, moving the terms that we've already found over to the right hand side at each step, so we can solve equations of the form
\[ \mathbf{L} \times \mathbf{x} = \mathbf{c} \]
by directly finding the first element of \(\mathbf{x}\) and then sweeping forward to the last, moving known terms over to the right hand side as we go.

Putting these together means that if we have an equation of the form
\[ \mathbf{L} \times \mathbf{L}^\mathrm{T} \times \mathbf{x} = \mathbf{c} \]
then we can solve for \(\mathbf{x}\) in two steps with
\[ \begin{align*} \mathbf{L} \times \mathbf{y} &= \mathbf{c}\\ \mathbf{L}^\mathrm{T} \times \mathbf{x} &= \mathbf{y} \end{align*} \]
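These two sweeps are simple enough to sketch directly; forwardSub and backSubTransposed are names of my own choosing, and the latter reads the elements of \(\mathbf{L}^\mathrm{T}\) straight out of \(\mathbf{L}\) rather than explicitly transposing it.

```javascript
// Solves L × y = c for lower triangular L by finding the first element of
// y and sweeping forward, moving known terms to the right hand side.
function forwardSub(l, c) {
  const n = c.length;
  const y = new Array(n);
  for (let i = 0; i < n; ++i) {
    let s = c[i];
    for (let k = 0; k < i; ++k) s -= l[i][k] * y[k];
    y[i] = s / l[i][i];
  }
  return y;
}

// Solves Lᵀ × x = y by finding the last element of x and sweeping
// backwards; the (i,k)-th element of Lᵀ is simply l[k][i].
function backSubTransposed(l, y) {
  const n = y.length;
  const x = new Array(n);
  for (let i = n - 1; i >= 0; --i) {
    let s = y[i];
    for (let k = i + 1; k < n; ++k) s -= l[k][i] * x[k];
    x[i] = s / l[i][i];
  }
  return x;
}
```

Applying forwardSub and then backSubTransposed performs the two step solution of \(\mathbf{L} \times \mathbf{L}^\mathrm{T} \times \mathbf{x} = \mathbf{c}\) given above.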
You might be forgiven for thinking that this isn't much of an improvement over the two steps that we needed for Gaussian elimination, the first to put the equations into an upper triangular form and the second to find the unknown values by sweeping backwards through them, putting in the values of those that we have already found as we go.

There is a crucial difference, however.

Since \(\mathbf{L}\) is already triangular, there can be no question whatsoever about what might be the most numerically stable sequence of operations to transform it into a triangular matrix; just leave it as it is!
Now, if you've spotted the massive unspoken assumption that I've just made then I congratulate you, because it's a rather subtle one!

But How Many Solutions Are There?

Derivation 2: The Possible Values Of L
If we define
\[ \mathbf{M} = \mathbf{L} \times \mathbf{L}^\mathrm{T} \]
then by the rules of matrix multiplication and the definition of the transpose, we have for \(n\) by \(n\) \(\mathbf{L}\)
\[ \mathbf{M}_{ij} = \sum_{k=0}^{n-1} \mathbf{L}_{ik} \times \mathbf{L}^\mathrm{T}_{kj} = \sum_{k=0}^{n-1} \mathbf{L}_{ik} \times \mathbf{L}_{jk} \]
and so \(\mathbf{M}\) is trivially a symmetric matrix where
\[ \mathbf{M}_{ij} = \mathbf{M}_{ji} \]
and so we need only calculate its elements for which \(i\) is greater than or equal to \(j\) to recover them all.

Now, if \(\mathbf{L}\) is lower triangular then its elements that have column indices greater than their row indices are equal to zero and so
\[ \mathbf{M}_{ij} = \sum_{k=0}^j \mathbf{L}_{ik} \times \mathbf{L}_{jk} + \sum_{k=j+1}^{n-1} \mathbf{L}_{ik} \times 0 = \sum_{k=0}^j \mathbf{L}_{ik} \times \mathbf{L}_{jk} \]
From this we obtain
\[ \begin{align*} \mathbf{M}_{00} &= \mathbf{L}_{00} \times \mathbf{L}_{00}\\ \mathbf{L}_{00} &= \pm \sqrt{\mathbf{M}_{00}} \end{align*} \]
and, for \(i\) greater than zero
\[ \begin{align*} \mathbf{M}_{i0} &= \mathbf{L}_{i0} \times \mathbf{L}_{00}\\ \mathbf{L}_{i0} &= \frac{\mathbf{M}_{i0}}{\mathbf{L}_{00}} \end{align*} \]
and so the first column of \(\mathbf{L}\) is entirely determined by the first column of \(\mathbf{M}\), with the exception that all of its elements may be multiplied by minus one.

Similarly, for \(j\) greater than zero, we have
\[ \begin{align*} \mathbf{M}_{jj} &= \sum_{k=0}^j \mathbf{L}_{jk} \times \mathbf{L}_{jk} = \sum_{k=0}^{j-1} \mathbf{L}_{jk} \times \mathbf{L}_{jk} + \mathbf{L}_{jj} \times \mathbf{L}_{jj}\\ \mathbf{L}_{jj} &= \pm \sqrt{\mathbf{M}_{jj} - \sum_{k=0}^{j-1} \mathbf{L}_{jk} \times \mathbf{L}_{jk}} \end{align*} \]
and, for \(i\) greater than \(j\)
\[ \begin{align*} \mathbf{M}_{ij} &= \sum_{k=0}^j \mathbf{L}_{ik} \times \mathbf{L}_{jk} = \sum_{k=0}^{j-1} \mathbf{L}_{ik} \times \mathbf{L}_{jk} + \mathbf{L}_{ij} \times \mathbf{L}_{jj}\\ \mathbf{L}_{ij} &= \frac{\mathbf{M}_{ij} - \sum_{k=0}^{j-1} \mathbf{L}_{ik} \times \mathbf{L}_{jk}}{\mathbf{L}_{jj}} \end{align*} \]
Note that for both \(\mathbf{L}_{jj}\) and \(\mathbf{L}_{ij}\), elements from columns of \(\mathbf{L}\) prior to the \(j^\mathrm{th}\) are always multiplied by another from the same column and so multiplying every element in a column by minus one will have no effect upon the values of elements in later columns.
Consequently, it is also the case that every element in the \(j^\mathrm{th}\) column is entirely determined by elements of \(\mathbf{M}\), with the exception that they may all be multiplied by minus one.
Whilst it is certainly true that we can't improve the numerical stability of either of the two steps taken on their own, we also need to take into consideration the fact that there are two steps.
Specifically, we must determine whether or not it is possible to find another lower triangular matrix \(\mathbf{J}\) such that
\[ \mathbf{J} \times \mathbf{J}^\mathrm{T} = \mathbf{L} \times \mathbf{L}^\mathrm{T} \]
and for which the solution of the equations
\[ \begin{align*} \mathbf{J} \times \mathbf{y} &= \mathbf{c}\\ \mathbf{J}^\mathrm{T} \times \mathbf{x} &= \mathbf{y} \end{align*} \]
is more numerically stable than that of the originals.

The first of these conditions is trivially met if we multiply every element in one or more columns by minus one, which we can do with
\[ \mathbf{J} = \mathbf{L} \times \mathbf{I}^\prime \]
where \(\mathbf{I}^\prime\) is a diagonal matrix whose elements are zero everywhere except the leading diagonal where they are plus or minus one.
This follows from the fact that such matrices are square roots of the identity matrix \(\mathbf{I}\) and so
\[ \begin{align*} \mathbf{J} \times \mathbf{J}^\mathrm{T} &= \mathbf{L} \times \mathbf{I}^\prime \times \left(\mathbf{L} \times \mathbf{I}^\prime\right)^\mathrm{T}\\ &= \mathbf{L} \times \mathbf{I}^\prime \times \mathbf{I}^{\prime \mathrm{T}} \times \mathbf{L}^\mathrm{T}\\ &= \mathbf{L} \times \mathbf{I}^\prime \times \mathbf{I}^\prime \times \mathbf{L}^\mathrm{T}\\ &=\mathbf{L} \times \mathbf{I} \times \mathbf{L}^\mathrm{T}\\ &=\mathbf{L} \times \mathbf{L}^\mathrm{T} \end{align*} \]
However, this is numerically identical to solving
\[ \begin{align*} \mathbf{L} \times \left(\mathbf{I}^\prime \times \mathbf{y}\right) &= \mathbf{c}\\ \mathbf{L}^\mathrm{T} \times \mathbf{x} &= \mathbf{I}^\prime \times \mathbf{y} \end{align*} \]
which only differs from solving the original equation in that some of the elements of the result of the first step might end up negated, but will be switched back to their original sign in the second step, and so the second condition isn't met.

Fortunately, these are the only lower triangular matrices that satisfy the first condition, as proven in derivation 2, and so we can be certain that the two step algorithm for the solution of such equations cannot be made any more numerically stable.

Now it is necessarily the case that matrices of this form are symmetric, since
\[ \left(\mathbf{A} \times \mathbf{B}\right)^\mathrm{T} = \mathbf{B}^\mathrm{T} \times \mathbf{A}^\mathrm{T} \]
and so
\[ \left(\mathbf{L} \times \mathbf{L}^\mathrm{T}\right)^\mathrm{T} = \left(\mathbf{L}^\mathrm{T}\right)^\mathrm{T} \times \left(\mathbf{L}\right)^\mathrm{T} = \mathbf{L} \times \mathbf{L}^\mathrm{T} \]
Furthermore they must be positive semidefinite, since for any vector \(\mathbf{v}\)
\[ \mathbf{v} \times \left(\mathbf{L} \times \mathbf{L}^\mathrm{T}\right) \times \mathbf{v} = \left(\mathbf{v} \times \mathbf{L}\right) \times \left(\mathbf{L}^\mathrm{T} \times \mathbf{v}\right) = \left(\mathbf{v} \times \mathbf{L}\right) \times \left(\mathbf{v} \times \mathbf{L}\right) = \mathbf{v}^\prime \times \mathbf{v}^\prime = \left|\mathbf{v}^\prime\right|^2 \geqslant 0 \]
which is the very definition of a positive semidefinite matrix.

The next question, therefore, is whether every positive semidefinite matrix can be uniquely represented as the product of a lower triangular matrix and its transpose.

In the course of proving, in derivation 2, that the only lower triangular matrices whose products with their transposes are equal are those that differ by switching the signs of the elements in some or all of their columns, we effectively specified an algorithm for finding them by iteratively applying the formulae
\[ \begin{align*} \mathbf{L}_{00} &= \sqrt{\mathbf{M}_{00}}\\ \mathbf{L}_{i0} &= \frac{\mathbf{M}_{i0}}{\mathbf{L}_{00}} && \mathrm{for} \; i > 0\\ \mathbf{L}_{jj} &= \sqrt{\mathbf{M}_{jj} - \sum_{k=0}^{j-1} \mathbf{L}_{jk} \times \mathbf{L}_{jk}} && \mathrm{for} \; j > 0\\ \mathbf{L}_{ij} &= \frac{\mathbf{M}_{ij} - \sum_{k=0}^{j-1} \mathbf{L}_{ik} \times \mathbf{L}_{jk}}{\mathbf{L}_{jj}} && \mathrm{for} \; j > 0, i > j\\ \mathbf{L}_{ij} &= 0 && \mathrm{for} \; j > 0, i < j \end{align*} \]
to find the lower triangular elements and setting the rest to zero.
For these formulae to have real and finite results, it must be the case that the terms inside the square root symbols are strictly positive, which implies that \(\mathbf{M}\) must be strictly positive definite, as proven in derivation 3.
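These formulae translate almost directly into code; the following is a minimal sketch of the iterative scheme, with cholesky being my own name for it rather than, as yet, the ak library's, and with no attempt made to exploit the symmetry of the argument.

```javascript
// A sketch of the decomposition by iterative application of the formulae
// above: each column of l is filled in turn, its diagonal element by a
// square root and those below it by division by that diagonal element.
function cholesky(m) {
  const n = m.length;
  const l = m.map(row => row.map(() => 0)); // upper elements stay zero

  for (let j = 0; j < n; ++j) {
    // the diagonal element: sqrt(m[j][j] - sum of squares of l[j][k])
    let s = m[j][j];
    for (let k = 0; k < j; ++k) s -= l[j][k] * l[j][k];
    if (s <= 0) throw new Error('cholesky: matrix is not positive definite');
    l[j][j] = Math.sqrt(s);

    // the elements below it: (m[i][j] - sum of l[i][k]*l[j][k]) / l[j][j]
    for (let i = j + 1; i < n; ++i) {
      let t = m[i][j];
      for (let k = 0; k < j; ++k) t -= l[i][k] * l[j][k];
      l[i][j] = t / l[j][j];
    }
  }
  return l;
}
```

Note that a non-positive term inside the square root is reported as an error, consistent with the requirement, proven in derivation 3, that the matrix be positive definite.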

Derivation 3: The Positive Definiteness Of M
First, let us assume that for a positive definite matrix \(\mathbf{M}\) and a lower triangular matrix \(\mathbf{L}\) satisfying
\[ \mathbf{M} = \mathbf{L} \times \mathbf{L}^\mathrm{T} \]
\(\mathbf{L}_{nn}\) is the first element on the leading diagonal of \(\mathbf{L}\) for which the expression inside the square root is not positive.

Next, we define a pair of \(n+1\) by \(n+1\) matrices \(\mathbf{A}\) and \(\mathbf{B}\) such that
\[ \begin{align*} \mathbf{A}_{ij} &= \mathbf{L}_{ij} &&\mathrm{for}\;i<n \vee j<n\\ \mathbf{A}_{nn} &= \mathbf{M}_{nn} - \sum_{k=0}^{n-1} \mathbf{L}_{nk} \times \mathbf{L}_{nk}\\ \\ \mathbf{B}_{ij} &= \mathbf{L}^\mathrm{T}_{ij} = \mathbf{L}_{ji} &&\mathrm{for}\;i<n \vee j<n\\ \mathbf{B}_{nn} &= 1 \end{align*} \]
where \(\vee\) means or.

Now, the only element of the product of \(\mathbf{A}\) and \(\mathbf{B}\) whose value depends upon the elements that we have changed, \(\mathbf{A}_{nn}\) and \(\mathbf{B}_{nn}\), is
\[ \begin{align*} \left(\mathbf{A} \times \mathbf{B}\right)_{nn} &= \sum_{k=0}^n \mathbf{A}_{nk} \times \mathbf{B}_{kn} = \mathbf{A}_{nn} \times \mathbf{B}_{nn} + \sum_{k=0}^{n-1} \mathbf{A}_{nk} \times \mathbf{B}_{kn}\\ &= \mathbf{M}_{nn} - \sum_{k=0}^{n-1} \mathbf{L}_{nk} \times \mathbf{L}_{nk} + \sum_{k=0}^{n-1} \mathbf{L}_{nk} \times \mathbf{L}_{nk} = \mathbf{M}_{nn} \end{align*} \]
and consequently
\[ \left(\mathbf{A} \times \mathbf{B}\right)_{ij} = \mathbf{M}_{ij} \quad\mathrm{for}\;i \leqslant n, j \leqslant n \]
The determinants of triangular matrices equal the products of their diagonal elements and so, since every diagonal element of \(\mathbf{A}\) but \(\mathbf{A}_{nn}\) is positive and \(\mathbf{A}_{nn}\) is not, the determinant of \(\mathbf{A}\) is less than or equal to zero whilst that of \(\mathbf{B}\) is greater than zero, and consequently that of their product is less than or equal to zero.
Since the determinant is equal to the product of the eigenvalues, this implies that the product has at least one non-positive eigenvalue \(\lambda\) and for its associated eigenvector \(\mathbf{v}\) we have
\[ \mathbf{v} \times \left(\mathbf{A} \times \mathbf{B}\right) \times \mathbf{v} = \mathbf{v} \times \left(\lambda \times \mathbf{v}\right) = \lambda \times \left(\mathbf{v} \times \mathbf{v}\right) = \lambda \times \left|\mathbf{v}\right|^2 \leqslant 0 \]
and so, by definition, \(\mathbf{A} \times \mathbf{B}\) is not positive definite.

Finally, for a vector \(\mathbf{w}\) satisfying
\[ \begin{align*} \mathbf{w}_i &= \mathbf{v}_i && \mathrm{for}\;i \leqslant n\\ \mathbf{w}_i &= 0 && \mathrm{for}\;i > n \end{align*} \]
we have
\[ \begin{align*} \mathbf{w} \times \mathbf{M} \times \mathbf{w} &= \sum_i \sum_j \mathbf{w}_i \times \mathbf{M}_{ij} \times \mathbf{w}_j = \sum_{i=0}^n \sum_{j=0}^n \mathbf{w}_i \times \mathbf{M}_{ij} \times \mathbf{w}_j\\ &= \sum_{i=0}^n \sum_{j=0}^n \mathbf{v}_i \times \left(\mathbf{A} \times \mathbf{B}\right)_{ij} \times \mathbf{v}_j = \mathbf{v} \times \left(\mathbf{A} \times \mathbf{B}\right) \times \mathbf{v} \leqslant 0 \end{align*} \]
and so \(\mathbf{M}\) is not positive definite either, contradicting our assumption that it was, and so the terms inside the square roots must be positive for positive definite matrices.

Now, I must admit at this point that I have been slightly dishonest; it is also always possible to represent a positive semidefinite matrix as the product of a lower triangular matrix and its transpose. However, there's more than one way to do so, with more significant differences between them than simple column negations, and, perhaps most importantly, they cannot be found with this algorithm, so we shall restrict ourselves to the positive definite case.

The Cholesky Decomposition

If we always choose the positive square root for the leading diagonal elements of the lower triangular matrices, then the representation of positive definite symmetric real matrices as the product of them and their upper triangular transposes is unique and is known as the Cholesky decomposition. Like the Jacobi decomposition, it's useful for solving systems of simultaneous equations, inverting matrices and computing their determinants.
The last of these follows from the facts that the determinant of the product of two matrices is equal to the product of their determinants and that, as noted above, the determinants of triangular matrices are equal to the products of the elements on their leading diagonals. Putting these together, for
\[ \mathbf{M} = \mathbf{L} \times \mathbf{L}^\mathrm{T} \]
we can calculate the determinant with
\[ |\mathbf{M}| = \prod_i \mathbf{L}_{ii}^2 \]
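Given the lower triangular factor, this determinant formula becomes a one-liner; choleskyDet is again an illustrative name of my own.

```javascript
// The determinant of M = L × Lᵀ as the product of the squares of the
// diagonal elements of its lower triangular Cholesky factor l.
function choleskyDet(l) {
  let d = 1;
  for (let i = 0; i < l.length; ++i) d *= l[i][i] * l[i][i];
  return d;
}
```

For example, the factor [[2,0],[1,3]] of [[4,2],[2,10]] yields a determinant of 2² × 3² = 36, agreeing with 4 × 10 − 2 × 2.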
That noted, the real benefit of the Cholesky decomposition stems from how easy it is to solve simultaneous linear equations involving triangular matrices, a fact that we shall exploit in our implementation of it in the next post.


[1] Conquering The Eigen, 2014.

[2] The Matrix Recoded, 2014.

[3] The Matrix Isn't Real, 2015.
