a.k. tag:www.thusspakeak.com,2013-04-25:/ak//9 2020-01-04T02:32:25Z Movable Type 5.2.13 A Well Managed Household tag:www.thusspakeak.com,2020:/ak//9.310 2020-01-03T20:00:00Z 2020-01-04T02:32:25Z a.k. Over the last few months we have seen how we can use a sequence of Householder transformations followed by a sequence of shifted Givens rotations to efficiently find the spectral decomposition of a symmetric real matrix M, formed from a matrix V and a diagonal matrix Λ satisfying

M × V = V × Λ

implying that the columns of V are the unit eigenvectors of M and their associated elements on the diagonal of Λ are their eigenvalues so that

V × VT = I

where I is the identity matrix, and therefore

M = V × Λ × VT

From a mathematical perspective the combination of Householder transformations and shifted Givens rotations is particularly appealing, converging on the spectral decomposition after relatively few matrix multiplications, but from an implementation perspective using `ak.matrix` multiplication operations is less than satisfactory since it wastefully creates new `ak.matrix` objects at each step and so in this post we shall start to see how we can do better. ]]>
Spryer Francis tag:www.thusspakeak.com,2019:/ak//9.309 2019-12-06T20:00:00Z 2020-01-16T06:51:26Z a.k. Last time we saw how we could use a sequence of Householder transformations to reduce a symmetric real matrix M to a symmetric tridiagonal matrix, having zeros everywhere other than upon the leading, upper and lower diagonals, which we could then further reduce to a diagonal matrix Λ using a sequence of Givens rotations to iteratively transform the elements upon the upper and lower diagonals to zero so that the columns of the accumulated transformations V were the unit eigenvectors of M and the elements on the leading diagonal of the result were their associated eigenvalues, satisfying

M × V = V × Λ

and, since the transpose of V is its own inverse

M = V × Λ × VT

which is known as the spectral decomposition of M.
Unfortunately, the way that we used Givens rotations to diagonalise tridiagonal symmetric matrices wasn't particularly efficient and I concluded by stating that it could be significantly improved with a relatively minor change. In this post we shall see what it is and why it works. ]]>
FAO The Householder tag:www.thusspakeak.com,2019:/ak//9.308 2019-11-01T19:00:00Z 2019-11-01T20:22:28Z a.k. Some years ago we saw how we could use the Jacobi algorithm to find the eigensystem of a real valued symmetric matrix M, which is defined as the set of pairs of non-zero vectors vi and scalars λi that satisfy

M × vi = λi × vi

known as the eigenvectors and the eigenvalues respectively, with the vectors typically restricted to those of unit length in which case we can define its spectral decomposition as the product

M = V × Λ × VT

where the columns of V are the unit eigenvectors, Λ is a diagonal matrix whose ith diagonal element is the eigenvalue associated with the ith column of V and the T superscript denotes the transpose, in which the rows and columns of the matrix are swapped.
You may recall that this is a particularly convenient representation of the matrix since we can use it to generalise any scalar function to it with

f(M) = V × f(Λ) × VT

where f(Λ) is the diagonal matrix whose ith diagonal element is the result of applying f to the ith diagonal element of Λ.
You may also recall that I suggested that there's a more efficient way to find eigensystems and I think that it's high time that we took a look at it. ]]>
A Little Bit Slinky tag:www.thusspakeak.com,2019:/ak//9.307 2019-10-04T19:00:00Z 2019-10-04T19:13:35Z For several months we've for been taking a look at cluster analysis which seeks to partition sets of data into subsets of similar data, known as clusters. Most recently we have focused our attention on hierarchical clusterings, which are sequences of sets of clusters in which pairs of data that belong to the same cluster at one step belong to the same cluster in the next step. A simple way of constructing them is to initially place each datum in its own cluster and then iteratively merge the closest pairs of clusters in each clustering to produce the next one in the sequence, stopping when all of the data belong to a single cluster. We have considered three ways of measuring the distance between pairs of clusters, the average distance between their members, the distance between their closest members and the distance between their farthest members, known as average linkage, single linkage and complete linkage respectively, and implemented a reasonably efficient algorithm for generating hierarchical clusterings defined with them, using a min-heap structure to cache the distances between clusters. Finally, I claimed that there is a more efficient algorithm for generating single linkage hierarchical clusterings that would make the sorting of clusters by size in our ak.clustering type too expensive and so last time we implemented the ak.rawClustering type to represent clusterings without sorting their clusters which we shall now use in the implementation of that algorithm. a.k. For several months we've for been taking a look at cluster analysis which seeks to partition sets of data into subsets of similar data, known as clusters. Most recently we have focused our attention on hierarchical clusterings, which are sequences of sets of clusters in which pairs of data that belong to the same cluster at one step belong to the same cluster in the next step.
A simple way of constructing them is to initially place each datum in its own cluster and then iteratively merge the closest pairs of clusters in each clustering to produce the next one in the sequence, stopping when all of the data belong to a single cluster. We have considered three ways of measuring the distance between pairs of clusters, the average distance between their members, the distance between their closest members and the distance between their farthest members, known as average linkage, single linkage and complete linkage respectively, and implemented a reasonably efficient algorithm for generating hierarchical clusterings defined with them, using a min-heap structure to cache the distances between clusters.
Finally, I claimed that there is a more efficient algorithm for generating single linkage hierarchical clusterings that would make the sorting of clusters by size in our `ak.clustering` type too expensive and so last time we implemented the `ak.rawClustering` type to represent clusterings without sorting their clusters which we shall now use in the implementation of that algorithm. ]]>
Cut Price Clusterings tag:www.thusspakeak.com,2019:/ak//9.306 2019-09-06T19:00:00Z 2019-09-06T19:07:37Z Last month we saw how we could efficiently generate hierarchical clusterings, which are sequences of sets of clusters, which are themselves subsets of a set of data that each contain elements that are similar to each other, such that if a pair of data are in the same clustering at one step then they must be in the same clustering in the next which will always be the case if we move from one step to the next by merging the closest pairs of clusters. Specifically, we used our ak.minHeap implementation of the min-heap structure to cache the distances between clusters, saving us the expense of recalculating them for clusters that don't change from one step in the hierarchy to the next. Recall that we used three different schemes for calculating the distance between a pair of clusters, the average distance between their members, known as average linkage, the distance between their closest members, known as single linkage, and the distance between their farthest members, known as complete linkage, and that I concluded by noting that our algorithm was about as efficient as possible in general but that there is a much more efficient scheme for single linkage clusterings; efficient enough that sorting the clusters in each clustering by size would be the most costly operation and so in this post we shall implement objects to represent clusterings that don't do that. a.k. Last month we saw how we could efficiently generate hierarchical clusterings, which are sequences of sets of clusters, which are themselves subsets of a set of data that each contain elements that are similar to each other, such that if a pair of data are in the same clustering at one step then they must be in the same clustering in the next which will always be the case if we move from one step to the next by merging the closest pairs of clusters. Specifically, we used our `ak.minHeap` implementation of the min-heap structure to cache the distances between clusters, saving us the expense of recalculating them for clusters that don't change from one step in the hierarchy to the next.
Recall that we used three different schemes for calculating the distance between a pair of clusters, the average distance between their members, known as average linkage, the distance between their closest members, known as single linkage, and the distance between their farthest members, known as complete linkage, and that I concluded by noting that our algorithm was about as efficient as possible in general but that there is a much more efficient scheme for single linkage clusterings; efficient enough that sorting the clusters in each clustering by size would be the most costly operation and so in this post we shall implement objects to represent clusterings that don't do that. ]]>
Racing Up The Hierarchy tag:www.thusspakeak.com,2019:/ak//9.304 2019-08-02T19:00:00Z 2019-08-02T19:11:01Z In the previous post we saw how to identify subsets of a set of data that are in some sense similar to each other, known as clusters, by constructing sequences of clusterings starting with each datum in its own cluster and ending with all of the data in the same cluster, subject to the constraint that if a pair of data are in the same cluster in one clustering then they must also be in the same cluster in the next, which are known as hierarchical clusterings. We did this by selecting the closest pairs of clusters in one clustering and merging them to create the next, using one of three different measures of the distance between a pair of clusters; the average distance between their members, the distance between their nearest members and the distance between their farthest members, known as average linkage, single linkage and complete linkage respectively. Unfortunately our implementation came in at a rather costly O(n3) operations and so in this post we shall look at how we can improve its performance. a.k. In the previous post we saw how to identify subsets of a set of data that are in some sense similar to each other, known as clusters, by constructing sequences of clusterings starting with each datum in its own cluster and ending with all of the data in the same cluster, subject to the constraint that if a pair of data are in the same cluster in one clustering then they must also be in the same cluster in the next, which are known as hierarchical clusterings.
We did this by selecting the closest pairs of clusters in one clustering and merging them to create the next, using one of three different measures of the distance between a pair of clusters; the average distance between their members, the distance between their nearest members and the distance between their farthest members, known as average linkage, single linkage and complete linkage respectively.
Unfortunately our implementation came in at a rather costly O(n3) operations and so in this post we shall look at how we can improve its performance. ]]>
A Place In The Hierarchy tag:www.thusspakeak.com,2019:/ak//9.303 2019-07-05T19:00:00Z 2019-07-05T19:06:37Z Last time we implemented the clusterings type to store a set of clustering objects in order to represent hierarchical clusterings, which are sequences of clusterings having the property that if a pair of data are in the same cluster in one clustering then they will be in the same cluster in the next, where clusters are subsets of a set of data that are in some sense similar to each other. We then went on to define the ak.clade type to represent hierarchical clusterings as trees, so named because that's what they're called in biology when they are used to show the relationships between species and their common ancestors. Now that we have those structures in place we're ready to see how to create hierarchical clusterings and so in this post we shall start with a simple, general purpose, but admittedly rather inefficient, way to do so. a.k. Last time we implemented the clusterings type to store a set of clustering objects in order to represent hierarchical clusterings, which are sequences of clusterings having the property that if a pair of data are in the same cluster in one clustering then they will be in the same cluster in the next, where clusters are subsets of a set of data that are in some sense similar to each other.
We then went on to define the `ak.clade` type to represent hierarchical clusterings as trees, so named because that's what they're called in biology when they are used to show the relationships between species and their common ancestors.
Now that we have those structures in place we're ready to see how to create hierarchical clusterings and so in this post we shall start with a simple, general purpose, but admittedly rather inefficient, way to do so. ]]>
Making The Clade tag:www.thusspakeak.com,2019:/ak//9.302 2019-06-07T19:00:00Z 2019-06-07T18:59:15Z We have so far seen a couple of schemes for identifying clusters in sets of data which are subsets whose members are in some sense similar to each other. Specifically, we have looked at the k means and the shared near neighbours algorithms which implicitly define clusters by the closeness of each datum to the average of each cluster and by their closeness to each other respectively. Note that both of these algorithms use a heuristic, or rule of thumb, to assign data to clusters, but there's another way to construct clusterings; define a heuristic to measure how close to each other a pair of clusters are and then, starting with each datum in a cluster of its own, progressively merge the closest pairs until we end up with a single cluster containing all of the data. This means that we'll end up with a sequence of clusterings and so before we can look at such algorithms we'll need a structure to represent them. a.k. We have so far seen a couple of schemes for identifying clusters in sets of data which are subsets whose members are in some sense similar to each other. Specifically, we have looked at the k means and the shared near neighbours algorithms which implicitly define clusters by the closeness of each datum to the average of each cluster and by their closeness to each other respectively.
Note that both of these algorithms use a heuristic, or rule of thumb, to assign data to clusters, but there's another way to construct clusterings; define a heuristic to measure how close to each other a pair of clusters are and then, starting with each datum in a cluster of its own, progressively merge the closest pairs until we end up with a single cluster containing all of the data. This means that we'll end up with a sequence of clusterings and so before we can look at such algorithms we'll need a structure to represent them. ]]>
Nearest And Dearest tag:www.thusspakeak.com,2019:/ak//9.301 2019-05-03T19:00:00Z 2019-05-03T20:52:30Z Last time we saw how we could use the list of the nearest neighbours of a datum in a set of clustered data to calculate its strengths of association with their clusters. Specifically, we used the k nearest neighbours algorithm to define those strengths as the proportions of its k nearest neighbours that were members of each cluster or with a generalisation of it that assigned weights to the neighbours according to their positions in the list. This time we shall take a look at a clustering algorithm that uses nearest neighbours to identify clusters, contrasting it with the k means clustering algorithm that we covered about four years ago. a.k. Last time we saw how we could use the list of the nearest neighbours of a datum in a set of clustered data to calculate its strengths of association with their clusters. Specifically, we used the k nearest neighbours algorithm to define those strengths as the proportions of its k nearest neighbours that were members of each cluster or with a generalisation of it that assigned weights to the neighbours according to their positions in the list.
This time we shall take a look at a clustering algorithm that uses nearest neighbours to identify clusters, contrasting it with the k means clustering algorithm that we covered about four years ago. ]]>
In The Neighbourhood tag:www.thusspakeak.com,2019:/ak//9.300 2019-04-05T19:00:00Z 2019-04-05T19:00:09Z A little under four years ago we saw how we could use the k means algorithm to divide sets of data into distinct subsets, known as clusters, whose members are in some sense similar to each other. The interesting thing about clustering is that even though we find it easy to spot clusters, at least in two dimensions, it's incredibly difficult to give them a firm mathematical definition and so clustering algorithms invariably define them implicitly as the subsets identified by this algorithm. The k means algorithm, for example, does so by first picking k different elements of the data as cluster representatives and then places every element in the cluster whose representative is nearest to it. The cluster representatives are then replaced by the means of the elements assign to it and the process is repeated iteratively until the clusters stop changing. Now I'd like to introduce some more clustering algorithms but there are a few things that we'll need first. a.k. A little under four years ago we saw how we could use the k means algorithm to divide sets of data into distinct subsets, known as clusters, whose members are in some sense similar to each other. The interesting thing about clustering is that even though we find it easy to spot clusters, at least in two dimensions, it's incredibly difficult to give them a firm mathematical definition and so clustering algorithms invariably define them implicitly as the subsets identified by this algorithm.
The k means algorithm, for example, does so by first picking k different elements of the data as cluster representatives and then places every element in the cluster whose representative is nearest to it. The cluster representatives are then replaced by the means of the elements assign to it and the process is repeated iteratively until the clusters stop changing.
Now I'd like to introduce some more clustering algorithms but there are a few things that we'll need first. ]]>
A Heap Of Stuff tag:www.thusspakeak.com,2019:/ak//9.280 2019-03-01T20:00:00Z 2019-03-01T19:44:28Z A little over a year ago we added a bunch of basic computer science algorithms to the ak library, taking inspiration from the C++ standard library. Now, JavaScript isn't just short of algorithms in its standard library, it's also lacking all but a few data structures. In fact, from the programmer's perspective, it only has hash maps which use a function to map strings to non-negative integers that are then used as indices into an array of sets of key-value pairs. Technically speaking, even arrays are hash maps of the string representation of their indices, although in practice the interpreter will use more efficient data structures whenever it can. As clever as its implementation might be, it's unlikely that it will always figure out exactly what we're trying to do and pick the most appropriate data structure and so it will occasionally be worth explicitly implementing a data structure so that we can be certain of its performance. The first such data structure that we're going to need is a min-heap which will allow us to efficiently add elements in any order and remove them in ascending order, according to some comparison function. a.k. A little over a year ago we added a bunch of basic computer science algorithms to the `ak` library, taking inspiration from the C++ standard library. Now, JavaScript isn't just short of algorithms in its standard library, it's also lacking all but a few data structures. In fact, from the programmer's perspective, it only has hash maps which use a function to map strings to non-negative integers that are then used as indices into an array of sets of key-value pairs. Technically speaking, even arrays are hash maps of the string representation of their indices, although in practice the interpreter will use more efficient data structures whenever it can.
As clever as its implementation might be, it's unlikely that it will always figure out exactly what we're trying to do and pick the most appropriate data structure and so it will occasionally be worth explicitly implementing a data structure so that we can be certain of its performance.
The first such data structure that we're going to need is a min-heap which will allow us to efficiently add elements in any order and remove them in ascending order, according to some comparison function. ]]>
Archimedean Crew tag:www.thusspakeak.com,2019:/ak//9.278 2019-02-01T20:00:00Z 2019-02-01T20:08:44Z a.k. We have recently seen how we can define dependencies between random variables with Archimedean copulas which calculate the probability that they each fall below given values by applying a generator function φ to the results of their cumulative distribution functions, or CDFs, for those values, and applying its inverse to their sum.
Like all copulas they are effectively the CDFs of vector valued random variables whose elements are uniformly distributed when considered independently. Whilst those Archimedean CDFs were relatively trivial to implement, we found that their probability density functions, or PDFs, were somewhat more difficult and that the random variables themselves required some not at all obvious mathematical manipulation to get right.
Having done all the hard work implementing the `ak.archimedeanCopula`, `ak.archimedeanCopulaDensity` and `ak.archimedeanCopulaRnd` functions we shall now use them to implement some specific families of Archimedean copulas. ]]>
Archimedean Review tag:www.thusspakeak.com,2019:/ak//9.276 2019-01-04T20:00:00Z 2019-01-04T20:02:24Z a.k. In the last couple of posts we've been taking a look at Archimedean copulas which define the dependency between the elements of vector values of a multivariate random variable by applying a generator function φ to the values of the cumulative distribution functions, or CDFs, of their distributions when considered independently, known as their marginal distributions, and applying the inverse of the generator to the sum of the results to yield the value of the multivariate CDF.
We have seen that the densities of Archimedean copulas are rather trickier to calculate and that making random observations of them is trickier still. Last time we found an algorithm for the latter, albeit with an implementation that had troubling performance and numerical stability issues, and in this post we shall add an improved version to the `ak` library that addresses those issues. ]]>
Archimedean View tag:www.thusspakeak.com,2018:/ak//9.274 2018-12-07T20:00:00Z 2018-12-07T20:02:24Z a.k. Last time we took a look at how we could define copulas to represent the dependency between random variables by summing the results of a generator function φ applied to the results of their cumulative distribution functions, or CDFs, and then applying the inverse of that function φ-1 to that sum.
These are known as Archimedean copulas and are valid whenever φ is strictly decreasing over the interval [0,1], equal to zero when its argument equals one and have nth derivatives that are non-negative over that interval when n is even and non-positive when it is odd, for n up to the number of random variables.
Whilst such copulas are relatively easy to implement we saw that their densities are a rather trickier job, in contrast to Gaussian copulas where the reverse is true. In this post we shall see how to draw random vectors from Archimedean copulas which is also much more difficult than doing so from Gaussian copulas. ]]>
Archimedean Skew tag:www.thusspakeak.com,2018:/ak//9.272 2018-11-02T19:00:00Z 2018-11-29T20:55:00Z About a year and a half ago we saw how we could use Gaussian copulas to define dependencies between the elements of a vector valued multivariate random variable whose elements, when considered in isolation, were governed by arbitrary cumulative distribution functions, known as marginals. Whilst Gaussian copulas are quite flexible, they can't represent every possible dependency between those elements and in this post we shall take a look at some others defined by the Archimedean family of copulas. a.k. About a year and a half ago we saw how we could use Gaussian copulas to define dependencies between the elements of a vector valued multivariate random variable whose elements, when considered in isolation, were governed by arbitrary cumulative distribution functions, known as marginals. Whilst Gaussian copulas are quite flexible, they can't represent every possible dependency between those elements and in this post we shall take a look at some others defined by the Archimedean family of copulas. ]]>