`ak.minHeap` implementation of the min-heap structure to cache the distances between clusters, saving us the expense of recalculating them for clusters that don't change from one step in the hierarchy to the next. Recall that we used three different schemes for calculating the distance between a pair of clusters: the average distance between their members, known as average linkage; the distance between their closest members, known as single linkage; and the distance between their farthest members, known as complete linkage. I concluded by noting that our algorithm was about as efficient as possible in general but that there is a much more efficient scheme for single linkage clusterings, efficient enough that sorting the clusters in each clustering by size would be the most costly operation, and so in this post we shall implement objects to represent clusterings that don't do that. ]]>

Splendid! Come join me at my table!

I propose a game played as a religious observance by the parishioners of the United Reformed Eighth-day Adventist Church of Cthulhu, the eldritch octopus god that lies dead but dreaming in the drowned city of Hampton-on-Sea.

Several years ago, the Empress directed me to pose as a peasant and infiltrate their temple of Fhtagn in the sleepy village of Saint Reatham on the Hill when it was discovered that Bishop Derleth Miskatonic had been directing his congregation to purchase vast tracts of land in the Ukraine and gift them to the church in return for the promise of being spared when Cthulhu finally wakes and devours mankind. ]]>

We did this by selecting the closest pairs of clusters in one clustering and merging them to create the next, using one of three different measures of the distance between a pair of clusters; the average distance between their members, the distance between their nearest members and the distance between their farthest members, known as average linkage, single linkage and complete linkage respectively.
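To make the three measures concrete, here is a minimal sketch in plain JavaScript. The helpers `euclidean`, `pairDistances`, `averageLinkage`, `singleLinkage` and `completeLinkage` are hypothetical names invented for this illustration and are not part of the `ak` library; clusters are assumed to be arrays of points, each point an array of numbers.

```javascript
// Hypothetical helpers for illustration only; not part of the ak library.
// A cluster is assumed to be an array of points, each an array of numbers.
function euclidean(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; ++i) s += (a[i] - b[i]) * (a[i] - b[i]);
  return Math.sqrt(s);
}

// Every distance between a member of c0 and a member of c1.
function pairDistances(c0, c1) {
  const d = [];
  for (const x of c0) for (const y of c1) d.push(euclidean(x, y));
  return d;
}

// Average linkage: the mean distance between members.
function averageLinkage(c0, c1) {
  const d = pairDistances(c0, c1);
  return d.reduce((s, x) => s + x, 0) / d.length;
}

// Single linkage: the distance between the nearest members.
function singleLinkage(c0, c1) { return Math.min(...pairDistances(c0, c1)); }

// Complete linkage: the distance between the farthest members.
function completeLinkage(c0, c1) { return Math.max(...pairDistances(c0, c1)); }

const a = [[0, 0], [1, 0]];
const b = [[3, 0], [5, 0]];
console.log(averageLinkage(a, b), singleLinkage(a, b), completeLinkage(a, b));
```

For these two clusters the member distances are 3, 5, 2 and 4, so the average, single and complete linkages come out as 3.5, 2 and 5.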

Unfortunately our implementation came in at a rather costly

Upon hearing these rules I recognised that they described the classic probability problem known as Pólya's Urn. I explained to the Baron that it admits a relatively simple expression that governs the likelihood that the bag contains given numbers of black and white tokens at each turn which could be used to figure the probability that he should have triumphed, but I fear that he didn't entirely grasp my point. ]]>

We then went on to define the `ak.clade` type to represent hierarchical clusterings as trees, so named because that's what they're called in biology when they are used to show the relationships between species and their common ancestors. Now that we have those structures in place we're ready to see how to create hierarchical clusterings and so in this post we shall start with a simple, general purpose, but admittedly rather inefficient, way to do so. ]]>

Whilst Sir N----- showed that a pair of bodies traversed conic sections under gravity, being those curves that arise from the intersection of planes with cones, the general case of several bodies has proved utterly resistant to mathematical reckoning. We must therefore approximate the equations of motion and I shall now report on our first attempt at doing so. ]]>

Note that both of these algorithms use a heuristic, or rule of thumb, to assign data to clusters, but there's another way to construct clusterings; define a heuristic to measure how close to each other a pair of clusters are and then, starting with each datum in a cluster of its own, progressively merge the closest pairs until we end up with a single cluster containing all of the data. This means that we'll end up with a sequence of clusterings and so before we can look at such algorithms we'll need a structure to represent them. ]]>
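The merging scheme described above might be sketched as follows. This is only an illustration under stated assumptions: `agglomerate` and `nearest` are hypothetical names rather than `ak` functions, the data are plain numbers, and exactly one closest pair is merged per step, with ties resolved in favour of the first pair found.

```javascript
// Hypothetical sketch for illustration; not the ak library's implementation.
// data: an array of numbers.
// clusterDist(c0, c1): the distance between two clusters of data values.
function agglomerate(data, clusterDist) {
  let clusters = data.map(x => [x]);          // each datum starts in its own cluster
  const hierarchy = [clusters.map(c => c.slice())];
  while (clusters.length > 1) {
    // Find the closest pair of clusters by brute force.
    let best = Infinity, bi = 0, bj = 1;
    for (let i = 0; i < clusters.length; ++i) {
      for (let j = i + 1; j < clusters.length; ++j) {
        const d = clusterDist(clusters[i], clusters[j]);
        if (d < best) { best = d; bi = i; bj = j; }
      }
    }
    // Replace the pair with their union to form the next clustering.
    const merged = clusters[bi].concat(clusters[bj]);
    clusters = clusters.filter((c, k) => k !== bi && k !== bj);
    clusters.push(merged);
    hierarchy.push(clusters.map(c => c.slice()));
  }
  return hierarchy;
}

// Assumed measure for this example: the distance between the nearest members.
function nearest(c0, c1) {
  let d = Infinity;
  for (const x of c0) for (const y of c1) d = Math.min(d, Math.abs(x - y));
  return d;
}

const steps = agglomerate([1, 2, 6, 7, 20], nearest);
steps.forEach(s => console.log(JSON.stringify(s)));
```

The result is a sequence of clusterings, running from one cluster per datum down to a single cluster holding everything, which is why a structure to represent such sequences is needed.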

And, might I hope, for a little sport?

I should not have doubted it for a moment sir!

This fine weather reminds me of the time I spent as the Empress's trade envoy to the market city of Argos, famed almost as much for the remarkable, if somewhat fragile, mechanical contraptions made by its artificers and the most reasonably priced jewellery sold by its goldsmiths as for its fashion for tiny writing implements. ]]>

This time we shall take a look at a clustering algorithm that uses nearest neighbours to identify clusters, contrasting it with the

When the Baron described the manner of play to me I immediately pointed out to him that it was Penney-Ante, which I recognised because one of my fellow students had recently employed it to enjoy a night at the tavern entirely at the expense of the rest of us! He was able to do so because it's an example of an intransitive wager in which the second player can always contrive to make a choice that will best the first player's. ]]>

The

Now I'd like to introduce some more clustering algorithms but there are a few things that we'll need first. ]]>

In particular we have been curious as to whether we might construct such a model using nought but Sir N-----'s law of universal gravitation, which posits that those bodies are attracted to one another with a force that is proportional to the product of their masses divided by the square of the distance between them, and laws of motion, which posit that a body will remain at rest or move with constant velocity if no force acts upon it, that if a force acts upon it then it will be accelerated at a rate proportional to that force divided by its mass in the direction of that force and that it in return exerts a force of equal strength in the opposite direction. ]]>
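By way of illustration, the accelerations that those laws imply for a collection of bodies might be reckoned as in the sketch below. The function `accelerations`, the layout of the `bodies` array and the use of SI units are all assumptions made for this example, not anything taken from the model itself.

```javascript
// Hypothetical sketch; the names and data layout are assumptions for illustration.
const G = 6.674e-11; // gravitational constant in m^3 kg^-1 s^-2

// bodies: an array of {m: mass, x: [x, y, z] position}.
// Returns one acceleration vector per body.
function accelerations(bodies) {
  const a = bodies.map(() => [0, 0, 0]);
  for (let i = 0; i < bodies.length; ++i) {
    for (let j = i + 1; j < bodies.length; ++j) {
      // Vector from body i to body j and the squared distance between them.
      const r = bodies[j].x.map((xj, k) => xj - bodies[i].x[k]);
      const d2 = r.reduce((s, rk) => s + rk * rk, 0);
      const d = Math.sqrt(d2);
      // Law of gravitation: force proportional to the product of the masses
      // divided by the square of the distance.
      const f = G * bodies[i].m * bodies[j].m / d2;
      for (let k = 0; k < 3; ++k) {
        a[i][k] += f * r[k] / (d * bodies[i].m); // force divided by mass, towards j
        a[j][k] -= f * r[k] / (d * bodies[j].m); // equal and opposite force on j
      }
    }
  }
  return a;
}

const acc = accelerations([
  {m: 1, x: [0, 0, 0]},
  {m: 2, x: [2, 0, 0]}
]);
console.log(acc);
```

Each pair of bodies is visited once and the same force is applied to both in opposite directions, so that the sum of mass times acceleration over all of the bodies is zero.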

`ak` library, taking inspiration from the C++ standard library. Now, JavaScript isn't just short of algorithms in its standard library, it's also lacking all but a few data structures. In fact, from the programmer's perspective, it only has hash maps which use a function to map strings to non-negative integers that are then used as indices into an array of sets of key-value pairs. Technically speaking, even arrays are hash maps of the string representation of their indices, although in practice the interpreter will use more efficient data structures whenever it can. As clever as its implementation might be, it's unlikely that it will always figure out exactly what we're trying to do and pick the most appropriate data structure and so it will occasionally be worth explicitly implementing a data structure so that we can be

The first such data structure that we're going to need is a min-heap which will allow us to efficiently add elements in any order and remove them in ascending order, according to some comparison function. ]]>
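A minimal sketch of such a structure, stored as a binary heap in an array, is given below. Note that `makeMinHeap` and its `add`, `shift` and `size` methods are hypothetical names chosen for this illustration and should not be taken as the interface or implementation of `ak.minHeap`.

```javascript
// Hypothetical sketch for illustration; not the ak.minHeap implementation.
// compare(a, b) should be negative if a comes before b, as for Array.prototype.sort.
function makeMinHeap(compare = (a, b) => a - b) {
  const h = []; // the children of h[i] are h[2*i+1] and h[2*i+2]
  const swap = (i, j) => { const t = h[i]; h[i] = h[j]; h[j] = t; };
  return {
    size: () => h.length,
    // Add an element at the bottom and sift it up past any larger parents.
    add(x) {
      h.push(x);
      let i = h.length - 1;
      while (i > 0) {
        const p = (i - 1) >> 1;
        if (compare(h[i], h[p]) >= 0) break;
        swap(i, p);
        i = p;
      }
    },
    // Remove the smallest element, move the last one to the top and sift it down.
    shift() {
      if (h.length === 0) return undefined;
      const top = h[0];
      const last = h.pop();
      if (h.length > 0) {
        h[0] = last;
        let i = 0;
        for (;;) {
          const l = 2 * i + 1, r = l + 1;
          let m = i;
          if (l < h.length && compare(h[l], h[m]) < 0) m = l;
          if (r < h.length && compare(h[r], h[m]) < 0) m = r;
          if (m === i) break;
          swap(i, m);
          i = m;
        }
      }
      return top;
    }
  };
}

const heap = makeMinHeap();
[5, 1, 4, 2, 3].forEach(x => heap.add(x));
const sorted = [];
while (heap.size() > 0) sorted.push(heap.shift());
console.log(sorted);
```

Both adding and removing an element touch at most one node per level of the heap, so each costs a number of comparisons proportional to the logarithm of its size.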

And will you also join me in a wager whilst you let the fire chase the chill from your bones?

Fine fellow! Stout fellow!

I have in mind a game that reminds me of my raid upon the vault of Heaven, which I mounted in order to make amends to the Empress for my failure to snatch the Amulet of Yendor from the inner circle of Hell. ]]>

Like all copulas they are effectively the CDFs of vector valued random variables whose elements are uniformly distributed when considered independently. Whilst those Archimedean CDFs were relatively trivial to implement, we found that their probability density functions, or PDFs, were somewhat more difficult and that the random variables themselves required some not at all obvious mathematical manipulation to get right.

Having done all the hard work implementing the `ak.archimedeanCopula`, `ak.archimedeanCopulaDensity` and `ak.archimedeanCopulaRnd` functions we shall now use them to implement some specific families of Archimedean copulas.
]]>