I Still Haven't Found What I'm Looking For

Last time we took a look at a selection of sorting operations that we can use to sort arrays, or ranges of elements within them. After defining some useful comparison functions satisfying JavaScript's requirement of returning a negative number when the first argument compares smaller than the second, zero when they compare equal and a positive number otherwise, and a function to map negative integers to indices read from the end of arrays in the same way that Array.slice does, we first implemented ak.partition which divides elements into two ranges; those elements that satisfy some given condition followed by those elements that don't. We saw how this could be used to implement the quicksort algorithm but instead defined ak.sort to sort a range of elements using Array.sort, slicing them out beforehand and splicing them back in again afterwards if they didn't represent whole arrays. We did use it, however, to implement ak.nthElement which puts the correctly sorted element in a given position within a range, putting before it elements that are no greater and after it elements that are no smaller. Finally, we implemented ak.partialSort which puts every element in a range up to, but not including, a given position into its correctly sorted place with all of the elements from that position onwards comparing no less than the last correctly sorted element.
This time we shall take a look at some of the ways that we can query data after we have manipulated it with these functions.
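As a reminder of the kind of building blocks described above, here is a minimal sketch of a comparison function, an index mapping and a partition. The names compareNumbers, arrayIndex and partition are hypothetical and chosen purely for illustration; they are not the ak library's own implementations, which may differ in both interface and detail.

```javascript
// Hypothetical helpers for illustration only; not the ak library's code.

// Returns a negative number if a compares smaller than b, zero if they
// compare equal and a positive number otherwise.
function compareNumbers(a, b) {
  return a === b ? 0 : (a < b ? -1 : 1);
}

// Maps a possibly negative index onto [0, n], reading negative values
// from the end of the array in the manner of Array.prototype.slice.
function arrayIndex(i, n) {
  return i < 0 ? Math.max(n + i, 0) : Math.min(i, n);
}

// Rearranges the elements of a in [start, end) so that those satisfying
// pred come before those that don't, returning the index of the first
// element that doesn't.
function partition(a, pred, start = 0, end = a.length) {
  start = arrayIndex(start, a.length);
  end = arrayIndex(end, a.length);
  let mid = start;
  for (let i = start; i < end; ++i) {
    if (pred(a[i])) {
      [a[mid], a[i]] = [a[i], a[mid]];
      ++mid;
    }
  }
  return mid;
}
```

Note that this simple scheme makes no attempt to preserve the relative order of the elements that fail the condition.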

Full text...


We're All Sorted From A To Z

Something that I miss when programming in JavaScript is the wide variety of array manipulation functions available in my primary language, C++. We have, in fact, already implemented one of them with ak.shuffle which randomly rearranges the elements of an array. We shall be needing another one of them in the not too distant future and so I have decided to take a short break from numerical computing to add those of them that I use the most frequently to the ak library, starting with a selection of sorting operations.

Full text...


Do The Evolution

In the last few posts we have taken a look at genetic algorithms, which use simple models of biological evolution to search for global maxima of functions, being those points at which they return their greatest possible values.
These models typically represent the arguments of the function as genes within the binary chromosomes of individuals whose fitnesses are the values of the function for those arguments, exchange genetic information between them with a crossover operator, make small random changes to them with a mutation operator and, most importantly, favour the fitter individuals in the population for reproduction into the next generation with a selection operator.
We used a theoretical analysis of a simple genetic algorithm to suggest improved versions of the crossover operator, as well as proposing more robust schemes for selection and the genetic encoding of the parameters.
In this post we shall use some of them to implement a genetic algorithm for the ak library.

Full text...


The Best Laid Schemata

We have seen how we can exploit a simple model of biological evolution, known as a genetic algorithm, to search for global maxima of functions, being those points at which they return their greatest values.
This model treated the function being optimised as a non-negative measure of the fitness of individuals to survive and reproduce, replacing negative results with zero, and represented their chromosomes with arrays of bits which were mapped onto its arguments by treating subsets of them as integers that were linearly mapped to floating point numbers with given lower and upper bounds. It simulated sexual reproduction by splitting pairs of the chromosomes of randomly chosen individuals at a randomly chosen position and swapping their bits from it to their ends, and mutations by flipping randomly chosen bits from the chromosomes of randomly chosen individuals. Finally, and most crucially, it set the probability that an individual would be copied into the next generation to its fitness as a proportion of the total fitness of the population, ensuring that that total fitness would tend to increase from generation to generation.
I concluded by noting that, whilst the resulting algorithm was reasonably effective, it had some problems that a theoretical analysis would reveal and that is what we shall look into in this post.
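The operators described above might be sketched as follows. This is an illustration under stated assumptions rather than the ak library's implementation: the function names are hypothetical, the bits are read most significant first, and the integer is assumed to be mapped so that all zeros yields the lower bound and all ones the upper bound.

```javascript
// Hypothetical sketches of the operators; not the ak library's code.

// Decodes an array of bits as an integer, linearly mapped onto [lo, hi].
// Assumes most significant bit first and a divisor of 2^n - 1.
function decode(bits, lo, hi) {
  const n = bits.reduce((s, b) => 2 * s + b, 0);
  return lo + (hi - lo) * n / (Math.pow(2, bits.length) - 1);
}

// Swaps the bits of two chromosomes from pos to their ends.
function crossover(x, y, pos) {
  return [x.slice(0, pos).concat(y.slice(pos)),
          y.slice(0, pos).concat(x.slice(pos))];
}

// Returns a copy of a chromosome with the bit at position i flipped.
function mutate(bits, i) {
  const copy = bits.slice();
  copy[i] = 1 - copy[i];
  return copy;
}

// Chooses an index with probability equal to its fitness as a proportion
// of the total, treating negative fitnesses as zero. Falls back to the
// last index if the total fitness is zero.
function selectIndex(fitness, rnd = Math.random) {
  const total = fitness.reduce((s, f) => s + Math.max(f, 0), 0);
  let u = rnd() * total;
  for (let i = 0; i < fitness.length; ++i) {
    u -= Math.max(fitness[i], 0);
    if (u < 0) return i;
  }
  return fitness.length - 1;
}
```

Passing the random number generator as an argument is a design choice made here so that the selection can be exercised with known values.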

Full text...


It's All In The Genes

Last time we took a look at the simulated annealing global minimisation algorithm which searches for points at which functions return their least possible values and which drew its inspiration from the metallurgical process of annealing which minimises the energy state of the crystalline structure of metals by first heating and then slowly cooling them.
Now as it happens, physics isn't the only branch of science from which we can draw inspiration for global optimisation algorithms. For example, in biology we have the process of evolution through which the myriad species of life on Earth have become extraordinarily well adapted to their environments. Put very simply this happens because offspring differ slightly from their parents and differences that reduce the chances that they will survive to have offspring of their own are less likely to be passed down through the generations than those that increase those chances.
Noting that extraordinarily well adapted is more or less synonymous with near maximally adapted, it's not unreasonable to suppose that we might exploit a mathematical model of evolution to search for global maxima of functions, being those points at which they return their greatest possible values.

Full text...


Annealing Down

A few years ago we saw how we could search for a local minimum of a function, being a point for which it returns a lesser value than any in its immediate vicinity, by taking random steps and rejecting those that lead uphill; an algorithm that we dubbed the blindfolded hill climber. Whilst we have since seen that we could progress towards a minimum much more rapidly by choosing the directions in which to step deterministically, there is a way that we can use random steps to yield better results.
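For reference, the idea of taking random steps and rejecting those that lead uphill can be sketched in one dimension as follows. The function name, its arguments and the uniform step distribution are assumptions made for illustration; this is not the implementation from the earlier post.

```javascript
// A hypothetical one dimensional sketch; not the ak library's code.
// Takes uniformly distributed random steps of at most stepSize from x0,
// keeping only those that don't increase the value of f.
function blindfoldedDescent(f, x0, stepSize, steps, rnd = Math.random) {
  let x = x0;
  let fx = f(x0);
  for (let i = 0; i < steps; ++i) {
    const y = x + stepSize * (2 * rnd() - 1);
    const fy = f(y);
    if (fy <= fx) {
      x = y;
      fx = fy;
    }
  }
  return x;
}
```

Since uphill steps are never accepted, a search of this kind cannot escape from the basin of a local minimum once it has entered it.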

Full text...


Copulating Normally

Last year we took a look at multivariate uniformly distributed random variables, which generalise uniform random variables to multiple dimensions with random vectors whose elements are independently uniformly distributed. We have now seen how we can similarly generalise normally distributed random variables with the added property that the normally distributed elements of their vectors may be dependent upon each other; specifically that they may be correlated.
As it turns out, we can generalise this dependence to arbitrary sets of random variables with a fairly simple observation.

Full text...


The Cumulative Distribution Unction

We have previously seen how we can generalise normally distributed random variables to multiple dimensions by defining vectors with elements that are linear functions of independent standard normally distributed random variables, having means of zero and standard deviations of one, with

  Z' = L × Z + μ

where L is a constant matrix, Z is a vector whose elements are the independent standard normally distributed random variables and μ is a constant vector.
So far we have derived and implemented the probability density function and the characteristic function of the multivariate normal distribution that governs such random vectors but have yet to do the same for its cumulative distribution function since it's a rather more difficult task and thus requires a dedicated treatment, which we shall have in this post.

Full text...


Multiple Multiply Normal Functions

Last time we took a look at how we could define multivariate normally distributed random variables with linear functions of multiple independent standard univariate normal random variables.
Specifically, given a Z whose elements are independent standard univariate normal random variables, a constant vector μ and a constant matrix L

  Z' = L × Z + μ

has linearly dependent normally distributed elements, a mean vector of μ and a covariance matrix of

  Σ' = L × Lᵀ

where Lᵀ is the transpose of L in which the rows and columns are switched.
We got as far as deducing the characteristic function and the probability density function of the multivariate normal distribution, leaving its cumulative distribution function and its complement aside until we'd implemented both them and the random variable itself, which we shall do in this post.
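To make the construction concrete, the following sketch computes Z' from a given Z and the covariance matrix from L, representing vectors as arrays and matrices as arrays of rows. The function names are hypothetical and the use of the Box-Muller transform to generate the standard normal elements is an assumption made for illustration; none of this is taken from the ak library.

```javascript
// Hypothetical sketches using plain arrays; not the ak library's code.

// Returns L × z + mu for a matrix L (an array of rows) and vectors z, mu.
function affine(L, z, mu) {
  return L.map((row, i) => row.reduce((s, lij, j) => s + lij * z[j], mu[i]));
}

// Returns the covariance matrix L × Lᵀ, whose element in row i and
// column j is the dot product of rows i and j of L.
function covariance(L) {
  return L.map(ri => L.map(rj => ri.reduce((s, x, k) => s + x * rj[k], 0)));
}

// Generates a standard normal observation with the Box-Muller transform.
function standardNormal(rnd = Math.random) {
  const u = 1 - rnd(); // lies in (0, 1] so that the logarithm is finite
  const v = rnd();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Generates an observation of Z' = L × Z + mu.
function multiNormal(L, mu, rnd = Math.random) {
  const z = L[0].map(() => standardNormal(rnd));
  return affine(L, z, mu);
}
```

For example, with rows [1, 0] and [2, 3] for L, the covariance matrix has rows [1, 2] and [2, 13].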

Full text...


Every Which Way Is Normal

A few months ago we saw how we could generalise the concept of a random variable to multiple dimensions by generating random vectors rather than numbers. Specifically we took a look at the multivariate uniform distribution which governs random vectors whose elements are independently uniformly distributed.
Whilst it demonstrated that we can find multivariate versions of distribution functions such as the probability density function, the cumulative distribution function and the characteristic function, the uniform distribution is fairly trivial and so, for a more interesting example, this time we shall look at generalising the normal distribution to multiple dimensions.

Full text...
