This model treated the function being optimised as a non-negative measure of the fitness of individuals to survive and reproduce, replacing negative results with zero. It represented their chromosomes with arrays of bits, which it mapped onto the function's arguments by treating subsets of the bits as integers and linearly mapping those integers to floating point numbers with given lower and upper bounds. It simulated sexual reproduction by splitting the chromosomes of pairs of randomly chosen individuals at a randomly chosen position and swapping their bits from that position to their ends, and mutations by flipping randomly chosen bits in the chromosomes of randomly chosen individuals. Finally, and most crucially, it set the probability that an individual would be copied into the next generation to its fitness as a proportion of the total fitness of the population, ensuring that the total fitness would tend to increase from generation to generation.
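The operators described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from the original post: the names `decode`, `fitness`, `crossover`, `mutate` and `select` are hypothetical, every argument is assumed to use the same number of bits, and the fallback to uniform selection when the total fitness is zero is a choice made here rather than one taken from the text.

```python
import random

def decode(bits, lo, hi):
    """Read a list of 0/1 bits as an unsigned integer and map it linearly onto [lo, hi]."""
    n = 0
    for b in bits:
        n = 2 * n + b
    return lo + (hi - lo) * n / (2 ** len(bits) - 1)

def fitness(f, chromosome, bounds, bits_per_arg):
    """Decode one argument per (lo, hi) pair in bounds and clamp negative results to zero."""
    args = [decode(chromosome[i * bits_per_arg:(i + 1) * bits_per_arg], lo, hi)
            for i, (lo, hi) in enumerate(bounds)]
    return max(f(*args), 0.0)

def crossover(a, b, rng):
    """Split both chromosomes at one random position and swap their tails."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(c, rng):
    """Return a copy of the chromosome with one randomly chosen bit flipped."""
    i = rng.randrange(len(c))
    return c[:i] + [1 - c[i]] + c[i + 1:]

def select(population, fitnesses, rng):
    """Copy individuals into the next generation with probability proportional to fitness."""
    if sum(fitnesses) == 0.0:
        # Assumption: with no fit individuals at all, fall back to uniform selection.
        return [rng.choice(population) for _ in population]
    return rng.choices(population, weights=fitnesses, k=len(population))
```

A generation would then consist of applying `crossover` and `mutate` to randomly chosen individuals, evaluating `fitness` for each chromosome and passing the results to `select`.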

I concluded by noting that, whilst the resulting algorithm was reasonably effective, it had some problems that a theoretical analysis would reveal, and that is what we shall look into in this post.

Now, as it happens, physics isn't the only branch of science from which we can draw inspiration for global optimisation algorithms. For example, in biology we have the process of evolution, through which the myriad species of life on Earth have become extraordinarily well adapted to their environments. Put very simply, this happens because offspring differ slightly from their parents, and differences that reduce the chances that they will survive to have offspring of their own are less likely to be passed down through the generations than those that increase those chances.

Noting that

As it turns out, we can generalise this dependence to arbitrary sets of random variables with a fairly simple observation.

where

So far we have derived and implemented the probability density function and the characteristic function of the multivariate normal distribution that governs such random vectors but have yet to do the same for its cumulative distribution function since it's a rather more difficult task and thus requires a dedicated treatment, which we shall have in this post.

Specifically, given a

has linearly dependent normally distributed elements, a mean vector of

where

We got as far as deducing the characteristic function and the probability density function of the multivariate normal distribution, leaving its cumulative distribution function and its complement aside until we'd implemented both them and the random variable itself, which we shall do in this post.

Whilst it demonstrated that we can find multivariate versions of distribution functions such as the probability density function, the cumulative distribution function and the characteristic function, the uniform distribution is fairly trivial and so, for a more interesting example, this time we shall look at generalising the normal distribution to multiple dimensions.

We have also seen how quasi random sequences fill areas more evenly than pseudo random sequences and so you might be asking yourself whether we could do better by using the former rather than the latter to approximate integrals.

Clever you!
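To make the idea concrete, here is a minimal sketch that estimates the area of a quarter of the unit disc, whose exact value is π/4, once with quasi random points and once with pseudo random ones. The choice of a two dimensional Halton sequence with bases 2 and 3 is an assumption made for illustration, as are the names `van_der_corput`, `halton` and `quarter_disc_area`; the text doesn't say which quasi random sequence is used.

```python
import random

def van_der_corput(i, base):
    """i-th term of the van der Corput sequence: the digits of i mirrored about the radix point."""
    x, f = 0.0, 1.0 / base
    while i > 0:
        x += (i % base) * f
        i //= base
        f /= base
    return x

def halton(i):
    """i-th point of a 2D Halton sequence (assumed bases: 2 and 3)."""
    return van_der_corput(i, 2), van_der_corput(i, 3)

def quarter_disc_area(points):
    """Fraction of points in the unit square that fall inside the unit quarter disc."""
    pts = list(points)
    inside = sum(1 for x, y in pts if x * x + y * y <= 1.0)
    return inside / len(pts)

n = 10000
quasi = quarter_disc_area(halton(i) for i in range(1, n + 1))

rng = random.Random(42)  # fixed seed so that the pseudo random run is repeatable
pseudo = quarter_disc_area((rng.random(), rng.random()) for _ in range(n))
```

Comparing `quasi` and `pseudo` against π/4 for increasing `n` shows how quickly each estimate settles down.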

When solving integrals of multivariate functions mathematically we typically integrate over each argument in turn, treating the others as constants as we do so. At each step we remove one argument from the problem and so must eventually run out of them, at which point we will have solved it.

It is consequently extremely tempting to approximate multivariate integrals by recursively applying univariate numerical integration algorithms to each argument in turn.
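A minimal sketch of that recursive scheme follows, assuming an equal width trapezium rule as the univariate integrator and a rectangular region of integration. The names `trapezium` and `nested_integral` are hypothetical and the approach is meant to illustrate the idea rather than to reproduce the post's own code.

```python
def trapezium(f, a, b, n):
    """Approximate the integral of f over [a, b] with n trapeziums of equal width."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def nested_integral(f, bounds, n):
    """Integrate f over a box by applying the univariate rule to each argument in turn."""
    def integrate(fixed, remaining):
        if not remaining:
            # Every argument has been fixed, so we can finally evaluate f itself.
            return f(*fixed)
        (a, b), rest = remaining[0], remaining[1:]
        # Integrate over the next argument, treating the fixed ones as constants.
        return trapezium(lambda x: integrate(fixed + [x], rest), a, b, n)
    return integrate([], list(bounds))

# Usage: the integral of x*y over the unit square.
approx = nested_integral(lambda x, y: x * y, [(0.0, 1.0), (0.0, 1.0)], 10)
```

Note that with `n` trapeziums per argument this sketch evaluates the function `(n + 1) ** d` times for `d` arguments, so its cost grows very quickly with the number of dimensions.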

Specifically, if we join two points

In regions where a function is rapidly changing we need a lot of trapeziums to accurately approximate its integral. If we restrict ourselves to trapeziums of equal width, as we have done so far, this means that we might spend far too much effort putting trapeziums in regions where a function changes slowly if it also has regions where it changes quickly.

The obvious solution to this is, of course, to use trapeziums of

Now it will occasionally be the case that we should prefer to use quasi random sequences in place of pseudo random number generators and to do so we shall need to present an interface to the former that exactly matches that of the latter.

At the risk of being tedious, there are two more sequences that I'd like to cover before we move on to a new topic.

Whilst the implementations of arithmetic and geometric sequences and their associated series were somewhat subtle, they're not particularly interesting from a mathematical perspective and so this time we'll take a look at something a little more complicated.