Halton, Here's A Pseud

| 0 Comments

Last time[1] we came across the concept of quasi random sequences whose terms progressively fill intervals as evenly as possible, as contrasted with pseudo random number generators[2] which give equal likelihood to numbers within them being produced.
Now it will occasionally be the case that we should prefer to use quasi random sequences in place of pseudo random number generators and to do so we shall need to present an interface to the former that exactly matches that of the latter.

A Quasi Random Number Generator

The obvious way to do this is with a function that keeps track of which terms of the sequence have already been evaluated, as demonstrated by program 1.

Program 1: A Quasi Random Generator

Unfortunately, for short runs, the values returned by this generator follow arithmetic sequences. For example, the first ten terms generated by program 1 are
\[ \begin{align*} s_1 &= 0\\ s_2 &= 0.1\\ s_3 &= 0.2\\ s_4 &= 0.3\\ s_5 &= 0.4\\ s_6 &= 0.5\\ s_7 &= 0.6\\ s_8 &= 0.7\\ s_9 &= 0.8\\ s_{10} &= 0.9 \end{align*} \]
or, in other words
\[ s_i = (i-1) \times 0.1 \]
Similarly, the next ten terms are
\[ \begin{align*} s_{11} &= 0.01\\ s_{12} &= 0.11\\ s_{13} &= 0.21\\ s_{14} &= 0.31\\ s_{15} &= 0.41\\ s_{16} &= 0.51\\ s_{17} &= 0.61\\ s_{18} &= 0.71\\ s_{19} &= 0.81\\ s_{20} &= 0.91 \end{align*} \]
which follow the sequence
\[ s_i = 0.01 + (i-11) \times 0.1 \]
Fortunately we can easily fix this by shuffling the order of the digits in any given base.

Do The Shuffle

Derivation 1: Never Been Picked
Firstly, the probability that the first card is not selected for a single swap is trivially
\[ \frac{n-2}{n} = 1 - \frac{2}{n} \]
After \(n\) swaps, the probability that it was never selected is therefore
\[ \left(1 - \frac{2}{n}\right)^n \]
Now, the original definition of the exponential function was
\[ e^x = \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n \]
from which we can see that the limit of the probability that it's never selected, as \(n\) tends to infinity, is \(e^{-2}\). Finally
\[ \begin{align*} e^{-2} & \approx 0.135\\ \frac18 & = 0.125 \end{align*} \]
A simple scheme that we could use to shuffle an array of \(n\) elements is to repeatedly select two different elements at random and swap them, and perhaps \(n\) such swaps would seem to be an appealing choice.

Unfortunately simplistic might be a more accurate description given that this scheme is completely unfit for purpose. Specifically, as \(n\) increases the probability that any particular element, say the first for example, will never be swapped tends to greater than one chance in eight, as proven in derivation 1.
Now, the number of ways in which we can arrange \(n\) elements is \(n!\), where the exclamation mark stands for the factorial, the product of every number from one to \(n\). This means that the proportion of rearrangements of \(n\) elements in which the first stays in the same place must be
\[ \frac{(n-1)!}{n!} = \frac{(n-1)!}{n \times (n-1)!} =\frac{1}{n} \]
which most certainly won't tend to more than one eighth as \(n\) increases!

Whilst we could improve this scheme by performing very many more than \(n\) swaps, there's a much better way to do the shuffle.

The trick is to pick an element at random to be the first of the shuffled elements, then a different one to be the second, then another to be the third and so on. This way, every element has an equal chance of ending up in each position. What's more, this can be done in-place without creating a new array for the shuffled elements.
To do this, we begin by swapping the first element with one chosen at random from all of the elements, including itself. Next, we swap the second element with one chosen a random from all but the first. Then we swap the third element with one chosen a random from all but the first and second. And so on.

This is the algorithm used by the ak.shuffle function given in listing 1.

Listing 1: ak.shuffle
ak.shuffle = function(a, start, end, rnd) {
 var n, i, j, c;

 if(ak.nativeType(a)!==ak.ARRAY_T) {
  throw new Error('invalid array in ak.shuffle');
 }
 n = a.length;

 if(ak.nativeType(start)===ak.UNDEFINED_T) {
  start = 0;
 }
 else if(start!==ak.floor(start)) {
  throw new Error('invalid start in ak.shuffle');
 }

 if(ak.nativeType(end)===ak.UNDEFINED_T) {
  end = n;
 }
 else if(end!==ak.floor(end)) {
  throw new Error('invalid end in ak.shuffle');
 }

 if(ak.nativeType(rnd)===ak.UNDEFINED_T) {
  rnd = Math.random;
 }
 else if(ak.nativeType(rnd)!==ak.FUNCTION_T) {
  throw new Error('invalid generator in ak.shuffle');
 }

 if(start<=-n)    start = 0;
 else if(start<0) start += n;
 else if(start>n) start = n;

 if(end<=-n)    end = 0;
 else if(end<0) end += n;
 else if(end>n) end = n;

 for(i=start;i<end-1;++i) {
  j = i+ak.floor(rnd()*(end-i));
  c = a[j];
  a[j] = a[i];
  a[i] = c;
 }
};

So the first thing that this function does is check that the first argument is actually an array. The remaining arguments are an optional start index, end index and random number generator which, if provided, must be an integer, another integer and a function and, if not, default to zero, the length of the array and Math.random respectively.
Here we're following the conventions of Array.slice for the start and end indices of the shuffle. Specifically, start is the index of the first element to be shuffled and end is the index one past the last element. Furthermore, negative indices count back from the end of the array and indices outside the bounds of the array are replaced with either zero or its length depending on whether they're before its start or after its end.
Once the conditionals implementing these conventions are out of the way, we have the loop implementing the algorithm itself. Within it, j is the randomly chosen index of an element that hasn't already been shuffled and its selection is followed by the swap of the elements. Note that we don't actually need to do anything with the last element of the shuffle since there's nothing after it that we might swap it with!

Program 2 demonstrates ak.shuffle, which is defined in Shuffle.js.

Program 2: ak.shuffle

A Better Quasi Random Number Generator

Now we have to be a little bit careful about exactly how we shuffle the digits of a base since we're implicitly transforming leading zeros of the Halton sequence's indices into trailing zeros of its values. The easiest way to avoid repeating values by explicitly putting trailing zeros after them is to only shuffle the non-zero digits of the base, as done by the shuffle function in listing 2.

Listing 2: Shuffling The Digits
function shuffle(base, rnd) {
 var ord, i;

 if(!isFinite(base) || base!==ak.floor(base) || base<=1) {
  throw new Error('invalid base in ak.haltonRnd');
 }

 ord = new Array(base);
 for(i=0;i<base;++i) ord[i] = i;
 ak.shuffle(ord, 1, base, rnd);
 return ord;
}

Unfortunately, we can't use ak.haltonSequence to implement a generator that shuffles the digits since it doesn't support that behaviour. It's not such an onerous task to write a new function that does though, as illustrated by listing 3.

Listing 3: ak.haltonRnd
function uniHalton(base, rnd) {
 var n = 0;
 var ord = shuffle(base, rnd);

 return function() {
  var i = ++n;
  var bn = 1;
  var x = 0;

  while(i>0) {
   bn *= base;
   x += ord[i%base]/bn;
   i = ak.floor(i/base);
  }
  return x;
 };
}

ak.haltonRnd = uniHalton;

Note that to avoid always starting with a value of zero, the underlying sequence of ak.haltonRnd starts at an index of one. Program 3 demonstrates how this muddles up the generated values for a base of ten whilst still drawing the tenths first and the hundredths second and so on.

Program 3: ak.haltonRnd

If you are of an inquisitive bent, you might be wondering why on Earth I've bothered with the uniHalton function if it just immediately gets assigned to ak.haltonRnd. The reason for this is that we shall frequently, by which I mean typically, want to quasi randomly generate arrays using different shuffled Halton sequences for each element.
The multiHalton function given in listing 4 does just that by creating an array of uniHalton functions initialised with the elements of the bases array. Generating the quasi random arrays is then simply a matter of performing an array map that calls the functions in that array.

Listing 4: ak.haltonRnd Again
function f(h) {
 return h();
}

function multiHalton(bases, rnd) {
 var n = bases.length;
 var h = new Array(n);
 var i;

 for(i=0;i<n;++i) h[i] = uniHalton(bases[i], rnd);

 return function() {
  return h.map(f);
 };
}

ak.haltonRnd = function(base, rnd) {
 return ak.nativeType(base)===ak.ARRAY_T ? multiHalton(base, rnd)
                                         : uniHalton(base, rnd);
};

Note that the result of ak.haltonRnd now switches between multiHalton and uniHalton depending upon whether or not its base argument is an array. Oh, and it's defined in HaltonRnd.js.

Now, if the bases of a set of Halton quasi random number generators are coprime, meaning that they can't both be exactly divided by any positive integer other than one, then they will evenly fill up a space of points. The easiest way to ensure this is to pick a set of different primes for the bases and, as luck would have it, we already have a supremely convenient way of generating them in ak.primeSequence.
Program 4 demonstrates how a two dimensional Halton quasi random number generator evenly fills up the unit square with points.

Program 4: A Two Dimensional ak.haltonRnd

Contrast this with points generated using our implementation of the Mersenne Twister[3], ak.mtRnd, as done in program 5.

Program 5: As Compared To ak.mtRnd

Whilst the latter are undeniably better approximations of truly random points, they're clearly not as evenly spread out as the former!

References

[1] A Few Sequences More, www.thusspakeak.com, 2016.

[2] Pseudo Algorithmical, www.thusspakeak.com, 2016.

[3] Let's Twist Again, www.thusspakeak.com, 2016.

Leave a comment