Mixing It Up


Last year we took a look at basis function interpolation[1], which fits a weighted sum of \(n\) independent functions, known as basis functions, through observations of an arbitrary function's values at a set of \(n\) points in order to approximate it at unobserved points. In particular, we saw that symmetric probability density functions, or PDFs, make reasonable basis functions for approximating both univariate and multivariate functions.
It is quite tempting, therefore, to use weighted sums of PDFs to construct new PDFs, and in this post we shall see how we can use a simple probabilistic argument to do so.

A Dice Game

Suppose that we have a set of six polyhedral dice having twenty, twelve, ten, eight, six and four sides respectively, and that we choose which one to cast according to the roll of another six-sided die. Labelling the outcome as \(K\), the probabilities that it will take particular values between one and twenty are trivially given by
\[ \begin{align*} \Pr(K=k | 1 \leqslant k \leqslant 4) &= \tfrac16 \times \tfrac1{20} + \tfrac16 \times \tfrac1{12} + \tfrac16 \times \tfrac1{10} + \tfrac16 \times \tfrac18 + \tfrac16 \times \tfrac16 + \tfrac16 \times \tfrac14 = \tfrac{93}{720}\\ \Pr(K=k | 5 \leqslant k \leqslant 6) &= \tfrac16 \times \tfrac1{20} + \tfrac16 \times \tfrac1{12} + \tfrac16 \times \tfrac1{10} + \tfrac16 \times \tfrac18 + \tfrac16 \times \tfrac16 = \tfrac{63}{720}\\ \Pr(K=k | 7 \leqslant k \leqslant 8) &= \tfrac16 \times \tfrac1{20} + \tfrac16 \times \tfrac1{12} + \tfrac16 \times \tfrac1{10} + \tfrac16 \times \tfrac18 = \tfrac{43}{720}\\ \Pr(K=k | 9 \leqslant k \leqslant 10) &= \tfrac16 \times \tfrac1{20} + \tfrac16 \times \tfrac1{12} + \tfrac16 \times \tfrac1{10} = \tfrac{28}{720}\\ \Pr(K=k | 11 \leqslant k \leqslant 12) &= \tfrac16 \times \tfrac1{20} + \tfrac16 \times \tfrac1{12} = \tfrac{16}{720}\\ \Pr(K=k | 13 \leqslant k \leqslant 20) &= \tfrac16 \times \tfrac1{20} = \tfrac6{720} \end{align*} \]
where the vertical bar means given. We can double-check the arithmetic by calculating the probability that it will lie between one and twenty with
\[ \begin{align*} \Pr(1 \leqslant K \leqslant 20) &= 4 \times \tfrac{93}{720} + 2 \times \tfrac{63}{720} + 2 \times \tfrac{43}{720} + 2 \times \tfrac{28}{720} + 2 \times \tfrac{16}{720} + 8 \times \tfrac6{720}\\ &= \tfrac{372}{720} + \tfrac{126}{720} + \tfrac{86}{720} + \tfrac{56}{720} + \tfrac{32}{720} + \tfrac{48}{720} = \tfrac{720}{720} = 1 \end{align*} \]
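We can also sanity check this arithmetic by simulating the game, as done by the following plain JavaScript sketch:

var sides = [20, 12, 10, 8, 6, 4];
var counts = [];
var trials = 1000000;
var i, k, die;

for(k=1;k<=20;++k) counts[k] = 0;

for(i=0;i<trials;++i) {
 die = sides[Math.floor(Math.random()*sides.length)];
 k = Math.floor(Math.random()*die)+1;
 counts[k] += 1;
}

// should be close to 93/720 = 0.12916... and 6/720 = 0.00833...
console.log(counts[1]/trials, counts[20]/trials);
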
Figure 1 plots the probability mass function, or PMF, and the cumulative distribution function, or CDF, of \(K\).

Figure 1: The Dice Game's PMF And CDF

If we label the outcome of an \(n\)-sided die as \(K_n\) then rearranging these probabilities yields
\[ \begin{align*} \Pr(K=k) &= \tfrac16 \times \Pr\left(K_{20}=k\right) + \tfrac16 \times \Pr\left(K_{12}=k\right) + \tfrac16 \times \Pr\left(K_{10}=k\right)\\ &+ \tfrac16 \times \Pr\left(K_8=k\right) + \tfrac16 \times \Pr\left(K_6=k\right) + \tfrac16 \times \Pr\left(K_4=k\right) \end{align*} \]
and by the linearity of sums
\[ \begin{align*} \Pr(K \leqslant k) &= \tfrac16 \times \Pr\left(K_{20} \leqslant k\right) + \tfrac16 \times \Pr\left(K_{12} \leqslant k\right) + \tfrac16 \times \Pr\left(K_{10} \leqslant k\right)\\ &+ \tfrac16 \times \Pr\left(K_8 \leqslant k\right) + \tfrac16 \times \Pr\left(K_6 \leqslant k\right) + \tfrac16 \times \Pr\left(K_4 \leqslant k\right) \end{align*} \]

Mixology

This is a simple example of a general class of probability distributions, known as mixture distributions, defined by a set of \(n\) independent random variables \(X_i\), each having a probability \(p_i\) of being selected for observation, where the \(p_i\) sum to one. The outcome trivially has a CDF of
\[ \begin{align*} F_X(x) = \Pr(X \leqslant x) = \sum_{i=0}^{n-1} p_i \times F_{X_i}(x) \end{align*} \]
where \(\sum\) is the summation sign and \(F_{X_i}\) is the CDF of \(X_i\). Differentiating with respect to \(x\) yields the PDF
\[ \begin{align*} f_X(x) = \sum_{i=0}^{n-1} p_i \times f_{X_i}(x) \end{align*} \]
where \(f_{X_i}\) is the PDF of \(X_i\), which is precisely the mechanism for constructing a PDF from a weighted sum of PDFs that we were looking for. In deriving it we have got the mixture random variable and CDF for free, and the characteristic function follows easily with
\[ \begin{align*} \hat{f}_X(t) = \mathrm{E}\left[e^{itX}\right] &= \int_{-\infty}^{\infty} e^{itx} \times f_X(x) \; \mathrm{d}x = \int_{-\infty}^{\infty} e^{itx} \times \sum_{j=0}^{n-1} p_j \times f_{X_j}(x) \; \mathrm{d}x\\ &= \sum_{j=0}^{n-1} p_j \times \int_{-\infty}^{\infty} e^{itx} \times f_{X_j}(x) \; \mathrm{d}x = \sum_{j=0}^{n-1} p_j \times \hat{f}_{X_j}(t) \end{align*} \]
where \(\mathrm{E}\) represents the expectation of the expression between the following brackets and \(i\) is the imaginary unit which is equal to the square root of minus one.
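Note that, by the same linearity, the raw moments of the mixture, when they exist, are simply weighted sums of those of its components; for example, its mean satisfies
\[ \begin{align*} \mathrm{E}[X] = \int_{-\infty}^{\infty} x \times f_X(x) \; \mathrm{d}x = \sum_{i=0}^{n-1} p_i \times \int_{-\infty}^{\infty} x \times f_{X_i}(x) \; \mathrm{d}x = \sum_{i=0}^{n-1} p_i \times \mathrm{E}\left[X_i\right] \end{align*} \]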

Mixture distributions are useful for describing random variables that have distinct modes of behaviour; for example, stock market fluctuations that follow one distribution during normal times and another during a crisis.

The ak Mixture Distribution Functions

All of the mixture distribution functions require a set of component distribution functions and associated weights and so we can initialise them with a single constructor function, defined in listing 1.

Listing 1: Initialising Mixture Distribution Functions
function constructor(dists, weights) {
 var sum = 0;
 var n, i;

 if(ak.nativeType(dists)!==ak.ARRAY_T) {
  throw new Error('invalid distributions in ak.mixture distribution');
 }
 if(ak.nativeType(weights)!==ak.ARRAY_T) {
  throw new Error('invalid weights in ak.mixture distribution');
 }

 n = dists.length;
 if(n===0) throw new Error('no distributions in ak.mixture distribution');
 if(weights.length!==n) throw new Error('size mismatch ak.mixture distribution');

 for(i=0;i<n;++i) {
  if(ak.nativeType(dists[i])!==ak.FUNCTION_T) {
   throw new Error('invalid distribution in ak.mixture distribution');
  }
  if(!(ak.nativeType(weights[i])===ak.NUMBER_T && weights[i]>0)) {
   throw new Error('invalid weight in ak.mixture distribution');
  }
  sum += weights[i];
 }
 if(!isFinite(sum)) throw new Error('non-finite weights in ak.mixture distribution');
 return {
  dists: dists.slice(0),
  weights: weights.slice(0),
  sum: sum
 };
}

Note that, whilst requiring that the distribution functions are indeed functions, we only require that the weights are positive and finite, rather than that they sum to one, since we shall use a random fraction of their sum to select which of the components to observe.

The implementation of the PDF is given by ak.mixturePDF in listing 2.

Listing 2: The Mixture PDF
function pdf(pdfs, weights, sum, x) {
 var n = pdfs.length;
 var p = 0;
 var i;

 for(i=0;i<n;++i) p += weights[i] * pdfs[i](x);
 return Math.max(p/sum, 0);
}

ak.mixturePDF = function(pdfs, weights) {
 var state = constructor(pdfs, weights);
 var f = function(x) {return pdf(state.dists, state.weights, state.sum, x);};
 f.pdfs = function() {return state.dists.slice(0);};
 f.weights = function() {return state.weights.slice(0);};
 return Object.freeze(f);
};

This simply creates a function that forwards to the pdf helper function, which calculates the weighted sum of the component PDFs, divides it by the sum of the weights and truncates it below at zero, and then adds property accessors for the PDFs and their weights before returning it. Program 1 demonstrates its use for a mixture of two normal distributions.

Program 1: A Mixture PDF
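By way of illustration, the body of such a program might look something like the following sketch, which assumes the ak.normalPDF constructor from our implementation of the normal distribution and picks its means, standard deviations and weights arbitrarily:

// assumes ak.normalPDF(mu, sigma) from the ak library
var pdf = ak.mixturePDF([ak.normalPDF(-1, 0.5), ak.normalPDF(2, 1)], [1, 2]);

// the weights are normalised by their sum, so this is one third of
// the first PDF at zero plus two thirds of the second
console.log(pdf(0));
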

The ak.mixtureCDF function defined in listing 3 implements the CDF in an almost identical way, the principal differences being that it is initialised with CDFs and the result is clamped to a value between zero and one.

Listing 3: The Mixture CDF
function cdf(cdfs, weights, sum, x) {
 var n = cdfs.length;
 var c = 0;
 var i;

 for(i=0;i<n;++i) c += weights[i] * cdfs[i](x);
 return Math.min(Math.max(c/sum, 0), 1);
}

ak.mixtureCDF = function(cdfs, weights) {
 var state = constructor(cdfs, weights);
 var f = function(x) {return cdf(state.dists, state.weights, state.sum, x);};
 f.cdfs = function() {return state.dists.slice(0);};
 f.weights = function() {return state.weights.slice(0);};
 return Object.freeze(f);
};

Program 2 plots the CDF of the mixture of the same pair of normal distributions.

Program 2: A Mixture CDF
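As a quick sanity check, an equally weighted mixture of a pair of normal distributions placed symmetrically about zero must have a CDF of one half there, as in this sketch which again assumes our normal distribution's ak.normalCDF constructor:

// an equally weighted mixture of N(-1,1) and N(+1,1) is symmetric
// about zero (assumes ak.normalCDF(mu, sigma))
var cdf = ak.mixtureCDF([ak.normalCDF(-1, 1), ak.normalCDF(1, 1)], [1, 1]);

// should print one half
console.log(cdf(0));
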

The implementation of the characteristic function, ak.mixtureCF, is given in listing 4.

Listing 4: The Mixture CF
function cf(cfs, weights, sum, t) {
 var n = cfs.length;
 var z = ak.complex(0);
 var i;

 for(i=0;i<n;++i) z = ak.add(z, ak.mul(weights[i], cfs[i](t)));
 return ak.div(z, sum);
}

ak.mixtureCF = function(cfs, weights) {
 var state = constructor(cfs, weights);
 var f = function(t) {return cf(state.dists, state.weights, state.sum, t);};
 f.cfs = function() {return state.dists.slice(0);};
 f.weights = function() {return state.weights.slice(0);};
 return Object.freeze(f);
};

Since the results of CFs are complex numbers we represent them with our ak.complex type and so must use our overloaded ak arithmetic operators to calculate their weighted sum. Program 3 plots its results for the same pair of normal distributions, with the real part in black and the imaginary part in red.

Program 3: A Mixture CF
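Since every CF equals one at zero, being the expected value of \(e^{i \times 0 \times X}\), so must a mixture's. The following sketch demonstrates this, assuming our ak.normalCF constructor and that, as is the convention in the ak library, ak.complex values have re and im accessor methods:

// assumes ak.normalCF(mu, sigma) and the re/im accessors of ak.complex
var cf = ak.mixtureCF([ak.normalCF(-1, 0.5), ak.normalCF(2, 1)], [1, 2]);
var z = cf(0);

// should print one and zero
console.log(z.re(), z.im());
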

Making an observation of a mixture random variable takes a different form. Rather than calculating a weighted sum of the component random variables, we use a uniformly distributed random variable to choose which of them to observe. The ak.mixtureRnd implementation given in listing 5 does so by multiplying the sum of the weights by an observation of the standard uniform random variable rnd, which defaults to Math.random, and then iteratively subtracting the components' weights from it, making an observation of the first component for which the running result is less than its weight.

Listing 5: The Mixture Random Variable
function draw(rnds, weights, sum, rnd) {
 var n = rnds.length-1;
 var u = rnd()*sum;
 var i = 0;

 while(u>=weights[i] && i<n) u -= weights[i++];
 return rnds[i]();
}

ak.mixtureRnd = function(rnds, weights, rnd) {
 var state = constructor(rnds, weights);
 var t = ak.nativeType(rnd);
 var f;

 if(t===ak.UNDEFINED_T) rnd = Math.random;
 else if(t!==ak.FUNCTION_T) throw new Error('invalid rnd in ak.mixtureRnd');

 f = function() {return draw(state.dists, state.weights, state.sum, rnd);};
 f.rnds = function() {return state.dists.slice(0);};
 f.weights = function() {return state.weights.slice(0);};
 f.rnd = function() {return rnd;};
 return Object.freeze(f);
};

Program 4 plots a histogram of observations of a mixture random variable against a suitably scaled graph of its PDF.

Program 4: A Mixture Random Variable
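A bare-bones version of such a histogram might simply count the observations falling into unit-width buckets, as in this sketch which assumes our ak.normalRnd constructor:

// assumes ak.normalRnd(mu, sigma) from the ak library
var rnd = ak.mixtureRnd([ak.normalRnd(-2, 1), ak.normalRnd(3, 1)], [1, 1]);
var buckets = {};
var i, k;

for(i=0;i<100000;++i) {
 k = Math.floor(rnd());
 buckets[k] = (buckets[k]||0)+1;
}

// the counts should peak near minus two and plus three
console.log(buckets);
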

Unfortunately the inverse of a mixture CDF cannot be expressed in terms of its components' inverse CDFs and so we must resort to numerical approximation, which the ak.mixtureInvCDF function defined in listing 6 does by using our ak.secantInverse[2] to invert an ak.mixtureCDF function, adding the threshold argument for its optional convergence threshold.

Listing 6: The Mixture Inverse CDF
ak.mixtureInvCDF = function(cdfs, weights, threshold) {
 var cdf = ak.mixtureCDF(cdfs, weights);
 var f = ak.secantInverse(cdf, threshold);
 f.cdfs = cdf.cdfs;
 f.weights = cdf.weights;
 return Object.freeze(f);
};

Program 5 applies the CDF to the results of the inverse CDF and compares them to the given probabilities to demonstrate that the inverse is at least approximately correct.

Program 5: A Mixture Inverse CDF
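In spirit, that check amounts to something like the following sketch, which once again assumes our ak.normalCDF constructor:

// assumes ak.normalCDF(mu, sigma) from the ak library
var cdfs = [ak.normalCDF(-1, 0.5), ak.normalCDF(2, 1)];
var weights = [1, 2];
var cdf = ak.mixtureCDF(cdfs, weights);
var inv = ak.mixtureInvCDF(cdfs, weights);
var p;

// cdf(inv(p)) should recover p to within the convergence threshold
for(p=0.1;p<1;p+=0.2) console.log(p, cdf(inv(p)));
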

Generalising The Mixture Distribution

Now there is nothing in the definition of a mixture distribution that requires its components to be continuous and so the ak.mixturePMF function defined in listing 7 adds support for the PMFs of discrete distributions, ensuring that results are non-zero only for integer arguments.

Listing 7: The Mixture PMF
function pmf(pmfs, weights, sum, k) {
 var n = pmfs.length;
 var p = 0;
 var i;

 if(k!==ak.floor(k)) return isNaN(k) ? ak.NaN : 0;
 for(i=0;i<n;++i) p += weights[i] * pmfs[i](k);
 return Math.max(p/sum, 0);
}

ak.mixturePMF = function(pmfs, weights) {
 var state = constructor(pmfs, weights);
 var f = function(k) {return pmf(state.dists, state.weights, state.sum, k);};
 f.pmfs = function() {return state.dists.slice(0);};
 f.weights = function() {return state.weights.slice(0);};
 return Object.freeze(f);
};

Note that all but the ak.mixturePDF function are compatible with discrete distributions and so this is the only addition that we need to make to fully support mixtures of them. Mixtures of discrete and continuous distributions are only partially supported since neither the PDF nor the PMF yields meaningful results.
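We can consequently recover our dice game's PMF by writing the PMF of a fair \(n\)-sided die as a plain function, as done by the following sketch:

// the PMF of a fair n sided die, written for the sake of the example
function die(n) {
 return function(k) {return k>=1 && k<=n ? 1/n : 0;};
}

var pmf = ak.mixturePMF([die(20), die(12), die(10), die(8), die(6), die(4)],
                        [1, 1, 1, 1, 1, 1]);

// should print 93/720 = 0.12916... and 6/720 = 0.00833...
console.log(pmf(1), pmf(20));
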

Furthermore, there is no requirement that the distribution be univariate and so we might as well add support for the functions that we have previously provided for multivariate distributions. Firstly we have the complementary CDF, being the probability that every element of a vector observation will exceed those of the function's argument, which for a mixture distribution is once again simply a weighted sum, as implemented by ak.mixtureCompCDF in listing 8.

Listing 8: The Mixture Complementary CDF
function compCDF(compCDFs, weights, sum, x) {
 var n = compCDFs.length;
 var c = 0;
 var i;

 for(i=0;i<n;++i) c += weights[i] * compCDFs[i](x);
 return Math.min(Math.max(c/sum, 0), 1);
}

ak.mixtureCompCDF = function(compCDFs, weights) {
 var state = constructor(compCDFs, weights);
 var f = function(x) {return compCDF(state.dists, state.weights, state.sum, x);};
 f.compCDFs = function() {return state.dists.slice(0);};
 f.weights = function() {return state.weights.slice(0);};
 return Object.freeze(f);
};

Finally, lacking unique inverses for multivariate distributions, we previously provided functions to map vectors of values between zero and one to observations of their random variables. In the case of mixture distributions we need an additional value between zero and one to select which of the component distributions to map with, which ak.mixtureMap does with the argument u for the selection and v for the values to be mapped.

Listing 9: The Mixture Map
function map(maps, weights, sum, u, v) {
 var n = maps.length-1;
 var i = 0;

 u *= sum;
 while(u>=weights[i] && i<n) u -= weights[i++];
 return maps[i](v);
}

ak.mixtureMap = function(maps, weights) {
 var state = constructor(maps, weights);
 var f = function(u, v) {return map(state.dists, state.weights, state.sum, u, v);};
 f.maps = function() {return state.dists.slice(0);};
 f.weights = function() {return state.weights.slice(0);};
 return Object.freeze(f);
};
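To illustrate its use, the following sketch employs a pair of trivial univariate component maps, each being the inverse CDF of a uniform distribution on an interval, written as plain functions for the sake of the example:

// the inverse CDF of a uniform distribution on [a,b], standing in
// for a component map for the sake of the example
function uniformMap(a, b) {
 return function(v) {return a+(b-a)*v;};
}

var map = ak.mixtureMap([uniformMap(0, 1), uniformMap(5, 6)], [1, 3]);

// u=0.2 scales to 0.8, falling within the first weight and so selecting
// the first component, which maps v=0.5 to one half
console.log(map(0.2, 0.5));
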

And with that we are finished with our implementation of mixture distributions, apart from mentioning that their functions can be found in MixtureDistribution.js.

References

[1] All Your Basis Are Belong To Us, www.thusspakeak.com, 2020.

[2] Every Secant Counts, www.thusspakeak.com, 2014.
