Will They Blend?

| 0 Comments

Last time[1] we saw how we can create new random variables from sets of random variables with given probabilities of observation. To make an observation of such a random variable we randomly select one of its components, according to their probabilities, and make an observation of it. Furthermore, their associated probability density functions, or PDFs, cumulative distribution functions, or CDFs, and characteristic functions, or CFs, are simply sums of the component functions weighted by their probabilities of observation.
Now there is nothing about such distributions, known as mixture distributions, that requires that the components are univariate. Given that copulas are simply multivariate distributions with standard uniformly distributed marginals, being the distributions of each element considered independently of the others, we can use the same technique to create new copulas too.

Compound Copulas

For this to be true we need three properties to be satisfied. Firstly, we need the resulting copula's density \(c\) to be non-negative for all vector arguments \(\mathbf{u}\) with no elements less than zero or greater than one. Secondly, if any element of \(\mathbf{u}\) is zero then the value of the copula \(C\) must be zero. Finally, if all of the elements of \(\mathbf{u}\) except \(u_j\) equal one then we require the value of the copula to equal \(u_j\).
Derivation 1 shows that these properties hold for mixture copulas.

Derivation 1: Mixture Copulas Are Copulas
For the first property we require that
\[ c(\mathbf{u}) = \sum_{i=0}^{n-1} p_i \times c_i(\mathbf{u}) \geqslant 0 \]
where \(c_i\) and \(p_i\) are the component densities and their associated probabilities, and \(\sum\) is the summation sign, which is trivially satisfied since both the component densities and their associated probabilities are non-negative.
For the second, we simply note that if any element of \(\mathbf{u}\) is zero then the values of the component copulas \(C_i\) must all equal zero.
For the third we have
\[ C(\mathbf{u}) = \sum_{i=0}^{n-1} p_i \times C_i(\mathbf{u}) = \sum_{i=0}^{n-1} p_i \times u_j = u_j \times \sum_{i=0}^{n-1} p_i = u_j \times 1 \]

For example, program 1 plots the results of an ak.mixturePDF using the ak.gaussianCopulaDensity[2] and ak.joeCopulaDensity[3] functions.

Program 1: A Mixture Copula Density

This represents a dependence that is usually well behaved and fairly negatively correlated but twenty percent of the time breaks down, becoming strongly positively correlated with a tendency towards the extremes, as illustrated by program 2 which plots observations of its random variable and those of its marginals to the left and beneath.

Program 2: A Mixture Copula Random Variable

We might try constructing a copula with the product of a pair of copulas. For example, the bivariate copula
\[ C\left(u_0, u_1\right) = C_0\left(u_0, u_1\right) \times C_1\left(u_0, u_1\right) \]
trivially satisfies the requirement that it equal zero when either argument is zero. Furthermore, its density is non-negative
\[ c\left(u_0, u_1\right) = \frac{\partial^2 C\left(u_0, u_1\right)}{\partial u_0 \, \partial u_1} \geqslant 0 \]
where \(\partial\) is the partial derivative, as proven in derivation 2.

Derivation 2: The Density Is Non-Negative
Firstly, by the product rule, we have
\[ \begin{align*} \frac{\partial C\left(u_0, u_1\right)}{\partial u_0} &= \frac{\partial C_0\left(u_0, u_1\right)}{\partial u_0} \times C_1\left(u_0, u_1\right) + C_0\left(u_0, u_1\right) \times \frac{\partial C_1\left(u_0, u_1\right)}{\partial u_0}\\ \frac{\partial^2 C\left(u_0, u_1\right)}{\partial u_0 \, \partial u_1} &= \frac{\partial^2 C_0\left(u_0, u_1\right)}{\partial u_0 \, \partial c_1} \times C_1\left(u_0, u_1\right) + \frac{\partial C_0\left(u_0, u_1\right)}{\partial u_0} \times \frac{\partial C_1\left(u_0, u_1\right)}{\partial c_1}\\ &+ \frac{\partial C_0\left(u_0, u_1\right)}{\partial c_1} \times \frac{\partial C_1\left(u_0, u_1\right)}{\partial u_0} + C_0\left(u_0, u_1\right) \times \frac{\partial^2 C_1\left(u_0, u_1\right)}{\partial u_0 \, \partial u_1} \end{align*} \]
The component copulas are non-negative, as are their densities and if the first partial derivative of \(C_0\) with respect to \(u_0\) were not for given values of \(u_0\) and \(u_1\) then, for some positive \(\delta_0\), we'd have
\[ \Pr\left(u_0 < U_0 \leqslant u_0+\delta_0 \,|\, U_1 = u_1\right) = C_0\left(u_0+\delta_0, u_1\right) - C_0\left(u_0, u_1\right) <0 \]
where the vertical bar means given and \(U_0\) and \(U_1\) are the first and second elements of the random variable. This cannot be true if \(C_0\) is a CDF and so, by the same argument, all of the first partial derivative terms must be non-negative.

Unfortunately, it doesn't have standard uniform marginals since
\[ C\left(u_0, 1\right) = C_0\left(u_0, 1\right) \times C_1\left(u_0, 1\right) = u_0^2 \]
If take the square root we'll recover the uniform marginals whilst preserving the value of zero when either argument is zero. Unfortunately, we can no longer guarantee that the density is non-negative, as shown by derivation 3.

Derivation 3: The Density Might Be Negative
By the chain rule, the density of the square root is given
\[ \begin{align*} \frac{\partial C^\frac12\left(u_0, u_1\right)}{\partial u_0} &= \tfrac12 \times C^{-\frac12}\left(u_0, u_1\right) \times \frac{\partial C\left(u_0, u_1\right)}{\partial u_0}\\ \frac{\partial^2 C^\frac12\left(u_0, u_1\right)}{\partial u_0 \, \partial u_1} = &-\tfrac14 \times C^{-\frac32}\left(u_0, u_1\right) \times \frac{\partial C\left(u_0, u_1\right)}{\partial u_1} \times \frac{\partial C\left(u_0, u_1\right)}{\partial u_0}\\ &+ \tfrac12 \times C^{-\frac12}\left(u_0, u_1\right) \times \frac{\partial^2 C\left(u_0, u_1\right)}{\partial u_0 \, \partial u_1} \end{align*} \]
which may be negative since the first term is non-positive.

If we instead define
\[ C^\prime\left(u_0, u_1\right) = C\left(u_0^\frac12, u_1^\frac12\right) = C_0\left(u_0^\frac12, u_1^\frac12\right) \times C_1\left(u_0^\frac12, u_1^\frac12\right) \]
then we trivially have
\[ \begin{align*} C^\prime\left(u_0, 0\right) &= C_0\left(u_0^\frac12, 0\right) \times C_1\left(u_0^\frac12, 0\right) = 0 \times 0\\ C^\prime\left(u_0, 1\right) &= C_0\left(u_0^\frac12, 1\right) \times C_1\left(u_0^\frac12, 1\right) = u_0^\frac12 \times u_0^\frac12 \end{align*} \]
Finally, as proven by derivation 4, we have
\[ c^\prime\left(u_0, u_1\right) = \frac{\partial^2 C^\prime\left(u_0, u_1\right)}{\partial u_0 \, \partial u_1} \geqslant 0 \]
and so \(C^\prime\) is a copula!

Derivation 4: The Density Is Positive Again!
Applying the chain rule again yields
\[ \begin{align*} \frac{\partial C^\prime\left(u_0, u_1\right)}{\partial u_0} &= \frac{\partial C\left(u_0^\frac12, u_1^\frac12\right)}{\partial u_0} = \tfrac12 \times u_0^{-\frac12} \times \frac{\partial C\left(u_0^\frac12, u_1^\frac12\right)}{\partial u_0^\frac12}\\ \frac{\partial^2 C^\prime\left(u_0, u_1\right)}{\partial u_0 \, \partial u_1} &= \tfrac14 \times u_0^{-\frac12} \times u_1^{-\frac12} \times \frac{\partial^2 C\left(u_0^\frac12, u_1^\frac12\right)}{\partial u_0^\frac12 \, \partial u_1^\frac12} \end{align*} \]
Since the density of \(C\) and the square roots of \(u_0\) and \(u_1\) are all non-negative so is the density of \(C^\prime\).

Note that we don't have to restrict ourselves to square roots but can instead define \(C^\prime\) as
\[ C^\prime\left(u_0, u_1\right) = C_0\left(u_0^{a_0}, u_1^{a_1}\right) \times C_1\left(u_0^{1-a_0}, u_1^{1-a_1}\right) \]
for \(a_0\) and \(a_1\) between zero and one, with the boundary conditions clearly being met and the density condition following by symmetry from the non-negativity of the first and second partial derivatives of \(C_0\)
\[ \begin{align*} \frac{\partial C_0\left(u_0^{a_0}, u_1^{a_1}\right)}{\partial u_0} &= a_0 \times u_0^{a_0-1} \times \frac{\partial C_0\left(u_0^{a_0}, u_1^{a_1}\right)}{\partial u_0^{a_0}}\\ \frac{\partial C_0\left(u_0^{a_0}, u_1^{a_1}\right)}{\partial u_0 \, \partial u_1} &= a_0 \times u_0^{a_0-1} \times a_1 \times u_1^{a_1-1} \times \frac{\partial C_0\left(u_0^{a_0}, u_1^{a_1}\right)}{\partial u_0^{a_0} \, \partial u_1^{a_1}} \end{align*} \]
Unfortunately we can't express the density \(c^\prime\) in terms of the component densities \(c_0^\prime\) and \(c_1^\prime\) and so program 3 plots it using a symmetric finite difference approximation for the same Gaussian and Gumbel copulas that we used in programs 1 and 2.

Program 3: The Density Of The Product

A Blend Of Multivariate Copulas

We can easily extend this to arbitrary numbers of dimensions with
\[ C^\prime\left(u_0, u_1, \dots, u_{n-1}\right) = \prod_{i=0}^{k-1} C_i\left(u_0^{\alpha_{i,0}}, u_1^{\alpha_{i,1}}, \dots, u_{n-1}^{\alpha_{i,n-1}}\right) \]
where \(\prod\) is the product sign, and
\[ \begin{align*} 0 \leqslant \alpha_{i,j} &\leqslant 1\\ \sum_{i=0}^{k-1} \alpha_{i,j} &= 1 \end{align*} \]
for \(j\) from zero to \(n-1\).
If any of the \(\alpha_{i,j}\) equal zero then we are effectively marginalising the \(i^\mathrm{th}\) copula to those elements for which they aren't, meaning that we can use products of copulas with differing numbers of arguments such as
\[ C^\prime\left(u_0, u_1, u_2\right) = C_0\left(u_0^{\alpha_{0,0}}, u_1^{\alpha_{0,1}}, u_2^{\alpha_{0,2}}\right) \times C_1\left(u_0^{\alpha_{1,0}}, u_1^{\alpha_{1,1}}\right) \times C_2\left(u_0^{\alpha_{2,0}}, u_2^{\alpha_{2,2}}\right) \times C_3\left(u_1^{\alpha_{3,1}}, u_2^{\alpha_{3,2}}\right) \]
provided that
\[ \begin{align*} \alpha_{0,0} + \alpha_{1,0} + \alpha_{2,0} &= 1\\ \alpha_{0,1} + \alpha_{1, 1} + \alpha_{3, 1} &= 1\\ \alpha_{0,2} + \alpha_{2, 2} + \alpha_{3, 2} &= 1 \end{align*} \]
Note that this is a generalisation of the Product of Bivariate Copulas, or PBC, copula[4].

Since each component copula has an associated set of argument indices and weights it's convenient to bind them together in an object, which we will do with the ak.blendCopulaElement type defined in listing 1.

Listing 1: A Blend Copula Element
ak.BLEND_COPULA_ELEMENT_T = 'ak.blendCopulaElement';

function BlendCopulaElement(){}
BlendCopulaElement.prototype = {TYPE: ak.BLEND_COPULA_ELEMENT_T,
                                valueOf: function(){return ak.NaN;}};

ak.blendCopulaElement = function(copula, ids, weights) {
 var e = new BlendCopulaElement();
 var n, sorted, i, max;

 if(ak.nativeType(copula)!==ak.FUNCTION_T) {
  throw new Error('invalid copula in ak.blendCopulaElement');
 }
 if(ak.nativeType(ids)!==ak.ARRAY_T) {
  throw new Error('invalid argument ids in ak.blendCopulaElement');
 }
 if(ak.nativeType(weights)!==ak.ARRAY_T) {
  throw new Error('invalid argument weights in ak.blendCopulaElement');
 }

 n = ids.length;
 if(weights.length!==n) {
  throw new Error('argument ids/weights size mismatch in ak.blendCopulaElement');
 }
 for(i=0;i<n;++i) {
  if(!(ids[i]>=0 && ids[i]!==ak.INFINITY && ak.floor(ids[i])===ids[i])) {
   throw new Error('invalid argument id in ak.blendCopulaElement');
  }
  if(!(ak.nativeType(weights[i])===ak.NUMBER_T && weights[i]>0
                                               && weights[i]!==ak.INFINITY)) {
   throw new Error('invalid argument weight in ak.blendCopulaElement');
  }
 }

 sorted = ids.slice(0);
 sorted.sort(function(x, y){return x-y;});
 for(i=1;i<n && sorted[i]!==sorted[i-1];++i);
 if(i<n) throw new Error('duplicate argument ids in ak.blendCopulaElement');
 max = n>0 ? sorted[n-1] : -1;

 ids = ids.slice(0);
 weights = weights.slice(0);

 e.copula  = function() {return copula;};
 e.args    = function() {return ids.length;};
 e.max     = function() {return max;};
 e.id      = function(i) {return ids[i];};
 e.weight  = function(i) {return weights[i];};
 return Object.freeze(e);
};

The greater part of its implementation is checking that the copula, its argument ids and weights are valid. Specifically that the former is a function and that the latter pair are arrays of finite, positive numbers. The ids are also required to be integers, which we check by comparing their type and value to their floor. We then check that the ids are unique by sorting a copy of them and comparing adjacent elements, saving their maximum.
Finally, we add access methods for the copula, the number of args, and finally for each id and weight by index, after the max of the former.

We shall initialise our blend copula functions by passing an array of ak.blendCopulaElement objects to the constructor function defined in listing 2.

Listing 2: Initialising Blend Copula Functions
function constructor(elements) {
 var max = -1;
 var n, i, j, element, sum;

 if(ak.nativeType(elements)!==ak.ARRAY_T) {
  throw new Error('invalid elements in ak.blendCopula');
 }
 n = elements.length;

 for(i=0;i<n;++i) {
  element = elements[i];
  if(ak.type(element)!==ak.BLEND_COPULA_ELEMENT_T) {
   throw new Error('invalid element in ak.blendCopula');
  }
  max = Math.max(max, element.max());
 }
 ++max;
 sum = new Array(max);
 for(i=0;i<max;++i) sum[i] = 0;

 for(i=0;i<n;++i) {
  element = elements[i];
  for(j=0;j<element.args();++j) sum[element.id(j)] += element.weight(j);
 }
 for(i=0;i<max;++i) if(sum[i]===0 || !isFinite(sum[i])) {
  throw new Error('invalid total weight in ak.blendCopula');
 }

 return {
  elements: elements.slice(0),
  sum: sum
 };
}

This first checks that it is the correct type, calculating the maximum argument index referenced by any of the elements as it does so. It then calculates the sum of the weights for each argument index, which we shall use to normalise them to valid powers to raise the elements at those indices to, throwing an error if any of them are zero or not finite.

The first of these functions is ak.blendCopula, defined in listing 3.

Listing 3: The Blend Copula
function copulaTerm(element, sum, u) {
 var n = element.args();
 var v = new Array(n);
 var i, j;

 for(i=0;i<n;++i) {
  j = element.id(i);
  v[i] = Math.pow(Math.max(0, Math.min(1, u[j])), element.weight(i)/sum[j]);
 }
 return element.copula()(ak.vector(v));
}

function copula(elements, sum, u) {
 var n = elements.length;
 var c = 1;
 var i;

 for(i=0;i<n;++i) c *= copulaTerm(elements[i], sum, u);
 return c;
}

ak.blendCopula = function(elements) {
 var state = constructor(elements);
 var f = function(u) {
  if(ak.type(u)!==ak.VECTOR_T) {
   throw new Error('invalid argument in ak.blendCopula');
  }
  if(u.dims()!==state.sum.length) {
   throw new Error('argument size mismatch in ak.blendCopula');
  }
  return copula(state.elements, state.sum, u.toArray());
 };
 f.elements = function() {return state.elements.slice(0);};
 return Object.freeze(f);
};

Here the copulaTerm helper function calculates the value of a component copula by constructing an ak.vector from its referenced elements of u raised to their normalised weights. The copula function calculates their product and is called by the function returned from ak.blendCopula after checking that its argument is an ak.vector of the correct length, to which is added an accessor method for the elements.
Program 4 uses this to calculate the values of a blend of a three dimensional Gaussian copula and a pair of two dimensional Gumbel copulas at each of a four by four by four grid of points.

Program 4: A Blend Copula

I don't expect that you'll want to verify these result manually, but you can at least check for yourself that the boundary conditions are satisfied and that
\[ u^\prime_0 \geqslant u_0 \, \wedge \, u^\prime_1 \geqslant u_1 \, \wedge \, u^\prime_2 \geqslant u_2 \rightarrow C\left(u^\prime_0, u^\prime_1, u^\prime_2\right) \geqslant C\left(u_0, u_1, u_2\right) \]
where \(\wedge\) means and and \(\rightarrow\) means implies, which must be the case if \(C\) is a copula.

Secondly, and lastly, we come to the density of blend copulas which, as noted above, we must resort to approximating using finite differences. The ak.blendCopulaDensity function defined in listing 4 uses the density helper function to recursively calculate the \(n\) dimensional symmetric finite difference.

Listing 4: The Blend Copula Density
function density(elements, sum, delta, u, n) {
 var un = u[n];
 var u0 = un - delta;
 var u1 = un + delta;
 var f0, f1;

 u[n] = u0;
 f0 = n===0 ? copula(elements, sum, u) : density(elements, sum, delta, u, n-1);
 u[n] = u1;
 f1 = n===0 ? copula(elements, sum, u) : density(elements, sum, delta, u, n-1);
 u[n] = un;

 return Math.max(0, (f1-f0)/(u1-u0));
}

ak.blendCopulaDensity = function(elements, delta) {
 var state = constructor(elements);
 var n = state.sum.length;
 var f;

 if(ak.nativeType(delta)===ak.UNDEFINED_T) {
  delta = 0.5*Math.pow(n*ak.EPSILON, 1/(n+2));
 }
 else if(ak.nativeType(delta)!==ak.NUMBER_T) {
  throw new Error('non-numeric delta in ak.blendCopulaDensity');
 }
 if(!(delta>0)) throw new Error('non-positive delta in ak.blendCopulaDensity');

 f = function(u) {
  if(ak.type(u)!==ak.VECTOR_T) {
   throw new Error('invalid argument in ak.blendCopulaDensity');
  }
  if(u.dims()!==n) {
   throw new Error('argument size mismatch in ak.blendCopulaDensity');
  }
  return density(state.elements, state.sum, delta, u.toArray(), n-1);
 };
 f.elements = function() {return state.elements.slice(0);};
 return Object.freeze(f);
};

The returned function is much the same as that returned by ak.blendCopula but this time we also need a reasonable value for the difference \(\delta\). If defined by the user we require it to be a positive number, otherwise giving it a default value of
\[ \delta = \tfrac12 \times (n \times \epsilon)^{\dfrac1{n+2}} \]
where \(\epsilon\) is the floating point epsilon, being the difference between one and the smallest floating point number that is greater than one, and \(n\) is the number of dimensions, which is justified by derivation 5.

Derivation 5: The Default Difference
Defining
\[ f_n(\mathbf{x}) = \begin{cases} \dfrac{\partial f(\mathbf{x})}{\partial x_0} & n = 0\\ \dfrac{\partial f_{n-1}(\mathbf{x})}{\partial x_n} & \text{otherwise} \end{cases} \]
and
\[ \Delta_n f(\mathbf{x}) = \begin{cases} \dfrac{f(\mathbf{x + \delta \times \mathbf{v}_0}) - f(\mathbf{x - \delta \times \mathbf{v}_0})}{2 \times \delta} & n = 0\\ \dfrac{\Delta_{n-1} f(\mathbf{x + \delta \times \mathbf{v}_n}) - \Delta_{n-1}f(\mathbf{x - \delta \times \mathbf{v}_n})}{2 \times \delta} & \text{otherwise} \end{cases} \]
where \(\mathbf{v}_n\) is a vector with every element apart from the \(n^\mathrm{th}\) equal to zero and the \(n^\mathrm{th}\) equal to one.
The numerical errors in \(\Delta_n\) accumulate multiplicatively and its truncated Taylor series approximation differences from \(f_n\) accumulate additively and so, by the argument that we made for univariate symmetric finite differences[5], we have
\[ \begin{align*} \Delta_n f(\mathbf{x}) - f_n(\mathbf{x}) &\approx a \times \left(\frac{\left(1 \pm \tfrac12 \times \epsilon\right)^{2 \times n} - 1}{(2 \times \delta)^n}\right) + b \times n \times \delta^2\\ &\approx a \times \left(\frac{1 \pm n \times \epsilon - 1}{(2 \times \delta)^n}\right) + b \times n \times \delta^2 \end{align*} \]
for some \(a\) and \(b\). A not unreasonable choice for \(\delta\) is therefore one that minimises
\[ \frac{n \times \epsilon}{(2\times\delta)^n} + n \times \delta^2 \]
Taking the derivative with respect to \(\delta\) and setting it to zero yields
\[ \begin{align*} 0 &= -\frac{n^2}{2^n} \times \epsilon \times \delta^{-n-1} + 2 \times n \times \delta\\ \delta^{n+2} &= \frac{n \times \epsilon}{2^{n+1}}\\ \delta &\approx \tfrac12 \times (n \times \epsilon)^{\dfrac1{n+2}} \end{align*} \]

Program 5 demonstrates its use for the same blend copula and grid of points as program 4.

Program 5: A Blend Copula Density

It should be noted that this approximation of the density is not particularly accurate when elements of its argument are very close to zero or one and the true value is large, and also that it will universally lose accuracy as the number of dimensions increases. That admitted, you can find these functions in BlendCopula.js.

References

[1] Mixing It Up, www.thusspakeak.com, 2021.

[2] Copulating Normally, www.thusspakeak.com, 2017.

[3] Archimedean Crew, www.thusspakeak.com, 2019.

[4] Mazo, G., Girard, S & Forbes, F., A class of multivariate copulas based on products of bivariate copulas, Journal of Multivariate Analysis, Vol. 140, 2015.

[5] You’re Going To Have To Think! Why Finite Differences Won’t Cure Your Calculus Blues, www.thusspakeak.com, 2014.

Leave a comment