Archimedean Skew

| 0 Comments

About a year and a half ago[1] we saw how we could use Gaussian copulas to define dependencies between the elements of a vector valued multivariate random variable whose elements, when considered in isolation, were governed by arbitrary cumulative distribution functions, known as marginals. Whilst Gaussian copulas are quite flexible, they can't represent every possible dependency between those elements and in this post we shall take a look at some others defined by the Archimedean family of copulas.

Recall that copulas are multivariate cumulative distribution functions, or CDFs, with standard uniform marginals and since the CDF of any continuous univariate distribution transforms observations of its random variable into observations of a standard uniform random variable, we can define a multivariate CDF \(F\) from a set of univariate CDFs \(F_i\) and a copula \(C\) with
\[ F\left(\mathbf{x}\right) = C\left(\mathbf{u}\right) \]
where the elements of the vector \(\mathbf{u}\) are related to those of the vector \(\mathbf{x}\) by
\[ u_i = F_i\left(x_i\right) \]
We defined the Gaussian copula \(C_\mathbf{P}\) by turning this relationship upon its head with
\[ \begin{align*} C_\mathbf{P}\left(\mathbf{u}\right) &= \Phi_\mathbf{P}\left(\mathbf{z}\right)\\ z_i &= \Phi^{-1}\left(u_i\right) \end{align*} \]
where \(\Phi_\mathbf{P}\) is a multivariate normal CDF having means of zero, standard deviations of one and a correlation matrix of \(\mathbf{P}\) and \(\Phi^{-1}\) is the inverse of the standard univariate normal CDF.

By definition, the CDF of a multivariate distribution measures the probability that each element of an observation of its random variable is less than or equal to the element of its argument at the same index. Specifically, for an \(n\) dimensional random variable \(X\) with a CDF \(F_X\)
\[ \begin{align*} x &\sim X\\ F_X(\mathbf{y}) &= \Pr\left(\bigwedge_{i=0}^{n-1} x_i \leqslant y_i\right) \end{align*} \]
where \(\sim\) means drawn from and \(\wedge\) means and so that it evaluates as true if the expression following it is true for every value of \(i\) from that below it to that above and as false otherwise.
This means that it must be the case that
\[ F_X(\mathbf{y}) - F_X(\mathbf{x}) \geqslant 0 \]
whenever
\[ \bigwedge_{i=0}^{n-1} x_i \leqslant y_i \]
since it represents that probability that an observation of \(X\) will lie within the volume bounded by \(\mathbf{x}\) and \(\mathbf{y}\) and so cannot be negative. Equivalently, its probability density function, or PDF, \(f_X\) cannot be negative and so if we differentiate the CDF with respect to every element of \(\mathbf{x}\) we have
\[ \frac{\mathrm{d}^n}{\mathrm{d} x_0 \dots \mathrm{d} x_{n-1}} F_X(\mathbf{y}) = f_X(\mathbf{y}) \geqslant 0 \]
Furthermore, it must always be the case that if any element of the CDF's argument is less than or equal to the minimum value of that element of the random variable then it must equal its minimum value of zero and if every element but one is greater than or equal to its maximum value then it will be equal to the value of that element's marginal CDF, equalling its maximum value of one when it too is greater than or equal to its maximum.
Copulas must therefore display these properties with standard uniform marginals and, in fact, any function that does so is a copula.

Archimedean Copulas

If a copula \(C\) satisfies the equation
\[ \phi\left(C\left(\mathbf{u}\right)\right) = \min\left(\sum_{i=0}^{n-1} \phi(u_i), \phi(0)\right) \]
where \(\sum\) is the summation sign and \(\phi\) is a continuous function, known as a generator, that maps the interval \([0,1]\) to \([0,\infty]\) and satisfies
\[ \phi(1) = 0\\ x > y \rightarrow \phi(x) < \phi(y)\\ (-1)^k \frac{\mathrm{d}^k}{\mathrm{d}t^k} \phi^{-1}(t) \geqslant 0 \]
for real \(x\) and \(y\) between zero and one, real non-negative \(t\) and integer \(k\) from zero to \(n\), where \(\rightarrow\) means implies, then it is known as an Archimedean copula which we can trivially define with
\[ C\left(\mathbf{u}\right) = \phi^{-1}\left(\min\left(\sum_{i=0}^{n-1} \phi(u_i), \phi(0)\right)\right) \]
as proven by derivation 1.

Derivation 1: Cφ Is A Copula
Firstly, from the properties of \(\phi\) it follows that both it and \(\phi^{-1}\) must be non-negative and strictly decreasing for arguments within \([0,1]\) and \([0,\phi(0)]\) respectively. From the first of these observations we may conclude that if any element of \(\mathbf{u}\) equals zero the sum in the definition of \(C\) must be greater than or equal to \(\phi(0)\) so that
\[ C\left(\mathbf{u}\right) = \phi^{-1}\left(\phi(0)\right) = 0 \]
and from the second that it must also be its minimum value.

Secondly, if every element but \(u_i\) equals one then every term in the sum except \(\phi\left(u_i\right)\) will equal zero so that
\[ C\left(\mathbf{u}\right) = \phi^{-1}\left(\phi\left(u_i\right)\right) = u_i \]
which will take a maximum value of one when \(u_i\) equals one.

Thirdly, if a multivariate function \(f\) is non-negative for every argument \(\mathbf{x}\) then its integral with respect to any element \(x_i\) from a lower bound \(a\) to an upper bound \(b\)
\[ g\left(\mathbf{x}^\prime\right) = \int_a^b f(\mathbf{x}) \mathrm{d}x_i \]
must also be everywhere non-negative, with the special cases of a bivariate \(f\) yielding an everywhere non-negative univariate \(g\) and a univariate \(f\) yielding a non-negative scalar.
The repeated integration of a multivariate PDF \(f_X\) over a subset of its elements from their lower bounds is identical to the derivative of its CDF \(F_X\) with respect to the elements that are not in that subset with those that are set to the upper bounds of their integrals so that for any set of indices \(I\)
\[ \frac{\mathrm{d}^{|I|}}{\prod_{i \in I} \mathrm{d} x_i} F_X(\mathbf{x}) \geqslant 0 \]
for all \(\mathbf{x}\), where \(|I|\) is the number of indices in \(I\), \(\prod\) is the product sign and \(\in\) means within.
If the sum in \(C\) equals or exceeds \(\phi(0)\) then it's constant and its derivative with respect to any element of \(\mathbf{u}\) is trivially equal to zero. If not, by the repeated application of the chain rule, we have
\[ \frac{\mathrm{d}^{|I|}}{\prod_{i \in I} \mathrm{d} u_i} C\left(\mathbf{u}\right) = \frac{\mathrm{d}^{|I|}}{\mathrm{d}t^{|I|}} \phi^{-1}\left(\sum_{i=0}^{n-1} \phi(u_i)\right) \times \prod_{i \in I} \frac{\mathrm{d}}{\mathrm{d}u} \phi(u_i) \]
Since \(\phi\) is strictly decreasing its derivative must be negative and so \(C\) will satisfy the requirements of a multivariate CDF if the \(k^\mathrm{th}\) derivative of \(\phi^{-1}\) is positive when \(k\) is even and negative when it's odd so that
\[ (-1)^k \frac{\mathrm{d}^k}{\mathrm{d}t^k} \phi^{-1}(t) \geqslant 0 \]
for \(k\) from one to \(n\).

Finally, noting that the zeroth derivative of a function is the function itself and that \(\phi^{-1}\) maps \(\left[0,\phi(0)\right]\) to \([0,1]\), it must also be true when \(k\) equals zero.

Now I must confess that the conditions that the generator \(\phi\) must satisfy as stated above are, in fact, slightly stricter than are absolutely necessary, but for simplicity's sake we'll stick with them.

Implementation Details

Given the simplicity of the formula for Archimedean copulas it should come as no surprise that the implementation of ak.archimedeanCopula, given in listing 1, is relatively trivial.

Listing 1: ak.archimedeanCopula
ak.archimedeanCopula = function(n, p, q) {
 var p0, f;

 if(ak.nativeType(n)!==ak.NUMBER_T || !isFinite(n) || n!==ak.floor(n) || n<0) {
  throw new Error('invalid dimension in ak.archimedeanCopula');
 }

 if(ak.nativeType(p)!==ak.FUNCTION_T) {
  throw new Error('invalid generator in ak.archimedeanCopula');
 }

 if(ak.nativeType(q)!==ak.FUNCTION_T) {
  throw new Error('invalid generator inverse in ak.archimedeanCopula');
 }

 p0 = p(0);

 f = function(u) {
  var t = 0;
  var i;

  if(ak.type(u)!==ak.VECTOR_T || u.dims()!==n) {
   throw new Error('invalid argument in ak.ArchimedeanCopula');
  }

  for(i=0;i<n;++i) t += p(Math.max(Math.min(u.at(i), 1), 0));
  return t>=p0 ? 0 : Math.max(Math.min(q(t), 1), 0);
 };

 f.dims = function() {return n;};
 f.p = function() {return p;};
 f.q = function() {return q;};

 return Object.freeze(f);
};

This is in no small part due to the fact that we're requiring that the user provides valid implementations of both the generator \(\phi\) and its inverse \(\phi^{-1}\) with the functions p and q respectively. We do at least check that they are both functions but leave it to the user to ensure that q is the inverse of p and that they both satisfy the properties described above, resorting to the programmer's Get Out of Jail Free card of undefined behaviour if they fail to do so.
The function that it returns, after adding methods to return the dimension, the generator and its inverse, first checks that its argument u is an n dimensional ak.vector[2] and then simply calculates the result of the formula for an Archimedean copula as is also described above, restricting both the elements of u and the result of the calculation to the range zero to one and taking the slight shortcut of noting that \(\phi^{-1}(\phi(0))\) must equal zero.

To demonstrate ak.archimedeanCopula we need a valid generator and inverse and program 1 uses
\[ \begin{align*} \phi_\theta(u) &= (-\ln u)^\theta\\ \phi_\theta^{-1}(t) &= e^{-t^{\frac{1}{\theta}}} \end{align*} \]
which I must ask you to trust me will yield a copula whenever \(\theta\) is greater than or equal to one.

Program 1: ak.archimedeanCopula

The density \(c\) of a copula \(C\) is equal to its derivative with respect to every element of its argument \(\mathbf{u}\), which for an Archimedean copula will yield
\[ c(\mathbf{u}) = \begin{cases} \frac{\mathrm{d}^n}{\mathrm{d}t^n} \phi^{-1}\left(\sum_{i=0}^{n-1} \phi(u_i)\right) \times \prod_{i=0}^{n-1} \frac{\mathrm{d}}{\mathrm{d}u} \phi(u_i) & \sum_{i=0}^{n-1} \phi(u_i) < \phi(0)\\ 0 & \mathrm{otherwise} \end{cases} \]
by the same repeated application of the chain rule of differentiation that we used in derivation 1.

From an implementation perspective we shall require the user to supply functions to calculate \(\phi\), its derivative and the \(n^\mathrm{th}\) derivative of its inverse, leaving it to them to make sure that they satisfy their required properties.
In listing 2, the ak.archimedeanCopulaDensity function simply checks that the number of dimensions n is a finite, non-negative integer and that they, passed as p, dp and dnq respectively, are functions and returns a function that uses them to calculate the density, restricting values to their valid ranges.

Listing 2: ak.archimedeanCopulaDensity
ak.archimedeanCopulaDensity = function(n, p, dp, dnq) {
 var p0, f;

 if(ak.nativeType(n)!==ak.NUMBER_T || !isFinite(n) || n!==ak.floor(n) || n<0) {
  throw new Error('invalid dimension in ak.archimedeanCopulaDensity');
 }

 if(ak.nativeType(p)!==ak.FUNCTION_T) {
  throw new Error('invalid generator in ak.archimedeanCopulaDensity');
 }

 if(ak.nativeType(dp)!==ak.FUNCTION_T) {
  throw new Error('invalid generator derivative in ak.archimedeanCopulaDensity');
 }

 if(ak.nativeType(dnq)!==ak.FUNCTION_T) {
  throw new Error('invalid generator inverse derivative in ak.archimedeanCopulaDensity');
 }

 p0 = p(0);

 f = function(u) {
  var d = 1;
  var s = 0;
  var i, ui, dq;

  if(ak.type(u)!==ak.VECTOR_T || u.dims()!==n) {
   throw new Error('invalid argument in ak.archimedeanCopulaDensity');
  }

  for(i=0;i<n;++i) {
   ui = Math.max(Math.min(u.at(i), 1), 0);
   d *= dp(ui);
   s += p(ui);
  }
  if(s>=p0) return 0;
  dq = dnq(s, n);
  if(ak.type(dq)===ak.SURREAL_T) dq = dq.deriv(n);
  return Math.max(dq*d, 0);
 };

 f.dims = function() {return n;};
 f.p = function() {return p;};
 f.dp = function() {return dp;};
 f.dnq = function() {return dnq;};

 return Object.freeze(f);
};

As before, the simplicity of this translation of the formula into code rests upon the user doing all of the hard work. In particular, the dnq function to calculate the \(n^\mathrm{th}\) derivative of the inverse of the generator can be tricky to get right. To make things easier for them we're allowing the dnq function to use our ak.surreal[3] type to calculate the derivatives automatically using our system of overloading arithmetic operations[4], as exploited by program 2.

Program 2: ak.archimedeanCopulaDensity

That implementing the generator for an Archimedean copula is a relatively simple job but implementing the \(n^\mathrm{th}\) derivative of the generator's inverse for its density is far more complicated stands in stark contrast to Gaussian copulas for which the densities are trivial to implement but the copulas themselves require sophisticated numerical approximation algorithms.
This inversion of complexity becomes even more apparent when it comes to drawing random vectors from Archimedean copulas, as we shall see in the next post.

References

[1] Copulating Normally, www.thusspakeak.com, 2017.

[2] What's Our Vector, Victor?, www.thusspakeak.com, 2014.

[3] You're Going To Have To Think! Why Automatic Differentiation Won't Cure Your Calculus Blues, www.thusspakeak.com, 2014.

[4] Climbing Mount Impractical, www.thusspakeak.com, 2013.

Leave a comment