Last time^{[1]} we took a look at how we could define copulas to represent the dependency between random variables by summing the results of a generator function \(\phi\) applied to the results of their cumulative distribution functions, or CDFs, and then applying the inverse of that function \(\phi^{1}\) to that sum.
These are known as Archimedean copulas and are valid whenever \(\phi\) is strictly decreasing over the interval \([0,1]\), equal to zero when its argument equals one and have \(n^\mathrm{th}\) derivatives that are nonnegative over that interval when \(n\) is even and nonpositive when it is odd, for \(n\) up to the number of random variables.
Whilst such copulas are relatively easy to implement we saw that their densities are a rather trickier job, in contrast to Gaussian copulas where the reverse is true^{[2]}. In this post we shall see how to draw random vectors from Archimedean copulas which is also much more difficult than doing so from Gaussian copulas.
Recall that copulas are effectively multivariate distributions whose vector observations' elements, when considered in isolation, have standard uniform distributions. We can transform its observations into those of a general multivariate distribution by applying the inverse of each element's desired standalone, or marginal, CDF to their elements, which works because any continuous random variable is transformed into a standard uniform random variable by its CDF.
For a Gaussian copula this means that we can make an \(n\) dimensional observation \(\mathbf{x}\) with
As described above, \(n\) dimensional Archimedean copulas are defined by
Their densities are defined by their derivatives with respect to every element of their vector arguments which, by the repeated application of the chain rule, equal
Now it's not entirely obvious how we might use these definitions to make observations of Archimedean copulas, but fortunately there's a rather neat trick that we can use^{[5]}.
Since the derivative of an integral with respect to its lower bound is equal to minus one times the function being integrated, known as the integrand, at the value of that lower bound, we have
The integral
Note that this is exploiting the fact that the \(n^\mathrm{th}\) coefficient of a
Since the \(i^\mathrm{th}\) coefficient of our
Now we can trivially make observations of \(v^\prime_0\) by applying this inverse to observations of a standard uniform random variable.
Now, since \(f^\prime\left(\mathbf{v}^\prime\right)\) doesn't depend upon \(v^\prime_1\) to \(v^\prime_{n1}\) it must be the case that their joint conditional PDF for any given value of \(v^\prime_0\) is constant within the set \(V^\ast\) and that they must therefore be uniformly distributed over it. Furthermore, given that \(v^\prime_i\) equals \(v_i\) for nonzero \(i\), the latter must be uniformly distributed within their simplex.
We can uniformly sample such simplices by making \(n1\) independent observations of a uniform random variable with a lower bound of zero and an upper bound of \(v^\prime_0\), labelling the \(i^\mathrm{th}\) largest with \(x_i\), counting from \(i\) equal to zero, and defining
Program 5 first plots independent pairs of standard uniform random values generated by
Now I must admit that this isn't the most efficient or numerically stable implementation of this algorithm and so when we come to add it to the
[2] Copulating Normally, www.thusspakeak.com, 2017.
[3] You've Got Class, Mr Cholesky!, www.thusspakeak.com, 2015.
[4] Define Normal, www.thusspakeak.com, 2014.
[5] Whelan, N., Sampling from Archimedean Copulas, Quantitative Finance, Vol. 4, No. 3, Taylor and Francis, 2004.
[6] You're Going To Have To Think! Why Automatic Differentiation Won't Cure Your Calculus Blues, www.thusspakeak.com, 2014.
[7] On The Shoulders Of Gradients, www.thusspakeak.com, 2014.
These are known as Archimedean copulas and are valid whenever \(\phi\) is strictly decreasing over the interval \([0,1]\), equal to zero when its argument equals one and have \(n^\mathrm{th}\) derivatives that are nonnegative over that interval when \(n\) is even and nonpositive when it is odd, for \(n\) up to the number of random variables.
Whilst such copulas are relatively easy to implement we saw that their densities are a rather trickier job, in contrast to Gaussian copulas where the reverse is true^{[2]}. In this post we shall see how to draw random vectors from Archimedean copulas which is also much more difficult than doing so from Gaussian copulas.
Recall that copulas are effectively multivariate distributions whose vector observations' elements, when considered in isolation, have standard uniform distributions. We can transform its observations into those of a general multivariate distribution by applying the inverse of each element's desired standalone, or marginal, CDF to their elements, which works because any continuous random variable is transformed into a standard uniform random variable by its CDF.
For a Gaussian copula this means that we can make an \(n\) dimensional observation \(\mathbf{x}\) with
\[
\begin{align*}
z_j &\sim N(0,1)\\
z^\prime_i &= \sum_{j=0}^{n1} \mathrm{P}^\frac12_{i,j} \times z_j\\
u_i &= \Phi_{0,1}\left(z^\prime_i\right)\\
x_i &= F^{1}_i\left(u_i\right)
\end{align*}
\]
where \(\sim\) means drawn from, \(N(0,1)\) is the standard normal random variable, \(\sum\) is the summation sign, \(\mathbf{P}^\frac12\) is the matrix square root^{[3]} of a correlation matrix, \(\Phi_{0,1}\) is the standard normal CDF and \(F^{1}_i\) is the inverse of the \(i^\mathrm{th}\) marginal CDF. Whilst the normal CDF can't be expressed algebraically, we already have an accurate numerical approximation of it with ak.normalCDF
^{[4]} so there's nothing particularly troublesome about this process.As described above, \(n\) dimensional Archimedean copulas are defined by
\[
C\left(\mathbf{u}\right) = \phi^{1}\left(\min\left(\sum_{i=0}^{n1} \phi(u_i), \phi(0)\right)\right)
\]
for a generator function \(\phi\) that satisfies
\[
\phi : [0,1] \mapsto [0,\infty]\\
\phi(1) = 0\\
x > y \rightarrow \phi(x) < \phi(y)\\
(1)^k \frac{\mathrm{d}^k}{\mathrm{d}t^k} \phi^{1}(t) \geqslant 0
\]
where \(\mapsto\) means maps to and \(\rightarrow\) means implies, for \(k\) less than or equal to \(n\), albeit acknowledging that these conditions are slightly stricter than necessary.Their densities are defined by their derivatives with respect to every element of their vector arguments which, by the repeated application of the chain rule, equal
\[
c(\mathbf{u}) = \frac{\mathrm{d}^n}{\mathrm{d}u_0 \dots \mathrm{d}u_{n1}} C(\mathbf{u}) =
\begin{cases}
\frac{\mathrm{d}^n}{\mathrm{d}t^n} \phi^{1}\left(\sum_{i=0}^{n1} \phi(u_i)\right) \times \prod_{i=0}^{n1} \frac{\mathrm{d}}{\mathrm{d}u} \phi(u_i) & \sum_{i=0}^{n1} \phi(u_i) < \phi(0)\\
0 & \mathrm{otherwise}
\end{cases}
\]
where \(\prod\) is the product sign.Now it's not entirely obvious how we might use these definitions to make observations of Archimedean copulas, but fortunately there's a rather neat trick that we can use^{[5]}.
Step The First
We can express \(C\) as a function of a vector \(\mathbf{v}\) with elements \(v_i=\phi\left(u_i\right)\)
\[
C\left(\mathbf{u}\right) = \bar{F}\left(\mathbf{v}\right) = \phi^{1}\left(\min\left(\sum_{i=0}^{n1} v_i, \phi(0)\right)\right)
\]
Since \(\phi\) is strictly decreasing over \([0,1]\), \(\bar{F}\) is the complementary CDF of the distribution of \(\mathbf{v}\), defined as
\[
\bar{F}\left(\mathbf{v}\right)
= \Pr \left(\bigwedge_{i=0}^{n1} V_i > v_i \right)
= \int_{v_0}^\infty \dots \int_{v_{n1}}^\infty f \left(\mathbf{v}\right) \, \mathrm{d}v_0 \dots \mathrm{d}v_{n1}
\]
where \(\mathbf{V}\) is an observation of that distribution, \(f\) is its probability density function, or PDF, and \(\wedge\) means and which evaluates as true if the expression following it is true for every value of the index defined and initialised beneath it up to and including the value above it.Since the derivative of an integral with respect to its lower bound is equal to minus one times the function being integrated, known as the integrand, at the value of that lower bound, we have
\[
f \left(\mathbf{v}\right)
= (1)^n \times \frac{\mathrm{d}^n}{\mathrm{d}v_0 \dots \mathrm{d}v_{n1}} \bar{F}\left(\mathbf{v}\right)
=
\begin{cases}
(1)^n \times \frac{\mathrm{d}^n}{\mathrm{d}t^n} \phi^{1}\left(\sum_{i=0}^{n1} v_i\right) & \sum_{i=0}^{n1} v_i < \phi(0)\\
0 & \mathrm{otherwise}
\end{cases}
\]
Step The Second
If we define a new random vector \(\mathbf{v}^\prime\) with elements
\[
v_i^\prime =
\begin{cases}
\sum_{j=0}^{n1} v_j & i=0\\
v_i & \mathrm{otherwise}
\end{cases}
\]
then it will have the PDF
\[
f^\prime \left(\mathbf{v^\prime}\right)
=
\begin{cases}
(1)^n \times \frac{\mathrm{d}^n}{\mathrm{d}t^n} \phi^{1}\left(v^\prime_0\right) & v^\prime_0 < \phi(0) \wedge \sum_{i=1}^{n1} v^\prime_i \leqslant v^\prime_0\\
0 & \mathrm{otherwise}
\end{cases}
\]
as proven in derivation 1.
Derivation 1: The PDF Of v
We can transform \(\mathbf{v}^\prime\) to \(\mathbf{v}\) with
If this transformation maps a set of points \(V^\prime\) to a set \(V\) then it must be the case that
Finally, if
\[
v_i =
\begin{cases}
v^\prime_0  \sum_{j=1}^{n1} v^\prime_j & i=0\\
v^\prime_i & \mathrm{otherwise}
\end{cases}
\]
This transformation's Jacobian matrix of partial derivatives \(\mathbf{J}\) has the elements
\[
J_{ij} = \frac{\partial v_i}{\partial v^\prime_j} =
\begin{cases}
1 & i=j\\
1 & i=0 \wedge j>0\\
0 & \mathrm{otherwise}
\end{cases}
\]
and is therefore upper triangular, having elements equal to zero whenever \(i\) is greater than \(j\), so that its determinant \(\mathbf{J}\) is the product of the elements on its leading diagonal, where \(i\) equals \(j\), which is trivially equal to one.If this transformation maps a set of points \(V^\prime\) to a set \(V\) then it must be the case that
\[
\Pr\left(\mathbf{v}^\prime \in V^\prime\right) = \Pr\left(\mathbf{v} \in V\right)
\]
where \(\in\) means within, and consequently
\[
\int_{V^\prime} f^\prime\left(\mathbf{v}^\prime\right) \mathrm{d}\mathbf{v}^\prime
= \int_V f\left(\mathbf{v}\right) \mathrm{d}\mathbf{v}
= \int_{V^\prime} f\left(\mathbf{v}\right) \biggl\mathbf{J}\biggr\mathrm{d}\mathbf{v}^\prime
= \int_{V^\prime} f\left(\mathbf{v}\right) \mathrm{d}\mathbf{v}^\prime
\]
by the rule of integration by substitution, implying that \(f^\prime\left(\mathbf{v}^\prime\right)\) equals \(f\left(\mathbf{v}\right)\).Finally, if
\[
\sum_{j=1}^{n1} v^\prime_j > v^\prime_0
\]
then \(v_0\) will be negative which, being outside of its valid range, means that \(f\left(\mathbf{v}\right)\) will equal zero and so
\[
\begin{align*}
f^\prime \left(\mathbf{v^\prime}\right)
&=
\begin{cases}
(1)^n \times \frac{\mathrm{d}^n}{\mathrm{d}t^n} \phi^{1}\left(\sum_{i=0}^{n1} v_i\right) & \sum_{i=0}^{n1} v_i < \phi(0) \wedge \sum_{i=1}^{n1} v^\prime_i \leqslant v^\prime_0\\
0 & \mathrm{otherwise}
\end{cases}\\
&=
\begin{cases}
(1)^n \times \frac{\mathrm{d}^n}{\mathrm{d}t^n} \phi^{1}\left(v^\prime_0\right) & v^\prime_0 < \phi(0) \wedge \sum_{i=1}^{n1} v^\prime_i \leqslant v^\prime_0\\
0 & \mathrm{otherwise}
\end{cases}
\end{align*}
\]
Step The Third
We can calculate the marginal PDF of \(v^\prime_0\) by integrating \(f^\prime\) over \(v^\prime_1\) to \(v^\prime_{n1}\) within a set of points \(V^\ast\) defined by
\[
\mathbf{v}^\prime \in V^\ast \rightarrow \sum_{i=1}^{n1} v^\prime_i \leqslant v^\prime_0
\]
to yield
\[
\begin{align*}
f^\ast\left(v^\prime_0\right)
&= \int_{V^\ast} f^\prime(\mathbf{v}^\prime) \, \mathrm{d}v^\prime_1 \dots \mathrm{d}v^\prime_{n1}\\
&=
\begin{cases}
\int_{V^\ast} (1)^n \times \frac{\mathrm{d}^n}{\mathrm{d}t^n}\phi^{1}\left(v^\prime_0\right) \, \mathrm{d}v^\prime_1 \dots \mathrm{d}v^\prime_{n1} & v^\prime_0 < \phi(0)\\
0 & \mathrm{otherwise}
\end{cases}\\
&=
\begin{cases}
(1)^n \times \frac{\mathrm{d}^n}{\mathrm{d}t^n}\phi^{1}\left(v^\prime_0\right) \times \int_{V^\ast} 1 \, \mathrm{d}v^\prime_1 \dots \mathrm{d}v^\prime_{n1} & v^\prime_0 < \phi(0)\\
0 & \mathrm{otherwise}
\end{cases}
\end{align*}
\]
since any expression that doesn't depend upon variables being integrated over can be treated as a constant and moved outside of an integral.The integral
\[
\int_{V^\ast} 1 \, \mathrm{d}v^\prime_1 \dots \mathrm{d}v^\prime_{n1}
\]
is simply the volume of a multidimensional triangle, or simplex, with vertices at the origin and at \(v^\prime_0\) on each of the axes of \(v^\prime_1\) to \(v^\prime_{n1}\) which we can calculate with
\[
\int_0^{v^\prime_0} \int_0^{v^\prime_1} \dots \int_0^{v^\prime_n2} 1 \, \mathrm{d}v^\prime_1 \dots \mathrm{d}v^\prime_{n1}
= \frac{{v^\prime_0}^{n1}}{(n1)!}
\]
where the exclamation mark stands for the factorial, which is the product of every integer from one to the value to its left, and the PDF of the marginal distribution of \(v^\prime_0\) is therefore
\[
f^\ast\left(v^\prime_0\right)
=
\begin{cases}
(1)^n \times \frac{{v^\prime_0}^{n1}}{(n1)!} \times \frac{\mathrm{d}^n}{\mathrm{d}t^n}\phi^{1}\left(v^\prime_0\right) & v^\prime_0 < \phi(0)\\
0 & \mathrm{otherwise}
\end{cases}
\]
Program 1 plots this PDF for the generator
\[
\phi_\theta(u) = (\ln u)^\theta\\
\]
which is valid for \(\theta\) greater than or equal to one and has an inverse of
\[
\phi_\theta^{1}(t) = e^{t^{\frac{1}{\theta}}}
\]
using our ak.surreal
type^{[6]} to automatically calculate its derivatives.
Program 1: The Marginal PDF



Note that this is exploiting the fact that the \(n^\mathrm{th}\) coefficient of a
surreal
number is equal to the \(n^\mathrm{th}\) derivative divided by \(n!\) to slightly simplify the calculation with the equivalence
\[
\frac{{v^\prime_0}^{n1}}{(n1)!} \times \frac{\mathrm{d}^n}{\mathrm{d}t^n}\phi^{1}\left(v^\prime_0\right)
=
{v^\prime_0}^{n1} \times n \times \frac{\frac{\mathrm{d}^n}{\mathrm{d}t^n}\phi^{1}\left(v^\prime_0\right)}{n!}
\]
for \(n\) greater than zero.
Step The Fourth
To find the CDF of \(v^\prime_0\) we simply need to integrate its PDF, which results in
\[
F^\ast\left(v^\prime_0\right)
= \int_0^{v^\prime_0} f^\ast(t) \, \mathrm{d}t
=
\begin{cases}
0 & v^\prime_0 \leqslant 0\\
\sum_{i=1}^n \frac{(1)^{n+i1}}{(ni)!} \times \left[t^{ni} \frac{\mathrm{d}^{ni}}{\mathrm{d}t^{ni}}\phi^{1}(t)\right]_0^{{v^\prime_0}} & 0 < v^\prime_0 < \phi(0)\\
1 & \phi(0) \leqslant v^\prime_0
\end{cases}
\]
as proven by derivation 2.
Derivation 2: The CDF Of v
To integrate a function of the form
\[
f_n(x) = x^{n1} \times \frac{\mathrm{d}^n}{\mathrm{d}x^n} g(x)
\]
we can employ integration by parts to yield
\[
\begin{align*}
\int_a^b f_n(x) \, \mathrm{d}x
&= \left[x^{n1} \times \frac{\mathrm{d}^{n1}}{\mathrm{d}x^{n1}} g(x)\right]_a^b  (n1) \times \int_a^b x^{n2} \times \frac{\mathrm{d}^{n1}}{\mathrm{d}x^{n1}} g(x) \, \mathrm{d}x\\
&= \left[x^{n1} \times \frac{\mathrm{d}^{n1}}{\mathrm{d}x^{n1}} g(x)\right]_a^b  (n1) \times \int_a^b f_{n1}(x) \, \mathrm{d}x
\end{align*}
\]
We also have
\[
\int_a^b f_1(x) \, \mathrm{d}x
= \int_a^b x^0 \times \frac{\mathrm{d}}{\mathrm{d}x}g(x) \, \mathrm{d}x
= \int_a^b \frac{\mathrm{d}}{\mathrm{d}x}g(x) \, \mathrm{d}x
= \left[g(x)\right]_a^b
\]
and so by induction
\[
\begin{align*}
\int_a^b f_n(x) \, \mathrm{d}x
&= \sum_{i=1}^n (1)^{i1} \times \frac{(n1)!}{((n1)(i1))!} \times \left[x^{ni} \frac{\mathrm{d}^{ni}}{\mathrm{d}x^{ni}}g(x)\right]_a^b\\
&= \sum_{i=1}^n (1)^{i1} \times \frac{(n1)!}{(ni)!} \times \left[x^{ni} \frac{\mathrm{d}^{ni}}{\mathrm{d}x^{ni}}g(x)\right]_a^b
\end{align*}
\]
from which the formula for the CDF trivially follows.
Since the \(i^\mathrm{th}\) coefficient of our
ak.surreal
type is the \(i^\mathrm{th}\) derivative divided by \(i!\) we can use it to calculate every such term in the CDF's sum at once, as is done in program 2 which also uses the facts that \(2n1\) is definitely an odd number, \(0!\) equals one and
\[
\left[t^0 \frac{\mathrm{d}^0}{\mathrm{d}t^0}\phi^{1}(t)\right]_0^{v^\prime_0}
= \left[\phi^{1}(t)\right]_0^{v^\prime_0}
= \phi^{1}(v^\prime_0)  \phi^{1}(0)
= \phi^{1}(v^\prime_0)  1
\]
to calculate its final term.
Program 2: The Marginal CDF



Step The Fifth
The next thing that we need to do is calculate the CDF's inverse \({F^\ast}^{1}\) which for
\[
u^\prime_0 = F^\ast\left(v^\prime_0\right)
\]
equals
\[
{F^\ast}^{1}\left(u^\prime_0\right) = v^\prime_0
\]
Unfortunately there's no general formula for this inverse and so we're going to have to resort to numerical trickery. Since the PDF of \(v^\prime_0\) is trivially equal to the derivative of its CDF and we know how to calculate both, the obvious approach is to use Newton's method^{[7]} to search for a value of \(v\) that is approximately equal to the solution of the equation
\[
F^\ast\left(v\right)  u^\prime_0 = 0
\]
by iteratively updating an initial guess of \(v_0\) with
\[
v_{i+1} = v_i  \frac{F^\ast\left(v_i\right)  u^\prime_0}{f^\ast\left(v_i\right)}
\]
until we reach an index \(n\) at which \(F^\ast\left(v_n\right)\) is sufficiently close to \(u^\prime_0\), as demonstrated by program 3.
Program 3: The Marginal Inverse CDF



Now we can trivially make observations of \(v^\prime_0\) by applying this inverse to observations of a standard uniform random variable.
Step The Sixth
After we've made an observation of \(v^\prime_0\) we need to randomly divide it up amongst the \(v_j\) so that
\[
\sum_{j=0}^{n1} v_j = v^\prime_0
\]
or, in other words, that
\[
\begin{align*}
\sum_{j=1}^{n1} v_j &\leqslant v^\prime_0\\
v_0 &= v^\prime_0  \sum_{j=1}^{n1} v_j
\end{align*}
\]
Just as we noted in the third step for \(v^\prime_1\) to \(v^\prime_{n1}\), this means that \(v_1\) to \(v_{n1}\) must lie within an \(n1\) dimensional simplex with vertices at the origin and at \(v^\prime_0\) on each of their axes.Now, since \(f^\prime\left(\mathbf{v}^\prime\right)\) doesn't depend upon \(v^\prime_1\) to \(v^\prime_{n1}\) it must be the case that their joint conditional PDF for any given value of \(v^\prime_0\) is constant within the set \(V^\ast\) and that they must therefore be uniformly distributed over it. Furthermore, given that \(v^\prime_i\) equals \(v_i\) for nonzero \(i\), the latter must be uniformly distributed within their simplex.
We can uniformly sample such simplices by making \(n1\) independent observations of a uniform random variable with a lower bound of zero and an upper bound of \(v^\prime_0\), labelling the \(i^\mathrm{th}\) largest with \(x_i\), counting from \(i\) equal to zero, and defining
\[
v_i =
\begin{cases}
x_i  x_{i1} & 1 \leqslant i < n1\\
v^\prime_0  x_{i1} & i = n1
\end{cases}
\]
Program 4 demonstrates that this scheme uniformly fills a triangle with vertices at \((0,0)\), \((0,1)\) and \((1,0)\).
Program 4: Uniformly Sampling A Triangle



Program 5 first plots independent pairs of standard uniform random values generated by
Math.random
with the tenth of each plotted beneath and to the left of the graph and then does the same for pairs of values drawn from a two dimensional Archimedean copula with the generator specified above, noting that dividing an observation of \(v^\prime_0\) in two dimensions simplifies to
\[
\begin{align*}
u &\sim U(0,1)\\
v_1 &= u \times v^\prime_0\\
v_0 &= v^\prime_0  v_1
\end{align*}
\]
where \(U(0,1)\) is the standard uniform distribution.
Program 5: Sampling An Archimedean Copula



Now I must admit that this isn't the most efficient or numerically stable implementation of this algorithm and so when we come to add it to the
ak
library in the next post we shall do so with an eye to fixing these deficiencies.
References
[1] Archimedean Skew, www.thusspakeak.com, 2018.[2] Copulating Normally, www.thusspakeak.com, 2017.
[3] You've Got Class, Mr Cholesky!, www.thusspakeak.com, 2015.
[4] Define Normal, www.thusspakeak.com, 2014.
[5] Whelan, N., Sampling from Archimedean Copulas, Quantitative Finance, Vol. 4, No. 3, Taylor and Francis, 2004.
[6] You're Going To Have To Think! Why Automatic Differentiation Won't Cure Your Calculus Blues, www.thusspakeak.com, 2014.
[7] On The Shoulders Of Gradients, www.thusspakeak.com, 2014.
Leave a comment