A Measure Of Borel Weight

| 0 Comments

In the last few posts[1][2] we have implemented a type to represent Borel sets of the real numbers, which are the subsets of them that can be created with countable unions of intervals with closed or open lower and upper bounds. Whilst I would argue that doing so was a worthwhile exercise in its own right, you may be forgiven for wondering what Borel sets are actually for and so in this post I shall try to justify the effort that we have spent on them.

The Outer Limits

For discrete sets, such as those that we have represented with sorted arrays[3] , we can define their sizes as the number of elements that they contain. Borel sets typically contain an infinite number of elements and so this definition isn't particularly useful. Instead, we can use the outer measure which can be formally defined for a Borel set \(A\) as
\[ m^\ast(A) = \inf Z_A \]
where \(\inf\) is the infinum of the set that follows it, being its smallest member or, more accurately, the largest number that is no greater any of them, and
\[ Z_A = \left\{\sum_{i=1}^\infty \left(b_i-a_i\right) : a_i \leqslant b_i \wedge A \subseteq \bigcup_{i=1}^\infty \left(a_i,b_i\right)\right\} \]
where \(\sum\) is the summation sign, \(\wedge\) means and, \(\subseteq\) means is a subset of and \(\bigcup\) stands for the union of the open intervals that follow it for every value of \(i\) from that beneath it to that above. Note that the terms between the curly braces are a form of set notation that stand for every possible value of the expression on the left of the colon that satisfies the condition on its right and that this condition can trivially represent the union of a finite number of \(n\) non-empty intervals by setting \(a_i\) equal to \(b_i\) for all \(i\) greater than \(n\).

Derivation 1: The Closed Limit Of Open Intervals
Firstly, for all positive \(n\), we have
\[ a \in \left(a-\frac1n,a+\frac1n\right) \]
where \(\in\) means within. Secondly, we have
\[ \begin{align*} &\forall\epsilon>0 , \forall n>\frac1\epsilon , a-\epsilon < a-\frac1n\\ &\forall\epsilon>0 , \forall n>\frac1\epsilon , a+\epsilon > a+\frac1n \end{align*} \]
where \(\forall\) is the universal quantifier, meaning for all. Finally, by the formal definition of limits this means that for any non-zero \(\epsilon\)
\[ a\pm\epsilon \notin \lim_{n\to\infty}\left(a-\frac1n,a+\frac1n\right) \]
where \(\notin\) means not within, and consequently
\[ \lim_{n\to\infty}\left(a-\frac1n,a+\frac1n\right) = [a,a] \]
as required.

Derivation 2: The Infinum Is Disjoint
Given the numbers \(a < b < c < d\) we have
\[ \begin{align*} &(a, c) \cup (b, d) = (b, c) \cup (a, d) = (a, d)\\ &(a, c) \cup (a, d) = (b, d) \cup (a, d) = (a, d) \end{align*} \]
where \(\cup\) is the union operator, and
\[ \begin{align*} (c-a) + (d-b) &= (c-b) + (d-a) > (d-a)\\ (c-a) + (d-a) &> (d-a)\\ (d-b) + (d-a) &> (d-a) \end{align*} \]
We can consequently reduce the value of the sum defining a member of \(Z_A\) whilst satisfying its condition by replacing any such pair of overlapping intervals with \((a, d)\). Its infinum cannot, therefore, include any overlapping non-empty intervals.
What this essentially means is that the measure of \(A\) is equal to the sum of the widths of the set of open intervals that most closely covers its elements. Now it may not be entirely clear what this means for Borel sets that contain intervals with closed bounds but if we note that we can represent a closed interval containing a single number as the limit of a sequence of open intervals
\[ \lim_{n\to\infty} \left(a-\frac1n,a+\frac1n\right) = \left[a,a\right] \]
as proven in derivation 1, then we can close an open bound at \(a\) with the limit of each of the unions of it with these open intervals. Since they all satisfy the condition of the set that we used to define the outer measure, we can consider these limits to be members of it too and so the measure also applies to Borel sets containing intervals with closed bounds.
Now the widths of these closed intervals of single numbers are zero since
\[ \lim_{n\to\infty} \left(a+\frac1n\right)-\left(a-\frac1n\right) = \lim_{n\to\infty} \frac2n = 0 \]
and so we can safely say that the open interval that most closely covers an interval with a closed bound is the one whose bounds have the same values, despite it not entirely doing so. Note that, since this holds for all finite values of \(a\), we can use another limit argument to show that it holds for infinite values of \(a\) as well.

We can also assert that the intervals defining the infinum of \(Z_A\) are disjoint, meaning that
\[ i \neq j \rightarrow (a_i, b_i) \cap (a_j, b_j) = \varnothing \]
where \(\rightarrow\) means implies, \(\cap\) is the intersection operator and \(\varnothing\) is the empty set, as proven in derivation 2.
Furthermore, since empty intervals don't affect the value of the outer measure, we can ignore them when calculating its value too.

In fact, we have already implemented the interval width and the set outer measure with ak.abs overloads for our ak.borelInterval and ak.borelSet types, given in listing 1 and listing 2 respectively.

Listing 1: The Interval Width
function abs(x) {
 return x.ub().value()-x.lb().value();
}

ak.overload(ak.abs, ak.BOREL_INTERVAL_T, abs);

Listing 2: The Set Outer Measure
function abs(x) {
 var n = x.intervals();
 var l = 0;
 var i, xi;

 for(i=0;i<n;++i) {
  xi = x.at(i);
  l += xi.ub().value()-xi.lb().value();
 }
 return l;
}

ak.overload(ak.abs, ak.BOREL_SET_T, abs);

Note that since we are already representing Borel sets with minimal unions of intervals we simply need to add up their widths to calculate the outer measure.
Program 1 demonstrates these overloads for a set of intervals and some Borel sets of unions of them.

Program 1: The ak.abs Overloads

Derivation 3: Subadditivity
Given Borel sets \(A\) and \(B\), let \(A^\ast\) and \(B^\ast\) be covering sets that yield the infinums of \(Z_A\) and \(Z_B\) and let us represent them as
\[ \begin{align*} A^\ast &= I_A \cup X_A\\ B^\ast &= I_B \cup X_B \end{align*} \]
where \(I_A\) and \(I_B\) are countable sets of disjoint open intervals and \(X_A\) and \(X_B\) are countable sets of numbers representing any closed bounds in \(A\) and \(B\) not contained within the former respectively.
It must therefore be the case that \(A^\ast \cup B^\ast\) covers \(A \cup B\) and so
\[ m^\ast\left(A^\ast\right) + m^\ast\left(B^\ast\right) \in Z_{A \cup B} \]
which means that
\[ m^\ast(A \cup B) = \inf Z_{A \cup B} \leqslant m^\ast\left(A^\ast\right) + m^\ast\left(B^\ast\right) \]
This demonstrates one of the basic properties of the outer measure, namely that it is subadditive
\[ m^\ast\left(A \cup B\right) \leqslant m^\ast(A) + m^\ast(B) \]
as proven in derivation 3.
Another, which follows by pretty much the same argument, is that the outer measure of the union of disjoint sets is the sum of their outer measures
\[ A \cap B = \varnothing \rightarrow m^\ast\left(A \cup B\right) = m^\ast(A) + m^\ast(B) \]
from which it also follows that
\[ \begin{align*} m^\ast\left(A \cup B\right) &= m^\ast\left(A - B\right) + m^\ast(B)\\ &= m^\ast\left(B - A\right) + m^\ast(A)\\ &= m^\ast\left(A \ominus B\right) + m^\ast(A \cap B) \end{align*} \]
where the minus sign represents the set difference operator, which yields the set of elements in the left hand term that are not in the right hand term, and \(\ominus\) is the set symmetric difference operator, yielding the set of elements that are in one or the other but not both.

Now the outer measure of every Borel set is non-negative and is only equal to zero for the empty set or countable unions of closed intervals whose upper and lower bounds are equal. Such sets are described as having zero measure and are an important special case.
For example, let us repeatedly remove the open middle thirds of every interval in a Borel set that starts with a single closed interval from zero to one, as demonstrated by program 2 which plots the first six such sets, \(C_0\) to \(C_5\).

Program 2: Removing The Middle Thirds

Derivation 4: The Elements Of Cantor's Set
Firstly, note that in base three we have
\[ \begin{align*} 0 &= 0.0 = 0.0\dot{0} & \tfrac13 &= 0.1 = 0.0\dot{2}\\ \tfrac23 &= 0.2 = 0.2\dot{0} & 1 & = 1.0 = 0.2\dot{2} \end{align*} \]
where the dots mark recurring digits, and so \(C_1\) contains all numbers that can be represented as a base three positional fraction with a zero or a two for the first digit after the point.
Similarly, we have
\[ \begin{align*} 0 &= 0.00\dot{0} & \tfrac19 &= 0.00\dot{2} & \tfrac29 &= 0.02\dot{0} & \tfrac39 &= 0.02\dot{2}\\ \tfrac69 &= 0.20\dot{0} & \tfrac79 &= 0.20\dot{2} & \tfrac89 &= 0.22\dot{0} & 1 &= 0.22\dot{2} \end{align*} \]
and so \(C_2\) contains those numbers having a zero or a two for the first two digits after the point.
Continuing in the same fashion, we find that \(C_n\) contains the numbers whose first \(n\) digits after the point are all zero or two and so \(\mathcal{C}\) must contain every number that only has zeros and twos after the point.
Finally, if we replace the twos with ones and interpret them as binary fractions then we'll have every number between zero and one or, in other words, the elements of \(C_0\).
Each \(C_n\) includes \(2^n\) disjoint closed intervals of width \(\frac1{3^n}\) and so has an outer measure of
\[ m^\ast\left(C_n\right) = 2^n \times \frac1{3^n} = \left(\frac23\right)^n \]
After infinitely many steps we are left with the Cantor set \(\mathcal{C}\), which has an outer measure of
\[ \begin{align*} m^\ast(\mathcal{C}) &= \lim_{n\to\infty} m^\ast\left(C_n\right)\\ &= \lim_{n\to\infty} \left(\frac23\right)^n\\ &= 0 \end{align*} \]
Now this is trivially not the empty set since it must include both zero and one, but what's surprising about it is that it has exactly as many elements as \(C_0\), as proven by derivation 4 which puts them into a one to one correspondence with each other.

Probability Measures

If we have a function \(F_X\) that has the following properties of the outer measure for all Borel sets \(A\) and \(B\)
\[ \begin{align*} F_X(A) &\geqslant 0\\ F_X(\varnothing) &= 0\\ F_X(A \cup B) &\leqslant F_X(A) + F_X(B)\\ A \cap B &= \varnothing \rightarrow F_X(A \cup B) = F_X(A) + F_X(B) \end{align*} \]
and also satisfies
\[ \begin{align*} F_X(A) &\leqslant 1\\ F_X([-\infty,\infty]) &= 1 \end{align*} \]
then we say that it is a probability measure since it yields the probability that an observation of a random variable \(X\) will fall within the Borel set that is its argument. If the cumulative distribution function \(C_X\) of that random variable, which returns the probability that observations of it will be less than or equal to some particular value, is continuous then it will also be the case that
\[ F_X([x, x]) = m^\ast([x, x]) = 0 \]
for any number \(x\). In such cases we can easily define \(F_X\) in terms of \(C_X\) with
\[ F_X(A) = \sum_{i=1}^\infty C_X\left(b_i\right) - C_X\left(a_i\right) \]
where \(\left(a_i,b_i\right)\) are disjoint open intervals that include every element of \(A\) except those at a countable number of closed bounds and
\[ \sum_{i=1}^\infty b_i-a_i = m^\ast(A) \]
or, in other words, that
\[ \bigcup_{i=1}^\infty \left(a_i, b_i\right) \]
is a union of open intervals that most closely covers \(A\), as proven in derivation 5.

Derivation 5: fX Is A Probability Measure
Firstly, we have
\[ \begin{align*} C_X(-\infty) &= \Pr(X \leqslant -\infty) = 0\\ C_X(+\infty) &= \Pr(X \leqslant +\infty) = 1 \end{align*} \]
and so
\[ \begin{align*} F_X\left([-\infty, \infty]\right) &= F_X\left([-\infty, -\infty] \cup (-\infty, \infty) \cup [\infty, \infty]\right)\\ &= F_X\left([-\infty, -\infty]\right) + F_X\left((-\infty, \infty)\right) + F_X\left([\infty, \infty]\right)\\ &= 0 + \left(C_X(\infty) - C_X(-\infty)\right) + 0 = 1-0 = 1 \end{align*} \]
Next, we have
\[ a < b \rightarrow C_X(a) \leqslant C_X(b) \rightarrow C_X(b) - C_X(a) \geqslant 0 \]
from which it follows that \(F_X(A) \geqslant 0\) for any Borel set \(A\).

Now, for any pair of Borel sets \(A\) and \(B\), \(A-B\) and \(B\) are trivially disjoint and therefore
\[ B \subseteq A \rightarrow (A-B) \cup B = A \rightarrow F_X(A-B) + F_X(B) = F_X(A) \rightarrow F_X(B) \leqslant F_X(A) \]
In particular, this means that
\[ F_X(A) \leqslant F_X([-\infty, \infty]) = 1 \]
Finally, if they overlap and we define
\[ \begin{align*} C &= A \cap B\\ A^\prime &= A-C\\ B^\prime &= B-C\\ \end{align*} \]
then \(A^\prime\), \(B^\prime\) and \(C\) are disjoint and
\[ \begin{align*} A &= A^\prime \cup C\\ B &= B^\prime \cup C\\ A \cup B &= (A^\prime \cup C) \cup (B^\prime \cup C) = A^\prime \cup B^\prime \cup C\\ \\ F_X(A) &= F_X\left(A^\prime\right) + F_X(C)\\ F_X(B) &= F_X\left(B^\prime\right) + F_X(C)\\ F_X(A \cup B) &= F_X\left(A^\prime\right) + F_X\left(B^\prime\right) + F_X(C) \leqslant F_X\left(A^\prime\right) + F_X\left(B^\prime\right) + 2 \times F_X(C) = F_X(A) + F_X(B) \end{align*} \]
and, furthermore, if \(A\) and \(B\) are disjoint so that \(C\) is the empty set and \(F_X(C)\) equals zero
\[ \begin{align*} F_X(A) &= F_X\left(A^\prime\right)\\ F_X(B) &= F_X\left(B^\prime\right)\\ F_X(A \cup B) &= F_X\left(A^\prime\right) + F_X\left(B^\prime\right) = F_X(A) + F_X(B) \end{align*} \]
as required.

We can turn this idea upon its head and state that a function \(C_X\) is the cumulative distribution function of a continuous random variable \(X\) if and only if
\[ C_X(x) = F_X\left([-\infty, x]\right) \]
for a continuous probability measure \(F_X\).

This is, in fact, a foundational concept in probability theory but a thorough exploration of it is beyond the scope of this post. Instead, we shall get on with the business of programming and implement a function to calculate measures of ak.borelSet objects.

ak.borelMeasure

After the subtlety of the definitions of measures of Borel sets, the implementation of the ak.borelMeasure to calculate them is rather mundane, simply iterating over the ak.borelInterval objects that define them and adding up the results of a function of their lower and upper bounds, as shown by listing 3.

Listing 3: ak.borelMeasure
ak.borelMeasure = function(s, f) {
 var m = 0;
 var n, i, l, u;

 if(ak.type(s)!==ak.BOREL_SET_T) {
  throw new Error('invalid set in ak.borelMeasure');
 }

 if(ak.nativeType(f)!==ak.FUNCTION_T) {
  throw new Error('invalid measure in ak.borelMeasure');
 }

 n = s.intervals();
 while(n-->0) {
  i = s.at(n);
  l = i.lb(); l = l.closed() ? l.value() : ak.nextAfter(l.value(),  ak.INFINITY);
  u = i.ub(); u = u.closed() ? u.value() : ak.nextAfter(u.value(), -ak.INFINITY);
  m += f(l, u);
 }
 return m;
};

A notable departure that this takes from the mathematical definition of a measure is that it recognises the fact that floating point numbers are not the same as real numbers and that an open bound is equivalent to a closed bound at the next representable number that is enclosed by it, as determined by our ak.nextAfter function.
Program 3 demonstrates the difference this makes for the outer measure and the probability measure of the standard normal distribution, implemented using our ak.normalCDF function, when we change the bounds of the intervals in a Borel set from closed to open.

Program 3: Using ak.borelMeasure

Mathematically, this is arguably a step in the wrong direction since it unnecessarily adds numerical errors to the results. Numerically, however, it can be rather useful to distinguish between open and closed bounds when measuring Borel sets.

Having Trouble Integrating?

We can trivially express the width of an open interval \((a,b)\) as the integral
\[ \int_a^b 1 \, \mathrm{d}x = [x]_a^b = b - a \]
and consequently the outer measure as
\[ m^\ast(A) = \sum_{i=1}^\infty \int_{a_i}^{b_i} 1 \, \mathrm{d}x \]
where \(\left(a_i,b_i\right)\) are a minimal set of disjoint open intervals that cover \(A\). Similarly, if the probability density function of a random variable \(X\) is \(p_X\) then its probability measure can be expressed as
\[ F_X(A) = \sum_{i=1}^\infty \int_{a_i}^{b_i} p_X(x) \, \mathrm{d}x = \sum_{i=1}^\infty C_X\left(b_i\right) - C_X\left(a_i\right) \]
It is rather tempting, therefore, to implement a function to compute the integral of a function over the intervals representing a Borel set.

The chief advantage that our treatment of open bounds brings to such integrals is that it allows us to remove problematic points from their calculation. In particular, numerical integration algorithms do not tend to handle discontinuities very well, as demonstrated by program 4 which attempts to use our ak.rombergIntegral function to integrate
\[ f_a(x) = \begin{cases} 0.5 & x \leqslant a\\ 1.0 & x > a \end{cases} \]
which trivially has a discontinuity at \(a\).

Program 4: The Trouble With Discontinuities

Infinite bounds also cause problems for numerical integration algorithms since any attempt to find a point that is some fraction of the way into an infinite range will yield infinity. We could try using ak.nextAfter to convert closed infinite bounds into open finite ones, but as it happens we can do so more effectively with a change of variables.
For example, we can solve the integral
\[ \int_1^\infty f(x) \mathrm{d}x \]
with the substitution
\[ \begin{align*} y &= \frac1x\\ \frac{\mathrm{d}y}{\mathrm{d}x} &= -\frac{1}{x^2} = -y^2\\ \mathrm{d}x &= -\frac{\mathrm{d}y}{y^2} \end{align*} \]
so that
\[ \int_1^\infty f(x) \mathrm{d}x = \int_{\frac11}^{\frac1\infty} f\left(\frac1y\right) \times -\frac{\mathrm{d}y}{y^2} = \int_{1}^{0} -\frac{f\left(\frac1y\right)}{y^2} \mathrm{d}y = \int_{0}^{1} \frac{f\left(\frac1y\right)}{y^2} \mathrm{d}y \]
This will only be finite if \(f(\infty)\) is zero and so, to avoid dividing zero by zero in such cases, we must treat the integrand as zero whenever its numerator is zero, as is done by the changeVar function in listing 4.

Listing 4: Changing The Variable
function changeVar(f) {
 return function(y) {
  var x = 1/y;
  var fx = f(x);
  return fx!==0 ? fx*x*x : 0;
 };
}

The ak.borelIntegral given in listing 5 applies this substitution to create an integral h of the function f together with a direct integral g using the numerical integral, defaulting to ak.rombergIntegral, having the given convergence threshold and maximum number of steps.

Listing 5: ak.borelIntegral
ak.borelIntegral = function(s, f, integral, threshold, steps) {
 var m = 0;
 var g, h, n;

 if(ak.type(s)!==ak.BOREL_SET_T) {
  throw new Error('invalid set in ak.borelIntegral');
 }

 if(ak.nativeType(f)!==ak.FUNCTION_T) {
  throw new Error('invalid function in ak.borelIntegral');
 }

 if(ak.nativeType(integral)===ak.UNDEFINED_T) {
  integral = ak.rombergIntegral;
 }
 else if(ak.nativeType(integral)!==ak.FUNCTION_T) {
  throw new Error('invalid integral in ak.borelIntegral');
 }

 g = integral(f, threshold, steps);
 h = integral(changeVar(f), threshold, steps);
 n = s.intervals();
 while(n-->0) m += integrate(s.at(n), g, h);
 return m;
};

It then simply iterates over the ak.borelInterval objects in the ak.borelSet argument s, adding up the results of the integrate helper function that numerically integrates f over them, given in listing 6, using the appropriate forms of the integral.

Listing 6: The integrate Helper Function
function integrate(i, g, h) {
 var l = i.lb();
 var u = i.ub();
 var lv = l.value();
 var uv = u.value();
 var lf, uf;

 if(l.open()) lv = ak.nextAfter(lv,  ak.INFINITY);
 if(u.open()) uv = ak.nextAfter(uv, -ak.INFINITY);

 if(lv===uv) return 0;

 lf = isFinite(lv);
 uf = isFinite(uv);
 if(lf && uf) return g(lv, uv);
 if(lf) return lv<1 ? g(lv, 1) + h(1/uv, 1) : h(1/uv, 1/lv);
 if(uf) return uv>-1 ? h(-1, 1/lv) + g(-1, uv) : h(1/uv, 1/lv);
 return h(-1, 1/lv) + g(-1, 1) + h(1/uv, 1);
}

Specifically, if both bounds are finite then we return the untransformed integral over them, or the next included numbers if they are open.
If the lower bound is finite and the upper bound is infinite then we use both the untransformed and transformed integrals to calculate
\[ \int_a^b f(x) \, \mathrm{d}x = \begin{cases} \int_a^1 f(x) \, \mathrm{d}x + \int_{\frac1b}^1 \frac{f\left(\frac1y\right)}{y^2} \, \mathrm{d}y & a < 1\\ \int_{\frac1b}^{\frac1a} \frac{f\left(\frac1y\right)}{y^2} \, \mathrm{d}y & a \geqslant 1 \end{cases} \]
ensuring that the upper bound of the transformed integral is less than or equal to one. We use a similar trick when the upper bound is finite and the lower bound is infinite
\[ \int_a^b f(x) \, \mathrm{d}x = \begin{cases} \int_{-1}^b f(x) \, \mathrm{d}x + \int_{-1}^{\frac1a} \frac{f\left(\frac1y\right)}{y^2} \, \mathrm{d}y & b > -1\\ \int_{\frac1b}^{\frac1a} \frac{f\left(\frac1y\right)}{y^2} \, \mathrm{d}y & b \leqslant -1 \end{cases} \]
Finally, if both bounds are infinite, we use them to calculate
\[ \int_a^b f(x) \, \mathrm{d}x = \int_{-1}^{\frac1a} \frac{f\left(\frac1y\right)}{y^2} \, \mathrm{d}y + \int_{-1}^1 f(x) \, \mathrm{d}x + \int_{\frac1b}^1 \frac{f\left(\frac1y\right)}{y^2} \, \mathrm{d}y \]
Program 5 demonstrates how we can remove the point at which \(f_a\) is discontinuous so that we can integrate it with ak.borelIntegral.

Program 5: Dealing With Discontinuities

The reason for transforming open bounds into closed ones in ak.borelMeasure, even though it doesn't suffer from the same problem as the integral, is to ensure that its behaviour is consistent with ak.borelIntegral.

Finally, program 6 demonstrates how it handles infinite bounds by using it to integrate our ak.normalPDF implementation of the normal probability density function and comparing it to the normal probability measure.

Program 6: Dealing With Infinities

You can find the implementations of the ak.boralMeasure and ak.borelIntegral functions in BorelMeasure.js and BorelIntegral.js respectively.

Now it should be noted that the notion of measure is vastly more important to calculus and probability theory than simply providing a convenient way to approximate some integrals of functions that would otherwise prove difficult for numerical integration algorithms. We won't go into the details this time, but may return to the subject in a future post.

References

[1] A Decent Borel Code, www.thusspakeak.com, 2018.

[2] A Borel Universe, www.thusspakeak.com, 2018.

[3] Let's Talk About Sets, www.thusspakeak.com, 2018.

Leave a comment