At the beginning of the second half of this series of posts^{[1]} we briefly covered the history of the differential calculus and described the incredibly useful Taylor's theorem, which states that for a function \(f\)
So finally we can implement division by surreals, as shown in listing 17.
The first step is, of course, to work out the Taylor series expansions of these functions. Fortunately this isn't very difficult; the Taylor series of the exponential is given by
Unfortunately not all arithmetic operations have such simple Taylor series expansions. For example, consider the inverse of the cosine, \(\arccos(x)\). Its first derivative is
The bulk of the
To demonstrate, program 1 plots the difference between the surreal and symbolic first and second derivatives of an arbitrary function as red and blue lines respectively.
That you're seeing a purple line at zero is because surreal differentiation is no less accurate than symbolic differentiation, although you might want to edit the function
I must admit that I have something of a soft spot for this approach having independently discovered it a few years ago. Even though I subsequently learned that I wasn't the first, I still find it to be more satisfying than symbolic differentiation and it is my weapon of choice for difficult, by which I mean extremely lengthy, differentiation calculations.
The solution is more or less the same as that we used for expression objects; we must simply compute the derivatives explicitly and use them to calculate the coefficients of the function's Taylor series. As with expression objects we can do this either using a general purpose algorithm such as the polynomial approximation, or with a specific approximation of the derivative of the function, or, if we're very fortunate, with the actual formula for the derivative.
Clearly this is neither better nor worse than taking the same approach with expression objects, but somehow it seems a more natural union to me. With automatic differentiation we are not trying to maintain an exact representation of the mathematical expression so resorting to approximation doesn't seem quite so jarring.
Although, as I have already admitted, I am a little biased.
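As an illustration of that approach, here's a minimal sketch, under my own convention of representing surreals as plain coefficient arrays for the sake of self-containment rather than using the ak library's types, of extending the sine function to second order using its known derivatives:

```javascript
// Extend sin to a second order surreal [a, b1, b2], standing for
// a + b1*d + b2*d^2, using its known derivatives; the array convention
// here is my own for illustration, not the ak library's.
function sinSurreal(s) {
  var a = s[0], e1 = s[1], e2 = s[2];
  var f  =  Math.sin(a); // f(a)
  var f1 =  Math.cos(a); // f'(a)
  var f2 = -Math.sin(a); // f''(a)

  // f(a+e) = f(a) + e*f'(a) + e^2*f''(a)/2 with e = b1*d + b2*d^2;
  // truncating beyond d^2, e^2 contributes b1^2*d^2 only
  return [f, e1 * f1, e2 * f1 + 0.5 * e1 * e1 * f2];
}

var r = sinSurreal([0.7, 1, 0]); // sin at an infinitesimal deviation from 0.7
```

The first infinitesimal coefficient recovers the first derivative and the second recovers half the second derivative, as per the Taylor series.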
This rather surprising result can be demonstrated using, you guessed it, Taylor’s theorem!
The Taylor series expansions of the numerator and denominator are
Since our surreal type keeps track of the coefficients of the Taylor series we can easily extend division to exploit L'Hopital's rule, allowing us to correctly evaluate such ratios. By way of an example, consider the ratio of two fourth order surreals whose first three coefficients are zero
The first loop searches for the first index
Here, finally, is the reason why I chose to return NaNs rather than throw exceptions when reading past the end of the coefficients; L'Hopital's rule results in the loss of higher order coefficients and rather than change the order of the result and be forced to implement mixed order arithmetic it was far simpler to just return NaNs.
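By way of illustration, here's a hedged sketch of that scheme, assuming plain coefficient arrays and a hypothetical lhopitalDiv helper of my own devising rather than the ak library's actual code: skip past the denominator's leading zero coefficients, divide as power series and pad the irrecoverable high order coefficients of the result with NaNs.

```javascript
// Divide Taylor coefficient arrays, applying L'Hopital's rule by skipping
// the denominator's leading zero coefficients. The trailing coefficients
// of the result are lost to the shift and so are padded with NaNs.
// A hypothetical helper for illustration, not the ak library's code.
function lhopitalDiv(num, den) {
  var n = num.length - 1, k = 0, i, j, s;
  while (k < den.length && den[k] === 0) ++k;

  var out = [];
  for (i = 0; i <= n; ++i) out[i] = NaN;

  // standard power series division of the shifted series
  for (i = 0; i + k <= n; ++i) {
    s = num[i + k];
    for (j = 0; j < i; ++j) s -= out[j] * den[(i - j) + k];
    out[i] = s / den[k];
  }
  return out;
}

// the ratio of two fourth order surreals with three vanishing coefficients
var r = lhopitalDiv([0, 0, 0, 2, 5], [0, 0, 0, 1, 3]);
// r is [2, -1, NaN, NaN, NaN]
```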
Note that whilst surreal arithmetic is still subject to cancellation error, with a careful choice of
This then is my last word on numerical differentiation and I hope I have successfully demonstrated my point that to develop accurate numerical algorithms you will often need to think carefully about the underlying mathematics of the expressions that you're trying to approximate.
Or, in other words, that you're going to have to think!
[2] You're Going To Have To Think! Why Finite Differences Won't Cure Your Calculus Blues, www.thusspakeak.com, 2014.
[3] You're Going To Have To Think! Why Polynomial Approximation Won't Cure Your Calculus Blues, www.thusspakeak.com, 2014.
[4] Ridders, C.J.F., Advances in Engineering Software, Vol. 4, No. 2, Elsevier, 1982.
[5] You're Going To Have To Think! Why Computer Algebra Won't Cure Your Calculus Blues, www.thusspakeak.com, 2014.
[6] You're Going To Have To Think! Why Interval Arithmetic Won't Cure Your Floating Point Blues, www.thusspakeak.com, 2013.
[7] Robinson, A., Non-standard Analysis (New Ed.), Princeton University Press, 1996.
[8] Knuth, D., Surreal Numbers: How Two Ex-Students Turned on to Pure Mathematics and Found Total Happiness, Addison-Wesley, 1974.
[9] Climbing Mount Impractical, www.thusspakeak.com, 2013.
[10] You're Going To Have To Think! Why Arbitrary Precision Arithmetic Won't Cure Your Floating Point Blues, www.thusspakeak.com, 2013.
\[
f(x+\delta) = f(x) + \delta \times f'(x) + \tfrac{1}{2}\delta^2 \times f''(x) + \dots + \tfrac{1}{n!} \delta^n \times f^{(n)}(x) + R_n\\
\min\left(\tfrac{1}{(n+1)!} \delta^{n+1} \times f^{(n+1)}(x+\theta\delta)\right) \leqslant R_n \leqslant \max\left(\tfrac{1}{(n+1)!} \delta^{n+1} \times f^{(n+1)}(x+\theta\delta)\right) \; \mathrm{for} \; 0 \leqslant \theta \leqslant 1
\]
where the exclamation mark stands for the factorial, \(f'(x)\) the first derivative of \(f\) at \(x\), \(f''(x)\) the second and \(f^{(n)}(x)\) the \(n^\mathrm{th}\) and where, by convention, the \(0^\mathrm{th}\) derivative is the function itself.

We have so far used it to perform a full error analysis of finite difference algorithms^{[2]} and to automate the calculation of their terms with polynomial curve fitting^{[3]}.
We went on to describe the far more accurate polynomial approximation of Ridders' algorithm^{[4]} which treated the symmetric finite difference as a function of the difference \(\delta\).
In the previous post^{[5]} we used expression objects to represent functions as expression trees, from which it was possible to compute expressions representing their derivatives by applying the relatively simple algebraic rules of differentiation.
Noting that there was little we could do to control cancellation error in the unknown expression for the derivative, assuming we hadn't the appetite to implement a full-blown computer algebra system that is, we concluded by combining this with interval arithmetic^{[6]} to automatically keep track of such numerical errors.
Having claimed that short of just such a system our algorithm was as accurate as it was possible to get, you may be a little surprised that there is anything left to discuss.
However, as I also noted, expression objects are fairly memory intensive due to the need to maintain the expression tree and I should ideally prefer something more lightweight.
The question stands as to whether such an algorithm exists.
Given that I'm still writing, it should be more or less self-evident that it does.
And, in another testament to its tremendous power, it's based entirely upon Taylor's theorem.
Surreal Numbers
Like Robinson's non-standard numbers^{[7]}, Conway's surreal numbers^{[8]} extend the reals in a way that admits a rigorous definition of infinitesimals.

So what's to stop us from actually using infinitesimals in our calculations?
Nothing. That's what!
Much as we extend the reals to the complex numbers by representing numbers with both a real and an imaginary part, we can extend the reals to the surreals by representing numbers with both a real and an infinitesimal part.
\[
w = a + b \delta
\]
Note that we don't actually need to define \(\delta\) so long as it stands for the same infinitesimal throughout our calculations.

Negation, addition and subtraction are trivial operations in which we perform the same action on both parts of the number
\[
\begin{align*}
w_1 &= a_1 + b_1 \delta\\
w_2 &= a_2 + b_2 \delta\\
\\
-w &= -a - b\delta\\
w_1 + w_2 &= (a_1+a_2) + (b_1+b_2) \delta\\
w_1 - w_2 &= (a_1-a_2) + (b_1-b_2) \delta
\end{align*}
\]
Multiplication proceeds much as it does for complex numbers, although we shall discard the term involving the square of \(\delta\).
\[
\begin{align*}
w_1 \times w_2 &= (a_1+b_1\delta) \times (a_2+b_2\delta)\\
&= a_1 \times a_2 + a_1 \times b_2 \delta + a_2 \times b_1 \delta\\
&= (a_1 \times a_2) + (a_1 \times b_2 + a_2 \times b_1) \delta
\end{align*}
\]
Division would seem to be a slightly trickier proposition given that we're ignoring second and higher order powers of \(\delta\). If the product of two of our surreal numbers has a second order term, surely we'll need it when dividing the product by one of the arguments if we are to have any chance of recovering the other. Or of recovering the correct result when dividing it by any other surreal number, for that matter.

Well, appearances can be deceptive.
Surreal Number Division
The trick is one we're going to exploit time and again when defining arithmetic operations for these numbers; no matter what real number we multiply \(\delta\) by, it's still infinitesimal, which means that we can still use Taylor's theorem!

What we shall therefore do is work out the Taylor series expansion of reciprocation to first order. We can then multiply the numerator by the reciprocal of the denominator. The required series for the reciprocal is given by
\[
\frac{1}{a + b\delta} = \frac{1}{a} + b\delta \times \frac{\mathrm{d}}{\mathrm{d}x} \frac{1}{x} \Bigg|_a
\]
where the vertical bar means "evaluated at the subscript".

Hence our rule for reciprocation is
\[
\frac{1}{a + b\delta} = \frac{1}{a} - \frac{b}{a^2}\delta
\]
Note that this trick is effectively the chain rule for surreal numbers in that it allows us to calculate the derivative of an expression in which one function is applied to the result of another.

The rule for division is therefore given by
\[
\begin{align*}
\frac{w_1}{w_2}
&= w_1 \times \frac{1}{w_2}\\
&= (a_1 + b_1\delta) \times \left(\frac{1}{a_2} - \frac{b_2}{a_2^2}\delta\right)\\
&= \frac{a_1}{a_2} + \left(\frac{b_1}{a_2} - \frac{a_1 \times b_2}{a_2^2}\right)\delta
\end{align*}
\]
Now something about this sidestepping of the need to retain higher order infinitesimals seems a little bit fishy. To be truly comfortable that I haven't done something silly we should probably check that multiplying the ratio of \(w_1\) and \(w_2\) by \(w_2\) yields \(w_1\) as expected.
\[
\begin{align*}
\frac{w_1}{w_2} \times w_2
&= \left(\frac{a_1}{a_2} + \left(\frac{b_1}{a_2} - \frac{a_1 \times b_2}{a_2^2}\right)\delta\right) \times (a_2 + b_2\delta)\\
&= \frac{a_1}{a_2} \times a_2 + a_2 \times \left(\frac{b_1}{a_2} - \frac{a_1 \times b_2}{a_2^2}\right)\delta + \frac{a_1}{a_2} \times b_2 \delta\\
&= a_1 + \left(b_1 - \frac{a_1 \times b_2}{a_2}\right) \delta + \frac{a_1 \times b_2}{a_2} \delta\\
&= a_1 + b_1 \delta
\end{align*}
\]
Well, it's clear that my reservations were undeserved; I really should have had more faith in Taylor's remarkable result.
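If you'd like to see these rules in code before we get to a full implementation, here's a minimal first order sketch; the names are throwaway ones of my own, not the ak library's.

```javascript
// A first order surreal w = a + b*d with d infinitesimal; illustrative
// names only, not those of the ak library.
function surreal(a, b) { return {a: a, b: b}; }

function add(w1, w2) { return surreal(w1.a + w2.a, w1.b + w2.b); }
function sub(w1, w2) { return surreal(w1.a - w2.a, w1.b - w2.b); }

// multiplication discards the second order term in d
function mul(w1, w2) { return surreal(w1.a * w2.a, w1.a * w2.b + w2.a * w1.b); }

// 1/(a+b*d) = 1/a - (b/a^2)*d, the first order Taylor series of reciprocation
function inv(w)      { return surreal(1 / w.a, -w.b / (w.a * w.a)); }
function div(w1, w2) { return mul(w1, inv(w2)); }
```

Multiplying a ratio back by its divisor recovers the dividend, just as in the algebraic check above.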
Whither Algorithm?
Now this consistent extension of real arithmetic is all well and good, but how does it help us to develop a machine precision accurate algorithm for computing derivatives?

Well, the answer, perhaps a little unsurprisingly, lies in Taylor's theorem.
The surreal result of a function is a first order Taylor series expansion about an infinitesimal deviation from the value of that function. By Taylor's theorem, the coefficient by which \(\delta\) is multiplied in the result of a function is therefore equal to its first derivative.
For example consider
\[
\frac{\mathrm{d}}{\mathrm{d}x} \frac{1}{x^2} \Bigg|_2 = -\frac{2}{x^3} \Bigg|_2 = -\frac{2}{8} = -\frac{1}{4}
\]
Using surreals we can simply divide 1 by the square of an infinitesimal deviation from 2
\[
\begin{align*}
\frac{1}{(2+\delta) \times (2+\delta)}
&= \frac{1}{4 + 4\delta}\\
&= \frac{1}{4} - \frac{4}{4^2}\delta\\
&= \frac{1}{4} - \frac{1}{4}\delta
\end{align*}
\]
Clearly the coefficient of the infinitesimal part is equal to the required derivative.

This use of the Taylor series to compute derivatives is known as automatic differentiation, as the title of this article might have hinted at.
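This calculation is easily checked with a self-contained first order sketch, with illustrative names of my own rather than the ak library's:

```javascript
// Evaluate 1/((2+d)*(2+d)) with first order surreals {a, b} = a + b*d.
function mul(w1, w2) { return {a: w1.a * w2.a, b: w1.a * w2.b + w2.a * w1.b}; }
function inv(w)      { return {a: 1 / w.a, b: -w.b / (w.a * w.a)}; }

var x = {a: 2, b: 1};   // an infinitesimal deviation from 2
var y = inv(mul(x, x)); // y.a = 1/4, the value, and y.b = -1/4, the derivative
```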
Higher Order Derivatives
One thing I would like to be able to do is compute higher order derivatives and an arithmetic type representing surreal numbers as currently defined won't be much help.

Fortunately we can generalise their definition to facilitate such calculations. Specifically, we define a surreal number of order \(n\) as
\[
w = a + \sum_{i=1}^n b_i \delta^i
\]
where the \(\sum\) represents the sum of the terms to its right for the range of indices below and above it.

To see how this can help us recover higher order derivatives, consider the \(n^\mathrm{th}\) order truncated Taylor series of a function \(f\)
\[
f(x+\delta) = f(x) + \sum_{i=1}^n \tfrac{1}{i!} \delta^i f^{(i)}(x)
\]
so if our \(w\) is the result of some function of an \(n^\mathrm{th}\) order surreal equal to \(x+\delta\) (i.e. with all but the first surreal coefficient \(b_i\) equal to zero) we have
\[
\begin{align*}
a &= f(x)\\
b_i &= \tfrac{1}{i!} f^{(i)}(x)
\end{align*}
\]
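We can check this recovery of higher order derivatives with a small sketch that represents \(n^\mathrm{th}\) order surreals as plain coefficient arrays, a convention of my own for illustration rather than the ak library's:

```javascript
// nth order surreals as coefficient arrays [a, b1, ..., bn], multiplied by
// discrete convolution with terms beyond d^n discarded. My own convention
// for illustration, not the ak library's.
function mul(u, v) {
  var n = u.length, w = [], i, j;
  for (i = 0; i < n; ++i) w[i] = 0;
  for (i = 0; i < n; ++i)
    for (j = 0; i + j < n; ++j) w[i + j] += u[i] * v[j];
  return w;
}

// f(x) = x^3 evaluated at an infinitesimal deviation from 2
var x = [2, 1, 0, 0];
var y = mul(mul(x, x), x);
// y is [8, 12, 6, 1]: f(2), f'(2), f''(2)/2! and f'''(2)/3!
```

Multiplying \(b_i\) by \(i!\) recovers \(f^{(i)}(2)\).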
A Surreal Class
The ak implementation of a surreal type is given in listing 1.

Listing 1: ak.surreal
ak.SURREAL_T = 'ak.surreal';

function Surreal(){}
Surreal.prototype = {TYPE: ak.SURREAL_T, valueOf: function(){return ak.NaN;}};

ak.surreal = function() {
 var s = new Surreal();
 var state = [];
 var arg0  = arguments[0];

 constructors[ak.nativeType(arg0)](state, arg0, arguments);

 s.order  = function()  {return state.length-1;};
 s.coeff  = function(i) {return state[Number(i)]+0;};
 s.coeffs = function()  {return state.slice(0);};

 s.deriv  = function(i) {return fact(Number(i))*state[Number(i)];};
 s.derivs = function()  {return derivs(state);};

 s.toString = function() {
  return toString(state, Number.prototype.toString);
 };

 s.toExponential = function(d) {
  return toString(state, Number.prototype.toExponential, d);
 };

 s.toFixed = function(d) {
  return toString(state, Number.prototype.toFixed, d);
 };

 s.toPrecision = function(d) {
  return toString(state, Number.prototype.toPrecision, d);
 };

 return Object.freeze(s);
};

var constructors = {};
Given that this is an arithmetic type, we've defined a TYPE member that we can use to dispatch operations to the appropriate functions via our overloading mechanism^{[9]}.

The state variable holds the coefficients of the surreal number, the first representing \(a\) and the \(i^\mathrm{th}\) of the others representing \(b_i\). As usual, initialisation of the state is handled by dynamic dispatch to methods of a constructors object, with accessor methods added afterwards.
ak.surreal Constructors
The first constructor initialises the state with an array of coefficients, as shown in listing 2.

Listing 2: ak.surreal Array Constructor
constructors[ak.ARRAY_T] = function(state, arr) {
 var n = arr.length;
 var i;

 if(n===0) throw new Error('no coefficients in ak.surreal constructor');

 for(i=0;i<n;++i) state[i] = Number(arr[i]);
};
Next up are constructors which specify the order of the surreal, dispatched from that defined in listing 3.
Listing 3: ak.surreal Order Constructor
constructors[ak.NUMBER_T] = function(state, n, args) {
 var arg1 = args[1];

 n = ak.trunc(n);
 if(n<0) throw new Error('negative order in ak.surreal constructor');
 state.length = n+1;

 constructors[ak.NUMBER_T][ak.nativeType(arg1)](state, arg1, args);
};
An important point to note is that a 0^{th} order surreal still has one coefficient, that of its real part.
The order constructors that define real and first order infinitesimal coefficients are given in listing 4.
Listing 4: ak.surreal Coefficient Constructor
constructors[ak.NUMBER_T][ak.UNDEFINED_T] = function(state) {
 var n = state.length;
 var i;
 for(i=0;i<n;++i) state[i] = 0;
};

constructors[ak.NUMBER_T][ak.NUMBER_T] = function(state, x, args) {
 var arg2 = args[2];
 constructors[ak.NUMBER_T][ak.NUMBER_T][ak.nativeType(arg2)](state, x, arg2);
};

constructors[ak.NUMBER_T][ak.NUMBER_T][ak.UNDEFINED_T] = function(state, x) {
 var n = state.length;
 var i;
 state[0] = x;
 for(i=1;i<n;++i) state[i] = 0;
};

constructors[ak.NUMBER_T][ak.NUMBER_T][ak.NUMBER_T] = function(state, x, d) {
 var n = state.length;
 var i;
 state[0] = x;
 if(n>1) state[1] = d;
 for(i=2;i<n;++i) state[i] = 0;
};
Next we have an order constructor that initialises the coefficients with a function taking their order as an argument, shown in listing 5.
Listing 5: ak.surreal Function Constructor
constructors[ak.NUMBER_T][ak.FUNCTION_T] = function(state, func) {
 var n = state.length;
 var i;
 for(i=0;i<n;++i) state[i] = Number(func(i));
};
Finally, listing 6 defines a constructor that initialises an ak.surreal with another object supporting a similar interface.

Listing 6: ak.surreal Object Constructor
constructors[ak.OBJECT_T] = function(state, obj) {
 var n = obj.order;
 var i;

 n = (ak.nativeType(n)===ak.FUNCTION_T) ? Number(n()) : Number(n);
 if(n<0 || n>ak.INT_MAX-1) throw new Error('invalid order in ak.surreal constructor');
 state.length = n+1;

 for(i=0;i<=n;++i) state[i] = Number(obj.coeff(i));
};
By similar, I mean that it must support an order method or member and a coeff method that are analogous to those defined below for ak.surreal.
ak.surreal Methods
The order, coeff and coeffs methods are relatively simple. The first returns the order of the surreal which, by definition, is one less than the number of coefficients. The second returns the \(i^{th}\) coefficient, with the addition of 0 coercing undefined values that result from reading past the end of the coefficients to NaN, for reasons that will become clear later. The last simply returns the coefficients in an array.

The deriv and derivs methods provide direct access to the values of the derivatives recoverable from the surreal. The former simply multiplies the term in question by the factorial of its order, as returned by fact, and the latter forwards to derivs to transform the terms into derivatives, as shown in listing 7.

Listing 7: ak.surreal Derivatives
var factCache = [1];

function fact(i) {
 var n = factCache.length;
 while(n++<=i) factCache[n-1] = (n-1)*factCache[n-2];
 return factCache[i];
}

function derivs(terms) {
 var fac = 1;
 var n = terms.length;
 var i;

 terms = terms.slice(0);
 for(i=0;i<n;++i) {
  terms[i] *= fac;
  fac *= i+1;
 }
 return terms;
}
Note that the \(i^{th}\) element of the factCache array stores \(i!\) and is sequentially populated by fact if it doesn't yet contain the requested factorial, ensuring that each must only be calculated once, reducing the computational cost from \(O(i)\) to \(O(1)\) for repeated calculations, albeit at an increased cost in memory. That I'm happy to accept that increased cost is a consequence of an expectation that derivatives of the same order are likely to be required many times during the course of a program, and that very high order derivatives are unlikely to be needed at all.

In contrast, since the derivs function needs every factorial up to the order of the surreal coefficients, it can afford to calculate each of them in sequence as it iterates through the terms.

The remaining methods provide conversion to strings using standard JavaScript numeric string conversions by forwarding to a toString function given in listing 8.

Listing 8: ak.surreal String Conversion
function toString(state, f, d) {
 var n = state.length;
 var s = [];
 var i;

 s.push(f.call(state[0], d));

 if(n>1) {
  if(state[1]<0) s.push(' - ' + f.call(-state[1], d) + 'd');
  else           s.push(' + ' + f.call( state[1], d) + 'd');
 }

 for(i=2;i<n;++i) {
  if(state[i]<0) s.push(' - ' + f.call(-state[i], d) + 'd^' + i);
  else           s.push(' + ' + f.call( state[i], d) + 'd^' + i);
 }
 return s.join('');
}
Surreal Arithmetic
Now that we have a surreal type all that remains is to work out how to apply arithmetic operations to it. Addition is trivial, as shown in listing 9.

Listing 9: ak.surreal Addition
function add(s0, s1) {
 var n = s0.order();
 var s = s0.coeffs();
 var i;

 if(s1.order()!==n) throw new Error('order mismatch in ak.surreal add');

 for(i=0;i<=n;++i) s[i] += s1.coeff(i);
 return ak.surreal(s);
}

function addSR(s, r) {
 s = s.coeffs();
 s[0] += r;
 return ak.surreal(s);
}

function addRS(r, s) {
 s = s.coeffs();
 s[0] += r;
 return ak.surreal(s);
}

ak.overload(ak.add, [ak.SURREAL_T, ak.SURREAL_T], add);
ak.overload(ak.add, [ak.SURREAL_T, ak.NUMBER_T],  addSR);
ak.overload(ak.add, [ak.NUMBER_T,  ak.SURREAL_T], addRS);
Here we have our first example of the mixed type arithmetic that was central to my argument that my dispatch mechanism for ak arithmetic operations was far and away the best design choice available. Given that the surreals are an extension of the reals, in the sense that surreal numbers with zeros for all their infinitesimal coefficients are arithmetically indistinguishable from real numbers, it makes sense mathematically to equate the latter with the former so as to formalise surreal/real arithmetic. Hence we allow mixed arithmetic with JavaScript numbers and ak.surreal values under the assumption that the former represent reals.

Note that it does not make sense to allow mixed arithmetic with surreals of different orders since this will almost always be a programming error, hence the exception thrown if ak.surreal values of different orders are added.

Subtraction is no less trivial, as illustrated in listing 10.
Listing 10: ak.surreal Subtraction
function sub(s0, s1) {
 var n = s0.order();
 var s = s0.coeffs();
 var i;

 if(s1.order()!==n) throw new Error('order mismatch in ak.surreal sub');

 for(i=0;i<=n;++i) s[i] -= s1.coeff(i);
 return ak.surreal(s);
}

function subSR(s, r) {
 s = s.coeffs();
 s[0] -= r;
 return ak.surreal(s);
}

function subRS(r, s) {
 var n = s.order();
 var i;

 s = s.coeffs();
 s[0] = r - s[0];
 for(i=1;i<=n;++i) s[i] = -s[i];
 return ak.surreal(s);
}

ak.overload(ak.sub, [ak.SURREAL_T, ak.SURREAL_T], sub);
ak.overload(ak.sub, [ak.SURREAL_T, ak.NUMBER_T],  subSR);
ak.overload(ak.sub, [ak.NUMBER_T,  ak.SURREAL_T], subRS);
Next up is multiplication which, with reals at least, is straightforward, as shown in listing 11.
Listing 11: ak.surreal Multiplication
function mulSR(s, r) {
 var n = s.order();
 var i;

 s = s.coeffs();
 for(i=0;i<=n;++i) s[i] *= r;
 return ak.surreal(s);
}

function mulRS(r, s) {
 var n = s.order();
 var i;

 s = s.coeffs();
 for(i=0;i<=n;++i) s[i] *= r;
 return ak.surreal(s);
}

ak.overload(ak.mul, [ak.SURREAL_T, ak.NUMBER_T], mulSR);
ak.overload(ak.mul, [ak.NUMBER_T, ak.SURREAL_T], mulRS);
Multiplying a pair of surreals is a slightly more fiddly prospect since there will generally be multiple terms on the left and right hand sides which when multiplied yield an infinitesimal of a given order. Essentially what we need to do is multiply two polynomials in \(\delta\). The process for polynomial multiplication is known as a discrete convolution and takes the form
\[
\sum_{i=0}^n a_i x^i \times \sum_{i=0}^n b_i x^i = \sum_{i=0}^n \sum_{j=0}^i \left(a_j \times b_{ij}\right) x^i
\]
This is more or less how we approached bignum multiplication^{[10]}, albeit without the carries.

As noted then, there are far more efficient ways to perform this calculation than naively implementing the above formula, but I'm going to keep things simple. We shall at least arrange to apply the process in-place by iterating backwards over the coefficients of the result, as shown in listing 12.
Listing 12: ak.surreal Multiplication
function mul(s0, s1) {
 var n = s0.order();
 var s = s0.coeffs();
 var i, j;

 if(s1.order()!==n) throw new Error('order mismatch in ak.surreal mul');

 for(i=0;i<=n;++i) {
  s[n-i] *= s1.coeff(0);
  for(j=1;j<=n-i;++j) s[n-i] += s[n-i-j] * s1.coeff(j);
 }
 return ak.surreal(s);
}

ak.overload(ak.mul, [ak.SURREAL_T, ak.SURREAL_T], mul);
Note that we can only get away with this because we are discarding all terms above \(n^\mathrm{th}\) order.
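To convince yourself that the backward iteration is safe, here's a standalone sketch of the same in-place truncated convolution over plain arrays (not the ak.surreal type itself):

```javascript
// In-place truncated polynomial multiplication in d, iterating backwards
// over the coefficients so that the lower order terms needed by each step
// haven't yet been overwritten. Plain arrays, not the ak.surreal type.
function mulInPlace(s, t) {
  var n = s.length - 1, i, j;
  for (i = 0; i <= n; ++i) {
    s[n - i] *= t[0];
    for (j = 1; j <= n - i; ++j) s[n - i] += s[n - i - j] * t[j];
  }
  return s;
}

var r = mulInPlace([1, 1, 0], [1, 1, 0]); // (1+d)^2 to second order
// r is [1, 2, 1]
```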
Division of a surreal by a real is another coefficient by coefficient affair, as illustrated by listing 13.
Listing 13: ak.surreal Division
function divSR(s, r) {
 var n = s.order();
 var i;

 s = s.coeffs();
 for(i=0;i<=n;++i) s[i] /= r;
 return ak.surreal(s);
}

ak.overload(ak.div, [ak.SURREAL_T, ak.NUMBER_T], divSR);
Division by surreals is distinctly more troublesome than the operations defined thus far. We shall implement it by defining a reciprocal function
\[
\mathrm{inv}(w) = \frac{1}{w}
\]
and multiplying the left hand side of the division by the reciprocal of the right hand side
\[
\frac{w_0}{w_1} = w_0 \times \mathrm{inv}(w_1)
\]
Now it may not be immediately obvious how we can propagate all of the powers of \(\delta\) through a mathematical function like \(\mathrm{inv}\), but it turns out that there's a relatively simple trick; \(\sum_{i=1}^n b_i \delta^i\) is itself an infinitesimal! This duly noted, we have
\[
f\left(a + \sum_{i=1}^n b_i \delta^i\right) = f(a) + \left(\sum_{i=1}^n b_i \delta^i\right) f'(a) + \frac{1}{2}\left(\sum_{i=1}^n b_i \delta^i\right)^2 f''(a) + \dots + \frac{1}{n!}\left(\sum_{i=1}^n b_i \delta^i\right)^n f^{(n)}(a)
\]
which is essentially the chain rule expressed with surreal numbers. Evaluating such powers of sums of the infinitesimal terms is a fiddly job, but by no means insurmountable. So as long as we can work out the exact form of the derivatives of a function we can relatively simply generalise it to \(n^\mathrm{th}\) order surreals.

For the \(\mathrm{inv}\) function this equates to
\[
\mathrm{inv}\left(a + \sum_{i=1}^n b_i \delta^i\right) = \frac{1}{a} - \left(\sum_{i=1}^n b_i \delta^i\right) \frac{1}{a^2} + \left(\sum_{i=1}^n b_i \delta^i\right)^2 \frac{1}{a^3} - \dots + (-1)^n \left(\sum_{i=1}^n b_i \delta^i\right)^n \frac{1}{a^{n+1}}
\]
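This series is easy to sanity check numerically. The following standalone sketch, using plain coefficient arrays of my own devising rather than the ak implementation, builds the reciprocal's coefficients from the alternating powers of the infinitesimal part and multiplies back to confirm that the product is one:

```javascript
// Truncated polynomial multiplication over coefficient arrays in d.
function mulTrunc(u, v) {
  var n = u.length, w = [], i, j;
  for (i = 0; i < n; ++i) w[i] = 0;
  for (i = 0; i < n; ++i)
    for (j = 0; i + j < n; ++j) w[i + j] += u[i] * v[j];
  return w;
}

// The reciprocal of a + e, where e is the infinitesimal part, by the
// alternating series above. A sketch, not the ak implementation.
function invTrunc(s) {
  var n = s.length, i, j;
  var e = s.slice(); e[0] = 0;                    // the infinitesimal terms
  var pow = []; for (i = 0; i < n; ++i) pow[i] = 0; pow[0] = 1; // e^0
  var c = []; for (i = 0; i < n; ++i) c[i] = 0;
  var fa = 1 / s[0], fi = fa;                     // fi = (-1)^i / a^(i+1)
  for (i = 0; i < n; ++i) {
    for (j = 0; j < n; ++j) c[j] += fi * pow[j];
    pow = mulTrunc(pow, e);                       // next power of e
    fi *= -fa;
  }
  return c;
}

var s = [2, 1, 0];
var p = mulTrunc(s, invTrunc(s)); // should be one to second order
```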
To compute the result, we shall fill an array with the real and infinitesimal coefficients of the function and pass it along with the surreal argument to a function that applies this form of the chain rule, given in listing 14.

Listing 14: Applying The Chain Rule
function chain(c, s) {
 var di = s.coeffs();
 var cs = [];
 var n  = di.length;
 var i, j;

 cs.length = n;
 cs[0] = c[0];
 for(j=1;j<n;++j) cs[j] = c[1] * di[j];

 for(i=2;i<n;++i) {
  iPow(s, i, di);
  for(j=i;j<n;++j) cs[j] += c[i] * di[j];
 }
 return ak.surreal(cs);
}
The argument c holds the function's coefficients and s the surreal argument. The variable di holds the \(i^\mathrm{th}\) power of the sum of the infinitesimal terms of the argument and is updated at each step by the iPow function as given in listing 15.

Listing 15: Updating The Infinitesimal Terms
function iPow(s, i, di) {
 var n = s.order();
 var j, k;

 for(j=0;j<=n-i;++j) {
  di[n-j] = 0;
  for(k=1;k<=n+1-(i+j);++k) di[n-j] += di[n-(j+k)] * s.coeff(k);
 }
}
Now this is a fairly terse bit of code so it might not be immediately obvious exactly what it's doing. The important things to note are that in the \(i^\mathrm{th}\) power of the infinitesimal terms there can be no terms of lower order than \(\delta^i\) and that we are not interested in any that are of higher order than \(\delta^n\).
By working backwards from the \(n^\mathrm{th}\) to the \(i^\mathrm{th}\) terms we can compute the result in place since each term must be multiplied by \(\delta\) at least once.
Next, to convince yourself that the inner loop is correct, simply substitute the lower and upper limits of k in the body

di[n-j] += di[(n-j) - 1] * s.coeff(1);
di[n-j] += di[i-1] * s.coeff((n-j) - (i-1));

This should make it clear that the left and right hand sides of the sums are both of order \(\delta^{n-j}\).
Finally, substituting the lower and upper limits of j in these expressions shows that we are not using any invalid indices

di[n] += di[n-1] * s.coeff(1);
di[n] += di[i-1] * s.coeff(n - (i-1));

di[i] += di[i-1] * s.coeff(1);
di[i] += di[i-1] * s.coeff(1);
The two expressions are identical for the upper limit of j because k can only equal one. Hopefully that's made everything clear because I can't think of any way of making it clearer!
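If you'd rather trust a computation than an index-juggling argument, the following standalone rendition of the update, taking the coefficients as a plain array b with b[0] unused rather than an ak.surreal, can be checked against naive truncated polynomial multiplication.

```javascript
// in-place update of di from the (i-1)th to the ith power of the
// infinitesimal part b[1]*d + ... + b[n]*d^n, dropping orders above n
function iPow(b, i, di) {
 var n = b.length-1;
 var j, k;
 for(j=0;j<=n-i;++j) {
  di[n-j] = 0;
  for(k=1;k<=n+1-(i+j);++k) di[n-j] += di[n-(j+k)] * b[k];
 }
}

// naive polynomial multiplication truncated at order n, for comparison
function polyMul(p, q, n) {
 var r = [], i, j;
 for(i=0;i<=n;++i) r[i] = 0;
 for(i=0;i<=n;++i) for(j=0;i+j<=n;++j) r[i+j] += (p[i]||0) * (q[j]||0);
 return r;
}

var b = [0, 2, -1, 3, 0.5];  // arbitrary infinitesimal coefficients
var di = b.slice();          // the first power
var naive = b.slice();
var ok = true;
for(var i=2;i<=4;++i) {
 iPow(b, i, di);
 naive = polyMul(naive, b, 4);
 for(var m=i;m<=4;++m) ok = ok && Math.abs(di[m]-naive[m])<1e-12;
}
console.log(ok); // true
```

Note that only the terms of order \(\delta^i\) and above are compared, since the lower order entries of di are deliberately left stale by the in-place update.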
Implementing \(\mathrm{inv}\) in terms of chain is relatively straightforward, as illustrated by listing 16.

Listing 16: ak.surreal Reciprocal
function inv(s) {
 var fx = 1/s.coeff(0);
 var fi = fx;
 var c = [];
 var n = s.order()+1;
 var i;

 c.length = n;
 c[0] = fx;
 for(i=1;i<n;++i) {
  fi *= -fx;
  c[i] = fi;
 }
 return chain(c, s);
}
So finally we can implement division by surreals, as shown in listing 17.
Listing 17: ak.surreal Division
function div(s0, s1) {return mul(s0, inv(s1));}
function divRS(r, s) {return mulRS(r, inv(s));}

ak.overload(ak.div, [ak.SURREAL_T, ak.SURREAL_T], div);
ak.overload(ak.div, [ak.NUMBER_T,  ak.SURREAL_T], divRS);
General Arithmetic Operations
Reciprocation shines a light on how we shall go about implementing other operations. As examples we shall use
\[
e^w\\
\ln w
\]
I shan't bother with the powers of surreals since, as we have already seen^{[5]}, they can be implemented in terms of these operations. The first step is, of course, to work out the Taylor series expansions of these functions. Fortunately this isn't very difficult; the Taylor series of the exponential is given by
\[
\begin{align*}
e^{a+\delta} &= e^a + e^a \times \delta + \frac{1}{2} \times e^a \times \delta^2 + \frac{1}{6} \times e^a \times \delta^3 + \dots\\
&= e^a \times \sum_{i \geqslant 0} \frac{1}{i!} \times \delta^i
\end{align*}
\]
and that of the logarithm by
\[
\begin{align*}
\ln(a+\delta) &= \ln a + \frac{1}{a} \times \delta - \frac{1}{2} \times \frac{1}{a^2} \times \delta^2 + \frac{1}{6} \times \frac{2}{a^3} \times \delta^3 + \dots\\
&= \ln a - \sum_{i \geqslant 1} \frac{(-1)^i}{i \times a^i} \times \delta^i
\end{align*}
\]
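Both expansions are easily sanity checked numerically before we commit them to code; this is a standalone snippet with hand-picked values for a and \(\delta\).

```javascript
var a = 2, h = 1e-2, n = 4, i;

// exponential: e^a times the sum of h^i/i! up to order n
var fac = 1, expSum = 0;
for (i = 0; i <= n; ++i) { if (i > 0) fac *= i; expSum += Math.pow(h, i)/fac; }
console.log(Math.abs(Math.exp(a)*expSum - Math.exp(a + h))); // order h^5

// logarithm: ln a minus the sum of (-1)^i * h^i / (i * a^i)
var logSum = Math.log(a);
for (i = 1; i <= n; ++i) logSum -= Math.pow(-1, i)*Math.pow(h, i)/(i*Math.pow(a, i));
console.log(Math.abs(logSum - Math.log(a + h))); // order h^5
```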
With these formulae to hand, implementing the exponential and logarithmic functions is but a simple matter of programming, as shown in listing 18.

Listing 18: ak.surreal Exponent And Logarithm
function exp(s) {
 var fx = Math.exp(s.coeff(0));
 var c = [];
 var n = s.order()+1;
 var f = 1;
 var i;

 c.length = n;
 c[0] = fx;
 for(i=1;i<n;++i) {
  f *= i;
  c[i] = fx / f;
 }
 return chain(c, s);
}

function log(s) {
 var x = s.coeff(0);
 var fx = 1/x;
 var fi = -1;
 var c = [];
 var n = s.order()+1;
 var i;

 c.length = n;
 c[0] = Math.log(x);
 for(i=1;i<n;++i) {
  fi *= -fx;
  c[i] = fi / i;
 }
 return chain(c, s);
}

ak.overload(ak.exp, ak.SURREAL_T, exp);
ak.overload(ak.log, ak.SURREAL_T, log);
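To convince yourself that the chain rule machinery really does propagate derivatives through such functions, here's a hand-rolled second order check; chain2 is an illustrative standalone reduction of chain, not part of the library. Composing the exponential with the surreal \(a + 2\delta + 3\delta^2\) should yield the Taylor coefficients of \(g(t)=e^{a+2t+3t^2}\) at zero, which a little calculus shows are \(e^a\), \(2e^a\) and \(5e^a\).

```javascript
// second order specialisation of chain: c[i] holds f's ith derivative
// at a divided by i! and b the infinitesimal coefficients of the
// argument, with b[0] unused
function chain2(c, b) {
 return [c[0], c[1]*b[1], c[1]*b[2] + c[2]*b[1]*b[1]];
}

var a = 0.3, ea = Math.exp(a);
var r = chain2([ea, ea, ea/2], [0, 2, 3]);
console.log(r[1]/ea, r[2]/ea); // approximately 2 and 5
```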
Unfortunately not all arithmetic operations have such simple Taylor series expansions. For example, consider the inverse of the cosine, \(\arccos(x)\). Its first derivative is
\[
\frac{\mathrm{d}}{\mathrm{d}x} \arccos(x) = -\frac{1}{\sqrt{1-x^2}}
\]
If you were to differentiate this several times in pursuit of a pattern for the Taylor series coefficients, and I wouldn't recommend that you try because it's an immensely tedious exercise, you'd be out of luck. Well, that is to say I couldn't spot one, so I gave up and cheated; I used the symbolicDerivative class^{[5]} to generate them for me, as shown in listing 19.

Listing 19: ak.surreal Inverse Cosine
var symbolicDerivsCache = {};

function symbolicCoeffs(f, s) {
 var coeffs = [];
 var cache = symbolicDerivsCache[f];
 var fac = 1;
 var n = s.order()+1;
 var z, dfz, i;

 if(cache) {
  z = cache.z;
  dfz = cache.dfz;
 }
 else {
  z = ak.varExpr();
  dfz = [ak[f](z)];
  cache = {z: z, dfz: dfz};
  symbolicDerivsCache[f] = cache;
 }

 z.value(ak.approxExpr(s.coeff(0)));

 coeffs.length = n;
 coeffs[0] = dfz[0].approx();
 for(i=1;i<n;++i) {
  if(!dfz[i]) dfz[i] = ak.symbolicDerivative(dfz[i-1], z);
  fac *= i;
  coeffs[i] = dfz[i].approx() / fac;
 }
 return coeffs;
}

function acos(s) {
 return chain(symbolicCoeffs('acos', s), s);
}
ak.overload(ak.acos, ak.SURREAL_T, acos);
The bulk of the symbolicCoeffs function involves maintaining a cache of ak expression objects representing derivatives of increasing orders of ak arithmetic operations, much as we did for factorials with fact and for much the same reason. Fortunately for the sake of memory usage there aren't that many functions for which we shall need to rely upon symbolic differentiation for Taylor series coefficients, as you will see if you take a look at Surreal.js, where you will also find the definitions of the rest of the ak.surreal arithmetic operations, enumerated in listing 20.

Listing 20: ak.surreal Overloads
ak.overload(ak.abs,  ak.SURREAL_T, abs);
ak.overload(ak.acos, ak.SURREAL_T, acos);
ak.overload(ak.asin, ak.SURREAL_T, asin);
ak.overload(ak.atan, ak.SURREAL_T, atan);
ak.overload(ak.cos,  ak.SURREAL_T, cos);
ak.overload(ak.cosh, ak.SURREAL_T, cosh);
ak.overload(ak.exp,  ak.SURREAL_T, exp);
ak.overload(ak.inv,  ak.SURREAL_T, inv);
ak.overload(ak.log,  ak.SURREAL_T, log);
ak.overload(ak.neg,  ak.SURREAL_T, neg);
ak.overload(ak.sin,  ak.SURREAL_T, sin);
ak.overload(ak.sinh, ak.SURREAL_T, sinh);
ak.overload(ak.sqrt, ak.SURREAL_T, sqrt);
ak.overload(ak.tan,  ak.SURREAL_T, tan);
ak.overload(ak.tanh, ak.SURREAL_T, tanh);

ak.overload(ak.add,  [ak.SURREAL_T, ak.SURREAL_T], add);
ak.overload(ak.add,  [ak.SURREAL_T, ak.NUMBER_T],  addSR);
ak.overload(ak.add,  [ak.NUMBER_T,  ak.SURREAL_T], addRS);
ak.overload(ak.cmp,  [ak.SURREAL_T, ak.SURREAL_T], cmp);
ak.overload(ak.cmp,  [ak.SURREAL_T, ak.NUMBER_T],  cmpSR);
ak.overload(ak.cmp,  [ak.NUMBER_T,  ak.SURREAL_T], cmpRS);
ak.overload(ak.dist, [ak.SURREAL_T, ak.SURREAL_T], dist);
ak.overload(ak.dist, [ak.SURREAL_T, ak.NUMBER_T],  distSR);
ak.overload(ak.dist, [ak.NUMBER_T,  ak.SURREAL_T], distRS);
ak.overload(ak.div,  [ak.SURREAL_T, ak.SURREAL_T], div);
ak.overload(ak.div,  [ak.SURREAL_T, ak.NUMBER_T],  divSR);
ak.overload(ak.div,  [ak.NUMBER_T,  ak.SURREAL_T], divRS);
ak.overload(ak.eq,   [ak.SURREAL_T, ak.SURREAL_T], eq);
ak.overload(ak.eq,   [ak.SURREAL_T, ak.NUMBER_T],  eqSR);
ak.overload(ak.eq,   [ak.NUMBER_T,  ak.SURREAL_T], eqRS);
ak.overload(ak.ge,   [ak.SURREAL_T, ak.SURREAL_T], ge);
ak.overload(ak.ge,   [ak.SURREAL_T, ak.NUMBER_T],  geSR);
ak.overload(ak.ge,   [ak.NUMBER_T,  ak.SURREAL_T], geRS);
ak.overload(ak.gt,   [ak.SURREAL_T, ak.SURREAL_T], gt);
ak.overload(ak.gt,   [ak.SURREAL_T, ak.NUMBER_T],  gtSR);
ak.overload(ak.gt,   [ak.NUMBER_T,  ak.SURREAL_T], gtRS);
ak.overload(ak.le,   [ak.SURREAL_T, ak.SURREAL_T], le);
ak.overload(ak.le,   [ak.SURREAL_T, ak.NUMBER_T],  leSR);
ak.overload(ak.le,   [ak.NUMBER_T,  ak.SURREAL_T], leRS);
ak.overload(ak.lt,   [ak.SURREAL_T, ak.SURREAL_T], lt);
ak.overload(ak.lt,   [ak.SURREAL_T, ak.NUMBER_T],  ltSR);
ak.overload(ak.lt,   [ak.NUMBER_T,  ak.SURREAL_T], ltRS);
ak.overload(ak.mul,  [ak.SURREAL_T, ak.SURREAL_T], mul);
ak.overload(ak.mul,  [ak.SURREAL_T, ak.NUMBER_T],  mulSR);
ak.overload(ak.mul,  [ak.NUMBER_T,  ak.SURREAL_T], mulRS);
ak.overload(ak.ne,   [ak.SURREAL_T, ak.SURREAL_T], ne);
ak.overload(ak.ne,   [ak.SURREAL_T, ak.NUMBER_T],  neSR);
ak.overload(ak.ne,   [ak.NUMBER_T,  ak.SURREAL_T], neRS);
ak.overload(ak.pow,  [ak.SURREAL_T, ak.SURREAL_T], pow);
ak.overload(ak.pow,  [ak.SURREAL_T, ak.NUMBER_T],  powSR);
ak.overload(ak.pow,  [ak.NUMBER_T,  ak.SURREAL_T], powRS);
ak.overload(ak.sub,  [ak.SURREAL_T, ak.SURREAL_T], sub);
ak.overload(ak.sub,  [ak.SURREAL_T, ak.NUMBER_T],  subSR);
ak.overload(ak.sub,  [ak.NUMBER_T,  ak.SURREAL_T], subRS);
Surreal Derivative Accuracy
So just how accurate are derivatives calculated with surreals? To demonstrate, program 1 plots the difference between the surreal and symbolic first and second derivatives of an arbitrary function as red and blue lines respectively.
Program 1: Surreal Derivative Accuracy
That you're seeing a purple line at zero is because surreal differentiation is no less accurate than symbolic differentiation, although you might want to edit the function f to convince yourself that this example wasn't a fluke.
The Holy Grail?
So we finally have an algorithm for computer differentiation that is as accurate as possible without a computer algebra system and parsimonious with memory to boot.

I must admit that I have something of a soft spot for this approach, having independently discovered it a few years ago. Even though I subsequently learned that I wasn't the first, I still find it more satisfying than symbolic differentiation and it is my weapon of choice for difficult, by which I mean extremely lengthy, differentiation calculations.
Cancellation Error
Unfortunately automatic differentiation suffers from cancellation error in the derivative just as much as symbolic differentiation does. Indeed, it is doubly cursed since, unlike symbolic differentiation, we couldn't automatically manipulate the form of the calculation to try to reduce its effect even if we wanted to, although we could at least again use interval arithmetic to monitor its effect.

Numerical Approximation
As with symbolic differentiation we must still take care when combining automatic differentiation with numerical approximation; the derivative of an approximation of a function may not be a good approximation of the derivative of that function. The solution is more or less the same as that we used for expression objects; we must simply compute the derivatives explicitly and use them to calculate the coefficients of the function's Taylor series. As with expression objects we can do this either with a general purpose algorithm such as the polynomial approximation, or with a specific approximation of the derivative of the function, or, if we're very fortunate, with the actual formula for the derivative.
Clearly this is no better nor worse than taking the same approach with expression objects, but somehow it seems a more natural union to me. With automatic differentiation we are not trying to maintain an exact representation of the mathematical expression, so resorting to approximation doesn't seem quite so jarring.
Although, as I have already admitted, I am a little biased.
L'Hopital's Rule
A neat application of ak.surreal is to perform divisions that appear, on the face of it, impossible. For example, consider
\[
f(x) = \frac{\sin x}{x}
\]
What is its value when \(x\) equals zero? If we were to compute it with floating point arithmetic we'd get a result of NaN, but it should in fact be one. This rather surprising result can be demonstrated using, you guessed it, Taylor's theorem!
The Taylor series expansions of the numerator and denominator are
\[
\begin{align*}
\sin (x+\delta) &= \sin x + \delta \times \cos x - \frac{1}{2} \times \delta^2 \times \sin x - \frac{1}{6} \times \delta^3 \times \cos x + \dots\\
x + \delta &= x + \delta
\end{align*}
\]
Setting \(x\) to zero yields
\[
\begin{align*}
f(0 + \delta) &= \frac{\sin 0 + \delta \times \cos 0 - \frac{1}{2} \times \delta^2 \times \sin 0 - \frac{1}{6} \times \delta^3 \times \cos 0 + \dots}{0 + \delta}\\
&= \frac{\delta - \frac{1}{6} \times \delta^3 + \dots}{\delta}\\
&= 1 - \frac{1}{6} \times \delta^2 + \dots
\end{align*}
\]
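The calculation can be sketched with plain truncated series represented as standalone coefficient arrays, not ak.surreal objects: after shifting out the common factor of \(\delta\), series long division recovers the value and higher order terms directly.

```javascript
// divide the truncated series for sin(d) by that for d by shifting out
// the common leading d and then performing series long division
var num = [0, 1, 0, -1/6];                 // sin d = d - d^3/6 + ...
var den = [0, 1, 0, 0];                    // d
var n2 = num.slice(1), d2 = den.slice(1);  // shift both down one order

var q = [];
for (var i = 0; i < n2.length; ++i) {
 var t = n2[i];
 for (var j = 0; j < i; ++j) t -= q[j] * (d2[i-j] || 0);
 q.push(t / d2[0]);
}
console.log(q); // [1, 0, -1/6]: sin(x)/x tends to one as x tends to zero
```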
This use of derivatives to evaluate ratios of functions that equate to zero is known as L'Hopital's rule. Note that if the first order terms of the numerator and denominator were also zero we could continue on to the second order terms in the same fashion. Since our surreal type keeps track of the coefficients of the Taylor series we can easily extend division to exploit L'Hopital's rule, allowing us to correctly evaluate such ratios. By way of an example, consider the ratio of two fourth order surreals whose first three coefficients are zero
\[
\frac{a_3 \delta^3 + a_4 \delta^4}{b_3 \delta^3 + b_4 \delta^4}
\]
Now we can simply divide the top and bottom of the fraction by \(\delta^3\) to get a well-defined result
\[
\frac{a_3 + a_4 \delta}{b_3 + b_4 \delta}
\]
We can do this by constructing a new pair of surreals whose non-zero terms are shifted to the left, as demonstrated by listing 21.

Listing 21: ak.surrealDiv
ak.surrealDiv = function(s0, s1, d) {
 var a0, a1, i0, i1, n;

 if(ak.nativeType(d)===ak.UNDEFINED_T) d = 0;
 d = Math.abs(d);

 if(ak.type(s0)!==ak.SURREAL_T) {
  throw new Error('non-surreal dividend in ak.surrealDiv');
 }
 if(ak.type(s1)!==ak.SURREAL_T) {
  throw new Error('non-surreal divisor in ak.surrealDiv');
 }
 if(ak.nativeType(d)!==ak.NUMBER_T) {
  throw new Error('non-numeric delta in ak.surrealDiv');
 }
 if(s1.order()!==s0.order()) {
  throw new Error('order mismatch in ak.surrealDiv');
 }

 n = s0.order();

 i0 = 0;
 i1 = 0;
 while(i1<=n && Math.abs(s0.coeff(i1))<=d && Math.abs(s1.coeff(i1))<=d) ++i1;
 if(i1===0) return div(s0, s1);

 a0 = new Array(n+1);
 a1 = new Array(n+1);

 while(i1<=n) {
  a0[i0] = s0.coeff(i1);
  a1[i0] = s1.coeff(i1);
  ++i0;
  ++i1;
 }

 while(i0<=n) {
  a0[i0] = ak.NaN;
  a1[i0] = ak.NaN;
  ++i0;
 }
 return div(ak.surreal(a0), ak.surreal(a1));
};
The first loop searches for the first index i1 at which one of the surreal arguments has a coefficient with a magnitude greater than the threshold d. The following loops first copy higher order coefficients to lower order indices, the abovementioned left shift, and then pad empty coefficients with NaNs. The resulting surreals can then simply be passed to ak.div which will return the ratio to the highest order possible, with incalculable coefficients represented by NaNs.

Here, finally, is the reason why I chose to return NaNs rather than throw exceptions when reading past the end of the coefficients; L'Hopital's rule results in the loss of higher order coefficients and rather than change the order of the result and be forced to implement mixed order arithmetic it was far simpler to just return NaNs.
Note that whilst surreal arithmetic is still subject to cancellation error, with a careful choice of d, ak.surrealDiv can at least mitigate it for division.

This then is my last word on numerical differentiation and I hope I have successfully demonstrated my point that to develop accurate numerical algorithms you will often need to think carefully about the underlying mathematics of the expressions that you're trying to approximate.
Or, in other words, that you're going to have to think!
References
[1] You're Going To Have To Think! Why [Insert Algorithm Here] Won't Cure Your Calculus Blues, www.thusspakeak.com, 2013.
[2] You're Going To Have To Think! Why Finite Differences Won't Cure Your Calculus Blues, www.thusspakeak.com, 2014.
[3] You're Going To Have To Think! Why Polynomial Approximation Won't Cure Your Calculus Blues, www.thusspakeak.com, 2014.
[4] Ridders, C.J.F., Advances in Engineering Software, Vol. 4, No. 2, Elsevier, 1982.
[5] You're Going To Have To Think! Why Computer Algebra Won't Cure Your Calculus Blues, www.thusspakeak.com, 2014.
[6] You're Going To Have To Think! Why Interval Arithmetic Won't Cure Your Floating Point Blues, www.thusspakeak.com, 2013.
[7] Robinson, A., Non-standard Analysis (New Ed.), Princeton University Press, 1996.
[8] Knuth, D., Surreal Numbers: How Two Ex-Students Turned on to Pure Mathematics and Found Total Happiness, Addison Wesley, 1974.
[9] Climbing Mount Impractical, www.thusspakeak.com, 2013.
[10] You're Going To Have To Think! Why Arbitrary Precision Arithmetic Won't Cure Your Floating Point Blues, www.thusspakeak.com, 2013.
Based upon an article I wrote for ACCU's Overload magazine.