Climbing Mount Impractical


Of the myriad aspects of programming, numerical computing is perhaps the one that most programmers are least familiar with, which is a pity since it is, in my opinion, one of the most interesting.
In an attempt to increase understanding of the subject within the programming community I've decided to implement, and blog about implementing, a rudimentary numerical library.

In JavaScript.

Before you decide that I should be forcibly relocated to a padded cubicle I'd like to make it absolutely clear that JavaScript is not remotely my language of choice for such a project. No, my preferred language is C++; operator overloading and value types make using mathematical objects a breeze and templates and inlining help make them efficient. Hopefully only die-hard Fortran programmers will still think that I'm insane.
However, I'm a firm believer that the best way to learn how algorithms work is to use them. If I were to implement my library in C++, reading about it would be laboriously interrupted with downloading and compiling code. Much better then for the code to be run directly in the browser and, for better or for worse, that means JavaScript.
As a complete newcomer to the language I have more or less followed the advice that Douglas Crockford gives in his book JavaScript: The Good Parts[1], but as a dyed-in-the-wool C++ programmer I'm reasonably certain that some of my design choices will be far from idiomatic.
One piece of advice that was not so hard to take to heart was to avoid polluting the global environment with objects and functions by instead creating a single global object and subsequently making everything a member of it; in deference to al-Khwarizmi I have named mine ak.

Given the subject matter, a certain amount of mathematics is inevitable and I'm fully aware that that sort of thing is likely to scare some folks off. I'll try to keep it as simple as possible and put the more difficult stuff in sidebars so that anybody who doesn't want to work through the nitty gritty details can simply skip them and take my word for it. I'd urge you not to though; for one thing knowing the maths behind an algorithm gives you a better understanding of it, and for another quidquid Latine dictum sit altum videtur as the saying goes.

Unfortunately there's a bunch of dull infrastructury stuff to cover before we can get on with the shiny numerical stuff. Starting with...

Asynchronous Loops

Efficiency is a significant issue in numerical computing; many algorithms are iterative in nature and can take quite some time to converge to a reasonable level of accuracy. Hanging the browser's rendering engine whilst we wait for results is unacceptable, so we'll need some way of yielding execution to it. Support for multithreading is somewhat patchy, but fortunately support for asynchronous execution is ubiquitous via the setTimeout function. With it we can instruct the browser to call a function after it has finished processing everything currently in its execution stack. It is but a hop, skip and a jump to roll it up into an asynchronous loop, as illustrated in listing 1.

Listing 1: ak.asynchLoop
ak.asynchLoop = function(body, end, err, ms) {
 try {
  if(body()) {
   setTimeout(function(){ak.asynchLoop(body, end, err, ms);}, ms>0?ms:0);
  }
  else if(end) {
   end();
  }
 }
 catch(e) {
  if(err) err(e);
  else    throw e;
 }
};

At first glance it doesn't look particularly loopy. I mean, there's no loop counter and it looks a lot more like recursion than iteration. In fact, the idea is that the body function should keep track of the state of the loop in addition to executing it, returning false when the loop should terminate. That it really is iterative is a consequence of the fact that setTimeout does not call the function argument immediately, but rather as soon after the millisecond delay given by its second argument as the browser is idle. And since the browser can't be idle whilst it's executing the rest of the code in which ak.asynchLoop was called, any code after the loop will run before it has terminated (unless it terminates immediately).
The purpose of the optional end argument is to allow the caller to specify code that must run after the loop terminates, with the optional err argument facilitating user defined error handling. Finally, the optional ms argument allows us to set a delay between each iteration of the loop, should we wish to.
The intent of the design is that the arguments should be closures capturing the variables containing the state and results from the caller, as illustrated in program 1.

Program 1: Using ak.asynchLoop
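A minimal sketch of client code in this spirit might look like the following, in which the body and end closures and the variables i and sum are hypothetical examples rather than the program's actual contents:

```javascript
// The ak namespace and ak.asynchLoop as defined in listing 1.
var ak = {};

ak.asynchLoop = function(body, end, err, ms) {
 try {
  if(body()) {
   setTimeout(function(){ak.asynchLoop(body, end, err, ms);}, ms>0?ms:0);
  }
  else if(end) {
   end();
  }
 }
 catch(e) {
  if(err) err(e);
  else    throw e;
 }
};

// Hypothetical client code: the closures capture i and sum, which
// together hold the state and the result of the loop.
var i = 0;
var sum = 0;

ak.asynchLoop(
 function() {       // body: performs one iteration, false terminates
  sum += ++i;
  return i<5;
 },
 function() {       // end: runs once the loop has terminated
  console.log('sum = ' + sum);
 });

console.log('this line runs before the loop has finished');
```

Note that the first iteration runs synchronously, since ak.asynchLoop calls body directly, and only the subsequent iterations are deferred with setTimeout.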

This also illustrates the asynchronous execution of the loop; the last statement in the client code is executed before the second iteration. And furthermore, it's a demonstration of the...

User Interface

The central concept in the ak user interface is the program; editable JavaScript code connected to one or more output objects. We have already seen the simplest of them, the console. This provides write and writeln methods that print their arguments as strings, concatenated if there are more than one, with the latter also adding a newline to the end.
The rest of the output objects represent various types of graph. They are all derived from a generic chart object which supports setting logical coordinates via a bounds method, drawing axes in logical coordinates via an axes method and conversion between logical (XY) and physical (CR) coordinates with toXY and toCR methods, coordinates being represented by JavaScript arrays. Oh, and it's implemented with the HTML5 canvas element, so if your browser isn't up to date you won't be able to run programs with graphical output.
Each graph implements a plot method for adding data, albeit each in a manner appropriate to its type. One commonality is that the plot and axes methods can be passed an optional final argument consisting of an object with members named for the graph properties that should differ from the defaults for the data being plotted. The basic properties, in no particular order, are
  • strokeStyle
  • fillStyle
  • lineWidth
  • pixelSize
  • tickSize
The last of these is specific to the axes method and, unsurprisingly, determines the size of the axis tick marks, the penultimate sets the pixel size for pixel-based graphs and the rest mean the same as they do for the underlying canvas element. Most of the graph types are affected by only a subset of these properties and some define further properties of their own. Note that property defaults can be set by calling a method with the same name as the property on the graph object.
It'd be pretty tedious to go into all of the functionality now, so we'll just take a quick peek at the line graph in program 2 and I'll demonstrate the rest of them as and when they're needed.

Program 2: Using the graph object

Hopefully, program 2 is simple enough that it needs no explanation, because I've no intention of giving one. Instead we shall move on to...

Overloading

I suspect that this is where my C++ roots are going to start showing. The thing is, I just can't do without overloading. Well, not completely anyway; having different implementations of a function for different types of arguments is just so damn useful.

To do this in JavaScript, the first thing we'll need is a way of distinguishing native types which I've provided with ak.nativeType as shown in listing 2.

Listing 2: ak.nativeType
ak.UNDEFINED_T = 'ak.undefined';

ak.nativeType = function(x) {
 return typeof x==='undefined' ? ak.UNDEFINED_T
                                : Object.prototype.toString.call(x);
};

ak.ARRAY_T    = ak.nativeType([]);
ak.BOOLEAN_T  = ak.nativeType(false);
ak.FUNCTION_T = ak.nativeType(function(){});
ak.NUMBER_T   = ak.nativeType(0);
ak.OBJECT_T   = ak.nativeType({});
ak.STRING_T   = ak.nativeType('');

We can now trivially determine the native type of a value by comparing the result of passing it to ak.nativeType with the type constants created by passing the type exemplars to it, the special treatment of undefined values being required to cope with browser incompatibilities. A few native types are conspicuous by their absence but, quite frankly, I have no need of them.

We build upon this to distinguish between user defined types with ak.type, as shown in listing 3.

Listing 3: ak.type
ak.type = function(x) {
 var t, u;
 return (t=ak.nativeType(x))===ak.OBJECT_T
     && ak.nativeType(u=x.TYPE)===ak.STRING_T ? u : t;
};

Now this is a fairly terse bit of code, so it might not be immediately clear what it does. Rewriting it to use branching instead of short-circuiting the logical-and operator makes things a little clearer, as shown in listing 4.

Listing 4: A branching ak.type
ak.type = function(x) {
 var t = ak.nativeType(x);
 var u;

 if(t===ak.OBJECT_T) {
  u = x.TYPE;
  if(ak.nativeType(u)===ak.STRING_T) return u;
 }
 return t;
};

So if the argument is an object and has a member called TYPE that is a string, that string is returned. Otherwise the native type of the argument is returned. And no, I don't think that avoiding multiple exit points makes code any clearer. If anything I believe quite the opposite, so shut up.
Anyway, moving on, we shall rely upon objects honestly self-declaring their types with unique strings in order to shoe-horn a half-baked mechanism for identifying user defined types into JavaScript. Hardly perfect but, so long as we're careful, probably good enough.
You may think that this is rather redundant given that JavaScript object constructors can be used to identify types. However, identifying types with strings has one major advantage; strings can be used to index members of objects. By exploiting this we can implement a dynamic dispatch mechanism that will satisfy most of our overloading requirements.
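For example, with a hypothetical ak.VECTOR_T type string, an object self-declaring its type would look something like:

```javascript
// ak.nativeType and ak.type as defined in listings 2 and 3.
var ak = {};
ak.UNDEFINED_T = 'ak.undefined';

ak.nativeType = function(x) {
 return typeof x==='undefined' ? ak.UNDEFINED_T
                               : Object.prototype.toString.call(x);
};

ak.NUMBER_T = ak.nativeType(0);
ak.OBJECT_T = ak.nativeType({});
ak.STRING_T = ak.nativeType('');

ak.type = function(x) {
 var t, u;
 return (t=ak.nativeType(x))===ak.OBJECT_T
     && ak.nativeType(u=x.TYPE)===ak.STRING_T ? u : t;
};

// A hypothetical user defined type honestly self-declaring itself
// with a unique string via its TYPE member.
ak.VECTOR_T = 'ak.vector';
var v = {TYPE: ak.VECTOR_T, elems: [1, 2, 3]};

console.log(ak.type(v));   // the self-declared type string
console.log(ak.type(42));  // falls back to the native type
```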

The basic idea is that functions we wish to overload don't actually perform any computation, but rather forward their arguments to other functions based on the types of those arguments. Specifically, these functions will be members of the overloaded function (remember, in JavaScript functions are objects too).
For example
function unop(x) { return unop[ak.type(x)](x); }

and
function binop(x, y) { return binop[ak.type(x)][ak.type(y)](x, y); }

The downside of this approach is that we can only overload functions on the types of the arguments, not the number. The upsides are that it's extremely straightforward and relatively efficient.
Fortunately we shall be most interested in overloading arithmetic operations for different mathematical types, and they very rarely require variable numbers of arguments. The only time we're likely to need a more flexible dispatching mechanism is when constructing objects and, given the type-specific and unchanging nature of construction, it'll be a simple task to build the dispatch logic manually for each type.
Even with our simple dispatch mechanism, manually adding overloads to arithmetic operations would be an arduous process since there are a lot of them. Fortunately, it's pretty easy to automate, as illustrated in listing 5.

Listing 5: ak.overload
ak.overload = function(f, t, o) {
 var n, i, s;

 if(ak.nativeType(t)===ak.ARRAY_T) {
  n = t.length - 1;
  for(i=0;i<n;++i) {
   s = t[i];
   if(ak.nativeType(s)!==ak.STRING_T) {
    throw new TypeError('invalid type name in ak.overload');
   }
   if(!f[s]) f[s] = {};
   f = f[s];
  }
  t = t[n];
 }

 if(ak.nativeType(t)!==ak.STRING_T) {
  throw new TypeError('invalid type name in ak.overload');
 }
 f[t] = o;
};

The arguments of ak.overload are expected to be the function to be overloaded, the type(s) of the argument(s) for which it should be overloaded and the function to which the call should be dispatched, respectively. If passed an array of type names it iterates over all but the last element, creating sub-objects named for the types if they don't already exist and descending into them at each step, and finally replaces the type argument with the array's last element. The overloaded function, or its relevant sub-object, then has the target function assigned as a member with the name of that type.
Using ak.overload makes setting up the dispatch logic pretty easy, as shown in listing 6.

Listing 6: Using ak.overload
function add(x, y) {
 return x+y;
}

ak.abs = function(x) {
 return ak.abs[ak.type(x)](x);
};

ak.add = function(lhs, rhs) {
 return ak.add[ak.type(lhs)][ak.type(rhs)](lhs, rhs);
};

ak.overload(ak.abs, ak.NUMBER_T, Math.abs);
ak.overload(ak.add, [ak.NUMBER_T, ak.NUMBER_T], add);
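Putting listings 2, 3, 5 and 6 together into a self-contained sketch shows the dispatch in action:

```javascript
var ak = {};
ak.UNDEFINED_T = 'ak.undefined';

ak.nativeType = function(x) {
 return typeof x==='undefined' ? ak.UNDEFINED_T
                               : Object.prototype.toString.call(x);
};

ak.ARRAY_T  = ak.nativeType([]);
ak.NUMBER_T = ak.nativeType(0);
ak.OBJECT_T = ak.nativeType({});
ak.STRING_T = ak.nativeType('');

ak.type = function(x) {
 var t, u;
 return (t=ak.nativeType(x))===ak.OBJECT_T
     && ak.nativeType(u=x.TYPE)===ak.STRING_T ? u : t;
};

ak.overload = function(f, t, o) {
 var n, i, s;
 if(ak.nativeType(t)===ak.ARRAY_T) {
  n = t.length - 1;
  for(i=0;i<n;++i) {
   s = t[i];
   if(ak.nativeType(s)!==ak.STRING_T) {
    throw new TypeError('invalid type name in ak.overload');
   }
   if(!f[s]) f[s] = {};
   f = f[s];
  }
  t = t[n];
 }
 if(ak.nativeType(t)!==ak.STRING_T) {
  throw new TypeError('invalid type name in ak.overload');
 }
 f[t] = o;
};

ak.abs = function(x) { return ak.abs[ak.type(x)](x); };
ak.add = function(lhs, rhs) { return ak.add[ak.type(lhs)][ak.type(rhs)](lhs, rhs); };

ak.overload(ak.abs, ak.NUMBER_T, Math.abs);
ak.overload(ak.add, [ak.NUMBER_T, ak.NUMBER_T], function(x, y) { return x+y; });

console.log(ak.abs(-3));   // dispatches to Math.abs
console.log(ak.add(1, 2)); // dispatches to the number-number add
```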

Now at this point I think I need to provide some justification for this design since, on the face of it, all I seem to have achieved is slowing down addition with a couple of needless member lookups. I'll use linear algebra as a motivating example, with \(s\) representing a scalar (i.e. number), \(\mathbf{v}\) a vector and \(\mathbf{M}\) a matrix (if you're seeing some weird markup in this sentence, please turn on scripting or, failing that, learn to read \(\LaTeX\)). All three of these are subject to arithmetic manipulation which I'd like to be able to express with the library.
With the dispatch mechanism, we can easily add support for vector addition such as \(\mathbf{v} + \mathbf{v}\) to ak.add.
But, you might argue, surely it would be easier to implement an add method on my supposed vector type.
Well perhaps, but I'd also like to support multiplying vectors by scalars: e.g. \(s \times \mathbf{v}\). If I were to support this with methods it'd mean I'd have to change the implementation of Number to support post-multiplication by a vector type, which I'm loath to do, for what I hope are obvious reasons.
But, you might further argue (showing an admirable tendency to not take my word for things, for which I congratulate you wholeheartedly), the result will be exactly the same as post-multiplying a vector with a scalar so why don't you just mandate that the scalar must go on the right hand side in such products?
Well I suppose I could, but I wouldn't be very happy about losing the choice to put it on either side. However, even if I were OK with it, I have a more compelling example. I'd like to support vector-matrix multiplication, both of the form \(\mathbf{M} \times \mathbf{v}\) and of the form \(\mathbf{v} \times \mathbf{M}\) which, in general, have different results.
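To make the point concrete, here's a sketch of how the dispatch tables accommodate the scalar on either side, with a hypothetical and deliberately naive vector type and the tables populated directly rather than via ak.overload:

```javascript
var ak = {};
ak.UNDEFINED_T = 'ak.undefined';

ak.nativeType = function(x) {
 return typeof x==='undefined' ? ak.UNDEFINED_T
                               : Object.prototype.toString.call(x);
};
ak.NUMBER_T = ak.nativeType(0);
ak.OBJECT_T = ak.nativeType({});
ak.STRING_T = ak.nativeType('');

ak.type = function(x) {
 var t, u;
 return (t=ak.nativeType(x))===ak.OBJECT_T
     && ak.nativeType(u=x.TYPE)===ak.STRING_T ? u : t;
};

ak.mul = function(lhs, rhs) { return ak.mul[ak.type(lhs)][ak.type(rhs)](lhs, rhs); };

// A hypothetical, deliberately naive, vector type.
ak.VECTOR_T = 'ak.vector';
function vector(elems) { return {TYPE: ak.VECTOR_T, elems: elems}; }

function mulNumber(x, y)       { return x*y; }
function mulNumberVector(s, v) { return vector(v.elems.map(function(e){return s*e;})); }
function mulVectorNumber(v, s) { return mulNumberVector(s, v); }

// Populate the dispatch tables for each argument order.
ak.mul[ak.NUMBER_T] = {};
ak.mul[ak.NUMBER_T][ak.NUMBER_T] = mulNumber;
ak.mul[ak.NUMBER_T][ak.VECTOR_T] = mulNumberVector;
ak.mul[ak.VECTOR_T] = {};
ak.mul[ak.VECTOR_T][ak.NUMBER_T] = mulVectorNumber;

var v = vector([1, 2, 3]);
console.log(ak.mul(2, v).elems);  // scalar on the left
console.log(ak.mul(v, 2).elems);  // scalar on the right
```

Crucially, neither order requires any change to Number, nor to the file in which the vector type is defined once matrices come along.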
There are a few different ways to deal with asymmetric arithmetic operations like this, so I'll quickly go over those I considered before settling on dynamic dispatch.

After implementing a matrix type I could go back to the vector type and add support for post-multiplication by matrices. Having to update existing source code files when adding new types is a splendid source of development and maintenance headaches, so I really don't want to do this.
Alternatively I could add a mulLeft method to the matrix type to provide support for left multiplication by vectors. I hope I don't have to convince anyone that this is as ugly as sin and best avoided.
If I were to use JavaScript's prototypal class mechanism, I suppose I could add a method to the vector type's prototype in the file defining the matrix type. I suspect that most users would find this rather surprising though (I know I would) and surprises in code are a Bad Thing. I'll be using closures to represent objects so as to avoid public member data, so this isn't really an option for me anyway.
Finally, I could continue to use external functions rather than methods, but name them to represent both the operation and the arguments, defining mulVectorMatrix, for example, in the matrix source code file. This approach is actually the most similar to C++ overloading, but only from the compiler's perspective since it's the compiler's responsibility to annotate function names with argument types. From the user's perspective it's not so good a match since here that responsibility falls upon the user instead.
So after a careful and, I might add, completely objective analysis I have settled upon the approach that is most similar for the user to the one that I'm already comfortable with. Hopefully my justification has won you over, but if it hasn't, tough; it's my library and that's how it's going to work.
In any event using basic arithmetic functions as examples is a great lead-in to...

Constants and Functions

One of the first things I noticed about JavaScript is that many of the constants that are useful in numerical computing are nowhere to be found in its Math and Number objects. After adding them to my ak object it struck me that, rather than require users to remember in which of the three objects that any particular numerical constant resides, it'd be simpler to add all of those that I need to my own.
The full set of numeric constants defined in ak are
ak.DEC_DIG    = 15;
ak.DEC_MAX    = Math.pow(10, ak.DEC_DIG)-1;
ak.E          = Math.E;
ak.EPSILON    = Math.pow(2, -52);
ak.GAMMA      = 5.7721566490153286e-1;
ak.INFINITY   = Number.POSITIVE_INFINITY;
ak.INT_DIG    = 53;
ak.INT_MAX    = Math.pow(2, ak.INT_DIG)-1;
ak.MAX_VALUE  = Number.MAX_VALUE;
ak.MIN_NORMAL = Math.pow(2, -1022);
ak.MIN_VALUE  = Number.MIN_VALUE;
ak.NaN        = Number.NaN;
ak.PHI        = 0.5*(1+Math.sqrt(5));
ak.PI         = Math.PI;

Those that are not simply synonyms for constants already defined by JavaScript objects are

ak.DEC_DIG     The number of decimal digits that can be exactly represented.
ak.DEC_MAX     The largest integer with ak.DEC_DIG decimal digits.
ak.EPSILON     The difference between one and the smallest representable number greater than one.
ak.GAMMA     Euler's constant.
ak.INT_DIG     The number of binary digits that can be exactly represented as integers.
ak.INT_MAX     The largest integer with ak.INT_DIG binary digits.
ak.MIN_NORMAL     The smallest number that has full double precision.
ak.PHI     The Golden Ratio.

At this point it isn't really necessary to know why these constants are useful, just that they are; we'll learn of their utility in due time.
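That said, the defining properties of ak.EPSILON and ak.INT_MAX at least can be checked directly:

```javascript
var ak = {};
ak.EPSILON = Math.pow(2, -52);
ak.INT_DIG = 53;
ak.INT_MAX = Math.pow(2, ak.INT_DIG)-1;

// ak.EPSILON is the gap between one and its successor: adding it to
// one is detectable but adding half of it is rounded away.
console.log(1 + ak.EPSILON > 1);      // true
console.log(1 + ak.EPSILON/2 === 1);  // true

// Every integer up to ak.INT_MAX is exactly representable; one step
// beyond it the gaps between representable numbers exceed one.
console.log(ak.INT_MAX + 1 === ak.INT_MAX + 2);  // true!
```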

Most of the functions in ak are there to set up the overloading logic for common arithmetic operations, but there a few that aren't and we'll cover those first.
The first three are ak.ceil (round up), ak.floor (round down) and ak.round (round to nearest) and are, more or less, synonyms for the same functions in Math. This may seem a bit pointless, but there are two good reasons for copying them into ak.
Firstly, at least one major browser has got Math.round wrong. This might seem surprising as the specification is pretty simple; round numbers ending in .5 up and all others to the nearest integer. So simple that we can implement it with a single line of code, as illustrated in listing 7.

Listing 7: Math.round
Math.round = function(x) {
 return Math.floor(x+0.5);
};

The specification pretty much dictates this implementation, right?

Wrong.

Don't beat yourself up if you can't see the error in this code as it's extremely subtle and you've almost certainly never been bitten by it. I'm willing to bet that the team behind the browser in which I noticed the problem haven't been bitten either and are using a functionally equivalent implementation.
The bug stems from the fact that JavaScript uses double precision floating point numbers, which have limited precision. This means that numbers above a certain magnitude cannot have any digits after the decimal point. For the smallest of such numbers, namely those having a units digit, adding 0.5 will cause the processor to perform a rounding operation before JavaScript sees the result. As a consequence, integers of this magnitude may be rounded to different integers which will then be unaffected by a subsequent call to floor. The ak.round function is a synonym for Math.round if it doesn't have this problem and is a correct implementation if it does.
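We can see the bug in action at \(2^{52}+1\), among the smallest such integers, and sketch one possible correct implementation that avoids the troublesome addition by comparing the fractional part to one half instead (this is an illustration, not necessarily ak.round's actual definition):

```javascript
// The naive implementation from listing 7.
function naiveRound(x) { return Math.floor(x+0.5); }

// 2^52+1 has a units digit but no fractional digits, so adding 0.5
// forces the processor to round before floor ever sees the result.
var n = Math.pow(2, 52) + 1;
console.log(naiveRound(n) === n);  // false: the integer has changed!

// A sketch of a correct implementation: no addition, so no premature
// rounding.
function round(x) {
 var f = Math.floor(x);
 return x-f >= 0.5 ? f+1 : f;
}

console.log(round(n) === n);  // true
console.log(round(2.5));      // 3: halves round up
console.log(round(-2.5));     // -2: halves round up here too
```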

The second reason is that there is one common rounding mode missing from JavaScript; truncation, or rounding towards zero. This is provided with the ak.trunc function, as shown in listing 8.

Listing 8: ak.trunc
ak.trunc = function(x) {
 return x>=0 ? ak.floor(x) : ak.ceil(x);
};
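Since ak.floor and ak.ceil coincide with Math's functions for Number arguments, a self-contained sketch of its behaviour is simply:

```javascript
// For Number arguments ak.floor and ak.ceil are synonyms for Math's
// functions, so truncation reduces to:
function trunc(x) {
 return x>=0 ? Math.floor(x) : Math.ceil(x);
}

// Truncation always rounds towards zero, unlike floor which always
// rounds down.
console.log(trunc(2.7));   // 2
console.log(trunc(-2.7));  // -2, whereas Math.floor(-2.7) is -3
```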

Having provided corrected and alternative rounding functions in ak it seemed reasonable to me to also provide synonyms for the remaining JavaScript rounding functions.

The core single argument overloaded functions in ak are

ak.abs(x)     Absolute value     \(|x|\)
ak.acos(x)     Inverse cosine     \(\arccos x\)
ak.asin(x)     Inverse sine     \(\arcsin x\)
ak.atan(x)     Inverse tangent     \(\arctan x\)
ak.cos(x)     Cosine     \(\cos x\)
ak.cosh(x)     Hyperbolic cosine     \(\cosh x\)
ak.exp(x)     Exponential     \(e^x\)
ak.inv(x)     Inverse     \(x^{-1}\)
ak.log(x)     Logarithm     \(\ln x\)
ak.neg(x)     Negation     \(-x\)
ak.sin(x)     Sine     \(\sin x\)
ak.sinh(x)     Hyperbolic sine     \(\sinh x\)
ak.tan(x)     Tangent     \(\tan x\)
ak.tanh(x)     Hyperbolic tangent     \(\tanh x\)

and the two argument functions are

ak.add(x, y)     Addition     \(x + y\)
ak.cmp(x, y)     Compare     \( \begin{cases} -a, &x < y\\ 0, &x = y\\ b, &x > y\\ \mathrm{NaN}, &\mathrm{incomparable} \end{cases} \)     for unspecified positive \(a\) and \(b\)
ak.dist(x, y)     Distance     \(|x - y|\)
ak.div(x, y)     Division     \(x \div y\)
ak.eq(x, y)     Equal to     \(x = y\)
ak.ge(x, y)     Greater than or equal to     \(x \geqslant y\)
ak.gt(x, y)     Greater than     \(x > y\)
ak.le(x, y)     Less than or equal to     \(x \leqslant y\)
ak.lt(x, y)     Less than     \(x < y\)
ak.mod(x, y)     Remainder     \(x \: \% \: y\)
ak.mul(x, y)     Multiplication     \(x \times y\)
ak.ne(x, y)     Not equal to     \(x \neq y\)
ak.pow(x, y)     Power     \(x^y\)
ak.sub(x, y)     Subtraction     \(x - y\)

Note that my use of the term negation to mean changing the sign of a number is technically incorrect since its definition is the opposite of a logical statement not an arithmetic one. However, I have heard it used this way before and I don't think that it'll lead to any confusion.

There are a few standard (and one completely non-standard) mathematical symbols in these definitions that could bear some explanation.
Vertical bars on either side of an expression stand for the absolute value of that expression. A left curly brace denotes a conditional expression, with each line giving a result followed by the condition that yields it. Finally, since there's no standard mathematical equivalent of the % operator, I've used the \(\%\) symbol to represent it.

The final function, ak.diff, is a normalised difference calculation, shown in listing 9.

Listing 9: ak.diff
ak.diff = function(lhs, rhs) {
 var num = ak.dist(lhs,rhs);
 var den = 1+Math.min(ak.abs(lhs),ak.abs(rhs));
 return num/den;
};

If the magnitudes of the arguments are both much larger than one then the result of ak.diff will be close to the relative distance between them. If either's is much smaller than one then it'll be close to the absolute distance between them. Why such behaviour is desirable we shall discuss next time, but this is a nice example of one of the advantages of the dynamic dispatch approach to overloading. Being implemented with ak overloaded functions, ak.diff does not need to be overloaded itself since it will work as it is for any type that has meaningful notions of distance and magnitude (assuming that the overloads in question return Number values, as will generally be the case).
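Specialised to Number arguments, for which ak.dist(x, y) is \(|x-y|\) and ak.abs is Math.abs, its behaviour can be sketched as:

```javascript
// ak.diff specialised to Numbers: the distance between the arguments
// normalised by one plus the smaller of their magnitudes.
function diff(lhs, rhs) {
 var num = Math.abs(lhs-rhs);
 var den = 1+Math.min(Math.abs(lhs), Math.abs(rhs));
 return num/den;
}

// For large magnitudes the result approaches the relative distance...
console.log(diff(1e6, 1e6+1));  // roughly 1e-6
// ...and for small magnitudes the absolute distance.
console.log(diff(0, 1e-6));     // exactly 1e-6
```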

Expressing arithmetic expressions with functions rather than operators is a rather tedious business, so ak.calc helpfully implements a reverse Polish notation, or RPN, calculator.
RPN, if you're not already familiar with it, puts operators after their arguments so
\[ x - y \]
is expressed as
\[ x \; y \; - \]
Reading an expression from left to right we immediately replace any operation of this form with its result, which entirely removes the need for parentheses. For example
\[ (x + 1) \times (y - 1) \]
becomes
\[ x \; 1 \;+ \; y \; 1 \; - \times \]
Once you get used to it, RPN is a pretty neat way to represent arithmetic expressions. But to be brutally honest, that's not why I chose it. I chose it because it's far, far easier to implement than the familiar infix notation and, fundamentally, I'm a lazy git.
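By way of illustration, a bare-bones sketch of such a calculator, supporting just a handful of the operators and using a hypothetical evaluate function rather than ak.calc's actual interface, might look like:

```javascript
// A bare-bones RPN evaluator: numbers are pushed onto a stack and
// operator strings pop their arguments and push their result.
function evaluate(tokens) {
 var binary = {
  '+': function(x, y) { return x+y; },
  '-': function(x, y) { return x-y; },
  '*': function(x, y) { return x*y; },
  '/': function(x, y) { return x/y; }
 };
 var stack = [];
 var i, t, y;

 for(i=0;i<tokens.length;++i) {
  t = tokens[i];
  if(typeof t==='number') stack.push(t);
  else if(t==='~')        stack.push(-stack.pop());   // unary negation
  else if(binary[t])     {y = stack.pop(); stack.push(binary[t](stack.pop(), y));}
  else throw new Error('unknown token ' + t);
 }
 if(stack.length!==1) throw new Error('malformed expression');
 return stack.pop();
}

// (2 + 1) * (3 - 1) expressed in RPN.
console.log(evaluate([2, 1, '+', 3, 1, '-', '*']));  // 6
```

Note how little machinery is needed; a single pass and a stack, with never a parenthesis in sight.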

Now we can't very well pass actual operators as arguments to ak.calc so I've chosen to represent them with strings, as listed below.

'||'   →   ak.abs                  '+'   →   ak.add
'acos'   →   ak.acos                  'cmp'   →   ak.cmp
'asin'   →   ak.asin                  'diff'   →   ak.diff
'atan'   →   ak.atan                  'dist'   →   ak.dist
'cos'   →   ak.cos                  '/'   →   ak.div
'cosh'   →   ak.cosh                  '='   →   ak.eq
'e^'   →   ak.exp                  '>='   →   ak.ge
'1/'   →   ak.inv                  '>'   →   ak.gt
'log'   →   ak.log                  '<='   →   ak.le
'~'   →   ak.neg                  '<'   →   ak.lt
'sin'   →   ak.sin                  '%'   →   ak.mod
'sinh'   →   ak.sinh                  '*'   →   ak.mul
'sqrt'   →   ak.sqrt                  '!='   →   ak.ne
'tan'   →   ak.tan                  '^'   →   ak.pow
'tanh'   →   ak.tanh                  '-'   →   ak.sub

Note that negation is represented by a tilde ('~') rather than a dash ('-') since there's no way to disambiguate between passing one and two arguments to the latter and it is consequently used only for subtraction.
Direct use of the overloaded functions is compared to ak.calc in program 3.

Program 3: Using ak.calc

Which version is easier to read is probably just a matter of taste and I shall use both forms according to which I think simpler for any given expression.

The complete source for the core functionality can be found in AK.js.

With the infrastructure out of the way, we're ready to move on to the rather more interesting business of numerical computing, starting with the next post.

References

[1] Crockford, D. JavaScript: The Good Parts, O'Reilly, 2008.
