You’re Going To Have To Think! Why Rational Arithmetic Won’t Cure Your Floating Point Blues


With fixed point arithmetic, arbitrary precision or otherwise, failing to solve the perceived problems of floating point[1][2], that is to say rounding errors, the question remains as to whether we can do any better. A tempting approach is to look to the rationals, rather than the integers, as a basis for our numeric calculations.
The rules of rational arithmetic are pretty straightforward. Given two rationals \(a_0 / b_0\) and \(a_1 / b_1\) we have
\[ \begin{align*} \frac{a_0}{b_0} + \frac{a_1}{b_1} &= \frac{a_0 \times b_1 + a_1 \times b_0}{b_0 \times b_1}\\ \frac{a_0}{b_0} - \frac{a_1}{b_1} &= \frac{a_0 \times b_1 - a_1 \times b_0}{b_0 \times b_1}\\ \frac{a_0}{b_0} \times \frac{a_1}{b_1} &= \frac{a_0 \times a_1}{b_0 \times b_1}\\ \frac{a_0}{b_0} \div \frac{a_1}{b_1} &= \frac{a_0 \times b_1}{b_0 \times a_1} \end{align*} \]
One enormous advantage of rational numbers is that, provided we do not overflow the integers representing the numerator (the top of the fraction) and the denominator (the bottom) the order of execution of these arithmetic operations is irrelevant; the answer will always be the same.

The only thing we need to watch out for is the fact that there are many ways of writing down the same number; 1/2, 2/4 and 3/6 all represent the same number, for example. We shall ensure that our representation is unique by insisting that the numerator and the denominator are the smallest numbers that yield the same rational, or equivalently have no common factors, and that the denominator is positive.
The latter condition is relatively straightforward to maintain. The former requires an algorithm to determine the highest common factor, or HCF, of a pair of numbers, the greatest positive integer that wholly divides both. We can subsequently divide out that factor and return any rational to its simplest form. Fortunately one such algorithm has been handed down to us from antiquity, courtesy of the great Euclid and it proceeds as follows.

Euclid's Algorithm

If the two numbers are equal, their value is the HCF.
If the smaller exactly divides the larger, the smaller is the HCF.
Otherwise, divide the larger by the smaller, and make note of the remainder. The HCF of the original numbers is equal to the HCF of the smaller number and the remainder.
In mathematical notation this can be expressed as
Derivation 1: Shared Common Factors
First, let us assume that \(x_0\) and \(x_1\) share a common factor of \(c\). We can therefore rewrite the equation as
\[ c x_1^\prime = a \times c x_0^\prime + b \]
for some \(x_0^\prime\) and \(x_1^\prime\).
Now since the left hand side is wholly divisible by \(c\) then so must the right hand side and furthermore since the first term on the right hand side is wholly divisible by \(c\) then so must be the second term, allowing us to express the equation as
\[ c x_1^\prime = a \times c x_0^\prime + c b^\prime \]
Second, let us assume that \(x_0\) and \(b\) share a different common factor of \(d\). We can now rewrite the equation as
\[ x_1^\prime = a \times d x_0^{\prime\prime} + d b^{\prime\prime} \]
for some \(x_0^{\prime\prime}\) and \(b^{\prime\prime}\).
But now the right hand side is wholly divisible by \(d\) and so therefore must be the left hand side.
Hence any factor shared by \(x_0\) and \(x_1\) must be shared by \(x_0\) and \(b\), and any factor shared by \(x_0\) and \(b\) must be shared by \(x_0\) and \(x_1\) and they must consequently have the same highest common factor.
\[ \begin{align*} \mathrm{if} \quad & x_1 = a \times x_0 + b \\ \mathrm{where} \quad & a > 0 \; \land \; 0 < b < x_0 \\ \mathrm{then} \quad & \mathrm{HCF}(x_1, x_0) = \mathrm{HCF}(x_0, b) \end{align*} \]
where \(\land\) means and.
Recursively applying these rules is guaranteed to terminate and we can thus determine the HCF.
For example, applying the Euclidean algorithm to 2163 and 1785 yields the following steps
\[ \begin{align*} 2,163 &= 1 \times 1785 + 378 \\ 1,785 &= 4 \times 378 + 273 \\ 378 &= 1 \times 273 + 105 \\ 273 &= 2 \times 105 + 63 \\ 105 &= 1 \times 63 + 42 \\ 63 &= 1 \times 42 + 21 \\ 42 &= 2 \times 21 \end{align*} \]
and hence the HCF of 2163 and 1785 is 21, a fact that is clear if we look at their prime factorisations.
\[ \begin{align*} 2,163 &= 3 \times 7 \times 103 \\ 1,785 &= 3 \times 5 \times 7 \times 17 \end{align*} \]
As it happens, this is simply a special case of the more general result that for any integers \(x_0\), \(x_1\), \(a\) and \(b\) where
\[ x_1 = a \times x_0 + b \]
then \(x_0\) and \(b\) must have the same HCF as \(x_0\) and \(x_1\), as shown in derivation 1.

As a consequence, it should not be surprising that the algorithm converges faster if we round the result of the division to the nearest integer rather than round down, consequently admitting negative remainders, and use the absolute value of the remainder in the following step.
Applying this optimisation to the same pair of numbers yields the same result in fewer steps.
\[ \begin{align*} 2,163 &= 1 \times 1,785 + 378 \\ 1,785 &= 5 \times 378 - 105 \\ 378 &= 4 \times 105 - 42 \\ 105 &= 2 \times 42 + 21 \\ 42 &= 2 \times 21 \end{align*} \]
An implementation of Euclid's algorithm is given in listing 1.

Listing 1: ak.hcf
ak.hcf = function(i, j) {
 var r;

 i = Math.abs(ak.trunc(i));
 j = Math.abs(ak.trunc(j));

 if(i<j) {r=i; i=j; j=r;}

 if(isNaN(i+j))   return ak.NaN;
 if(!isFinite(i)) return j;

 while(j>0) {
  r = i%j;
  i = j;
  j = (r+r<j) ? r : j-r;
 return i;

A Rational Class

Armed with Euclid's algorithm and the rules of rational arithmetic we're ready to implement a rational class, as illustrated in listing 2.

Listing 2: ak.rational
ak.RATIONAL_T = 'ak.rational';

function Rational(){}
Rational.prototype = {TYPE: ak.RATIONAL_T, valueOf: function(){return ak.NaN;}};

ak.rational = function() {
 var r     = new Rational();
 var arg0  = arguments[0];
 var state = {num: 0, den: 1};

 constructors[ak.nativeType(arg0)](state, arg0, arguments);

 r.num = function() {return state.num;};
 r.den = function() {return state.den;};

 r.toNumber = function() {return state.num/state.den;};
 r.toString = function() {return state.num.toFixed(0) +'/'+ state.den.toFixed(0);};

 return Object.freeze(r);

var constructors = {};

This class follows the same pattern as those we have already implemented, the state is initialised by dynamically dispatching calls to a constructors object based on the types of the arguments, then it's validated with a call to normalise, simple property accessors and type conversions are defined and the result is frozen.

ak.rational Constructors

The constructors are fairly simple, as shown in listing 3.

Listing 3: ak.rational Constructors
constructors[ak.NUMBER_T] = function(state, num, args) {
 var arg1 = args[1];
 state.num = ak.trunc(num);
 constructors[ak.NUMBER_T][ak.nativeType(arg1)](state, arg1);

constructors[ak.NUMBER_T][ak.NUMBER_T] = function(state, den) {
 state.den = ak.trunc(den);

constructors[ak.NUMBER_T][ak.UNDEFINED_T] = function() {

constructors[ak.OBJECT_T] = function(state, obj) {
 state.num = (ak.nativeType(obj.num)===ak.FUNCTION_T)
           ? ak.trunc(obj.num())
           : ak.trunc(obj.num);

 state.den = (ak.nativeType(obj.den)===ak.FUNCTION_T)
           ? ak.trunc(obj.den())
           : ak.trunc(obj.den);

The Number constructors expect a numerator followed by an optional denominator. The Object constructor allows initialisation with another ak.rational or with an initialiser object with fields names for the properties.

The implementation of normalise is shown in listing 4.

Listing 4: ak.rational normalise
function normalise(state) {
 var hcf;

 if(state.den<0) {
  state.num *= -1;
  state.den *= -1;

 if(Math.abs(state.num)>ak.INT_MAX) state.num *= ak.INFINITY;
 if(state.den>ak.INT_MAX)           state.den  = ak.INFINITY;

 if(isNaN(state.num/state.den)) {
  state.num = ak.NaN;
  state.den = ak.NaN;
 else if(state.den===0 || !isFinite(state.num)) {
  state.num /= state.den;
  state.den  = 1;
 else if(state.num===0 || !isFinite(state.den)) {
  state.num *= 0;
  state.den  = 1;
 else {
  hcf = ak.hcf(state.num, state.den);
  state.num /= hcf;
  state.den /= hcf;

Firstly, we force the denominator to be positive and overflow both the numerator and the denominator to the appropriately signed infinity if their absolute values exceed the maximum integer. Then, if either is NaN or both are infinite, all of which yield NaN from division, the state is set to NaN. Next we handle infinite and zero valued rationals, for which we set the denominator to one. Finally, we compute the HCF and divide both the numerator and denominator by it to guarantee that the rational is in its simplest form.

ak.rational Overloads

The methods are trivial, so I'll skip over them and get straight to the overloaded functions where, as before, the greater part of the functionality resides.
As usual abs and neg are simplest, as shown in listing 5.

Listing 5: ak.rational abs And neg
function abs(r) {
 return ak.rational(Math.abs(r.num()), r.den());
ak.overload(ak.abs, ak.RATIONAL_T, abs);

function neg(r) {
 return ak.rational(-r.num(), r.den());
ak.overload(ak.neg, ak.RATIONAL_T, neg);

Unlike most types, multiplication and division are equally straightforward as illustrated in listing 6.

Listing 6: ak.rational mul And div
function mul(r0, r1) {
 return ak.rational(r0.num()*r1.num(), r0.den()*r1.den());
ak.overload(ak.mul, [ak.RATIONAL_T, ak.RATIONAL_T], mul);

function div(r0, r1) {
 return ak.rational(r0.num()*r1.den(), r0.den()*r1.num());
ak.overload(ak.div, [ak.RATIONAL_T, ak.RATIONAL_T], div);

The rules for rational addition and subtraction are slightly more complicated than those for multiplication and division and so therefore are their implementations, given in listing 7.

Listing 7: ak.rational add And sub
function add(r0, r1) {
 var n0 = r0.num()*r1.den();
 var n1 = r0.den()*r1.num();

 if(Math.abs(n0)>ak.INT_MAX) n0 *= ak.INFINITY;
 if(Math.abs(n1)>ak.INT_MAX) n1 *= ak.INFINITY;

 return ak.rational(n0+n1, r0.den()*r1.den());
ak.overload(ak.add, [ak.RATIONAL_T, ak.RATIONAL_T], add);

function sub(r0, r1) {
 var n0 = r0.num()*r1.den();
 var n1 = r0.den()*r1.num();

 if(Math.abs(n0)>ak.INT_MAX) n0 *= ak.INFINITY;
 if(Math.abs(n1)>ak.INT_MAX) n1 *= ak.INFINITY;

 return ak.rational(n0-n1, r0.den()*r1.den());
ak.overload(ak.sub, [ak.RATIONAL_T, ak.RATIONAL_T], sub);

Unusually we have to deal with overflow explicitly here, rather than rely upon the result's constructor. This is because it's possible for an intermediate value to overflow and suffer rounding errors only to be drawn back into the realm of integers when combined with the other. There's not much point implementing a rational type if it can still suffer from rounding errors, so we much ensure that the intermediate values overflow to infinity so that such errors do not arise.

Equality and inequality comparisons are simple and exact, but relative comparisons suffer from the same sort of problems and must therefore take the same precautions, as illustrated by lising 8.

Listing 8: ak.rational Comparisons
function eq(r0, r1) {
 return r0.num()===r1.num() && r0.den()===r1.den();
ak.overload(ak.eq, [ak.RATIONAL_T, ak.RATIONAL_T], eq);

function cmp(r0, r1) {
 var lhs, rhs;

 if(r0.num()===r1.num() && r0.den()===r1.den()) return 0;

 lhs = r0.num()*r1.den();
 rhs = r0.den()*r1.num();

 if(Math.abs(lhs)>ak.INT_MAX) lhs *= ak.INFINITY;
 if(Math.abs(rhs)>ak.INT_MAX) rhs *= ak.INFINITY;

 return lhs - rhs;
ak.overload(ak.cmp, [ak.RATIONAL_T, ak.RATIONAL_T], cmp);

The unfortunate consequence of this is that it is sometimes not possible to determine if one rational value is, say, less than another. As annoying as this might be, I think that it's infinitely preferable to the possibility of silently failing to compare one valid rational value as, say, greater than another.

Listing 9 shows all of the ak.rational overloads.

Listing 9: ak.rational Overloads
ak.overload(ak.abs,  ak.RATIONAL_T, abs);
ak.overload(ak.inv,  ak.RATIONAL_T, inv);
ak.overload(ak.neg,  ak.RATIONAL_T, neg);

ak.overload(ak.add, [ak.RATIONAL_T, ak.RATIONAL_T], add);
ak.overload(ak.cmp, [ak.RATIONAL_T, ak.RATIONAL_T], cmp);
ak.overload(ak.div, [ak.RATIONAL_T, ak.RATIONAL_T], div);
ak.overload(ak.eq,  [ak.RATIONAL_T, ak.RATIONAL_T], eq);
ak.overload(,  [ak.RATIONAL_T, ak.RATIONAL_T], ge);
ak.overload(,  [ak.RATIONAL_T, ak.RATIONAL_T], gt);
ak.overload(ak.le,  [ak.RATIONAL_T, ak.RATIONAL_T], le);
ak.overload(,  [ak.RATIONAL_T, ak.RATIONAL_T], lt);
ak.overload(ak.mod, [ak.RATIONAL_T, ak.RATIONAL_T], mod);
ak.overload(ak.mul, [ak.RATIONAL_T, ak.RATIONAL_T], mul);
ak.overload(,  [ak.RATIONAL_T, ak.RATIONAL_T], ne);
ak.overload(ak.sub, [ak.RATIONAL_T, ak.RATIONAL_T], sub);

The full definition of ak.rational is provided in Rational.js

The Strengths Of Rationals

The great advantage of rational arithmetic is that, overflow aside, multiplication and division are exact, as illustrated by program 1.

Program 1: Rational Multiplication And Division

Here I use the RPN calculator ak.calc which forwards its arithmetic operations to our overloads.

The Weaknesses Of Rationals

As we have already had to deal with, rationals are particularly prone to overflow during addition and subtraction. If we are willing to accept a significant increase in the computational expense of the division heavy HCF calculation, this can be fixed by using bignums to represent the numerator and denominator.
Unfortunately, rationals suffer from a still greater problem, namely that not all numbers are rational.

Derivation 2: The Irrationality Of √2
Let us assume that the are integers \(a\) and \(b\) such that
\[ \frac{a}{b} = \sqrt{2} \]
and that we have cancelled all common factors so that their HCF is 1.
Trivially, we have
\[ \begin{align*} \frac{a^2}{b^2} &= 2 \\ a^2 &= 2 b^2 \end{align*} \]
Now any odd number multiplied by itself results in another odd number, so \(a\) must be even and hence equal to \(2a^\prime\) for some \(a^\prime\). Hence
\[ \begin{align*} (2a^\prime)^2 &= 2b^2 \\ b^2 &= 2 {a^\prime}^2 \end{align*} \]
But this similarly means that \(b\) must be even and that consequently \(a\) and \(b\) have a common factor of 2; a contradiction.
The square root of 2 cannot, therefore, be rational.
Perhaps the most famous irrational number is the square root of two, the proof of which's irrationality so annoyed the Pythagoreans that they reputedly drowned its discoverer.
That it cannot be exactly represented with a rational is shown in derivation 2.

We cannot therefore exactly represent any such number with our rational type. However, it is also true that for every irrational number there are an infinite number of rationals to be found within any given positive distance, no matter how small.
Perhaps we could represent an irrational with one of its rational neighbours?

Well, yes we could, but we’d have to decide exactly how distant that rational should be and, whatever distance we choose, we could match its accuracy with a floating point representation of sufficient precision.

So whilst, overflow aside, rational numbers are supremely accurate for addition, subtraction, multiplication and division and are consequently not sensitive to the order of execution of these operations, they require no less care and attention than floating point number types the instant we start mucking about with irrational numbers.

Unfortunately for rational arithmetic's candidacy as a replacement for floating point, irrational numbers turn up all the time in the kinds of problems we're interested in solving, even something as simple as solving quadratic equations. So, as neat as rational types undoubtedly are, they are by no means a floating point killer.


[1] You're Going To Have To Think! Why Fixed Point Arithmetic Won’t Cure Your Floating Point Blues,, 2013.

[2] You're Going To Have To Think! Why Arbitrary Precision Arithmetic Won’t Cure Your Floating Point Blues,, 2013.

Based upon an article I wrote for ACCU's Overload magazine.

Leave a comment