Over the last few months we have been looking at how we might approximate the solutions to ordinary differential equations, or ODEs, which define the derivative of one variable with respect to another with a function of them both. Firstly with the first order Euler method[1], then with the second order midpoint method[2] and finally with the generalised Runge-Kutta method[3].
Unfortunately all of these approaches require the step length to be fixed and specified in advance, ignoring any information that we might use to adjust it during the iteration in order to better trade off the efficiency and accuracy of the approximation. In this post we shall try to automatically modify the step lengths to yield an optimal, or at least reasonable, balance.

### A Brief Refresher

Recall that first order ODEs take the form
$y^\prime(x) = f(x, y(x))$
where $$y^\prime$$ is the derivative of $$y$$ with respect to $$x$$, and that higher order ODEs can be expressed as first order ODEs of vectors $$\mathbf{z}$$ containing $$y$$ and its derivatives
$z_i(x) = y^{(i)}(x)$
where $$y^{(i)}$$ is the $$i^\mathrm{th}$$ derivative of $$y$$ and $$y^{(0)}$$ equals $$y$$, using
$\mathbf{z}^\prime(x) = \mathbf{f}(x, \mathbf{z})$
where
$f_i\left(x, \mathbf{z}(x)\right) = \begin{cases} f\left(x, \mathbf{z}(x)\right) & i = n-1\\ z_{i+1}(x) & \text{otherwise} \end{cases}$
so that $$f$$ defines the $$n^\mathrm{th}$$ derivative of $$y$$ in terms of $$x$$, $$y$$ and its lower order derivatives.

The Euler method approximates the solutions to first order ODEs with
$y(x + \Delta x) = y(x) + \Delta x \times f(x, y(x)) + O\left(\Delta x^2\right)$
where $$O$$ represents an error term with a magnitude of the order of the term in the following brackets. The midpoint method does so with
$y(x + \Delta x) = y\left(x\right) + \Delta x \times f\left(x + \tfrac12 \times \Delta x, y(x) + \tfrac12 \times \Delta x \times f\left(x, y(x)\right)\right) + O\left(\Delta x^3\right)$
and the Runge-Kutta method with
\begin{align*} k_1 &= f(x, y(x))\\ k_2 &= f\left(x + \tfrac12 \times \Delta x, y(x) + \tfrac12 \times \Delta x \times k_1\right)\\ k_3 &= f\left(x + \tfrac12 \times \Delta x, y(x) + \tfrac12 \times \Delta x \times k_2\right)\\ k_4 &= f\left(x + \Delta x, y(x) + \Delta x \times k_3\right)\\ y(x + \Delta x) &= y(x) + \tfrac16 \times \Delta x \times (k_1 + 2 \times k_2 + 2 \times k_3 + k_4) + O\left(\Delta x^5\right) \end{align*}
These are all examples of the generalised Runge-Kutta method which approximates the solution of first order ODEs with equations of the form
\begin{align*} k_i &= \begin{cases} f(x, y(x)) & i=1\\ f\left(x + c_i \times \Delta x, y(x) + \Delta x \times \left(\sum_{j=1}^{i-1} a_{i,j} \times k_j\right)\right) & \text{otherwise} \end{cases}\\ y(x + \Delta x) &\approx \bar{y}(x + \Delta x) = y(x) + \Delta x \times \sum_{i=1}^n b_i \times k_i \end{align*}
whose coefficients $$a_{i,j}$$, weights $$b_i$$ and nodes $$c_i$$ are typically represented with a Butcher tableau[4]
$\begin{array}{c|ccccc} 0\\ c_2 & a_{2,1}\\ c_3 & a_{3,1} & a_{3,2}\\ \vdots & \vdots & \vdots & \ddots\\ c_n & a_{n,1} & a_{n,2} & \dots &a_{n,n-1}\\ \hline & b_1 & b_2 & \dots & b_{n-1} & b_n \end{array}$

### Doubling Down

Now, suppose that we could find a second set of weights $$\hat{b}_i$$ such that the approximation
$y(x + \Delta x) \approx \hat{y}(x + \Delta x) = y(x) + \Delta x \times \sum_{i=1}^n \hat{b}_i \times k_i$
has errors proportional to a greater power of $$\Delta x$$ than $$\bar{y}$$. Then we could estimate the error in the latter's approximation with
$\left|\bar{y}(x + \Delta x) - y(x + \Delta x)\right| \approx \left|\bar{y}(x + \Delta x) - \hat{y}(x + \Delta x)\right|$
where the vertical bars stand for the absolute value of the term between them.
We can represent such pairs of Runge-Kutta methods with an extended Butcher tableau
$\begin{array}{c|ccccc} 0\\ c_2 & a_{2,1}\\ c_3 & a_{3,1} & a_{3,2}\\ \vdots & \vdots & \vdots & \ddots\\ c_n & a_{n,1} & a_{n,2} & \dots &a_{n,n-1}\\ \hline & b_1 & b_2 & \dots & b_{n-1} & b_n\\ & \hat{b}_1 & \hat{b}_2 & \dots & \hat{b}_{n-1} & \hat{b}_n \end{array}$
In practice it's tempting to use the more accurate method to approximate the solution and the less accurate method to estimate its error, even though this means that the latter is technically incorrect. For example, the Bogacki-Shampine method given by
$\begin{array}{c|cccc} 0\\ \frac12 & \frac12\\ \frac34 & 0 & \frac34\\ 1 & \frac29 & \frac13 & \frac49\\ \hline & \frac29 & \frac13 & \frac49 & 0\\ & \frac{7}{24} & \frac14 & \frac13 & \frac18 \end{array}$
has third order errors for the approximation, given by the first row of weights, and second order errors for the error estimate, given by the second.

To demonstrate it we shall again use our example first order ODE defined by
$y^\prime(x) = -x \times y(x)$
which has the solution
$y(x) = e^{-\frac12 \times x^2}$
for $$y(0)$$ equal to one, as done by program 1 which plots the exact solution in black and the approximate solution, plus or minus a large multiple of the estimated error, in red.

Program 1: Our Example First Order ODE

It so happens that we're missing a trick here. Specifically, since the first row of weights is the same as the last row of coefficients and the last node is equal to one, the first $$k_i$$ in one step is equal to the last in the previous step, saving us an evaluation of $$f$$, as demonstrated by program 2.

Program 2: Reusing The Last Step

Rather than simply reporting the estimated errors in the approximation we can use them to adjust the step length $$\Delta x$$ by setting lower and upper bounds, $$\epsilon_0$$ and $$\epsilon_1$$, on the normalised error
$\delta = \frac{\left|\bar{y}(x + \Delta x) - \hat{y}(x + \Delta x)\right|}{1 + \left|y(x)\right|}$
If it falls outside of the bounds then we can update the step length with
\begin{align*} \rho &= \left(\frac{\epsilon_0}\delta \times \frac{\epsilon_1}\delta\right)^\dfrac{1}{2 \times o}\\ \Delta x &\leftarrow \rho \times \Delta x \end{align*}
where $$o$$ is the order of magnitude of the approximate solution, meaning that we reduce it if the error is too large and increase it if it's small enough that we'd accept greater errors for the sake of efficiency. Specifically, this means that the updated step length should hopefully yield an error that is of the order of the geometric mean of the error bounds, being the square root of their product, since
$\left(\rho \times \Delta x\right)^o = \left(\frac{\epsilon_0}\delta \times \frac{\epsilon_1}\delta\right)^\frac{1}{2} \times \Delta x^o = \left(\epsilon_0 \times \epsilon_1\right)^\frac{1}{2} \times \frac{\Delta x^o}\delta$
and $$\Delta x^o$$ is the magnitude of the error that we anticipated and $$\delta$$ is approximately the error that we achieved.

If a step yields an estimated error that is better than the lower bound then we might as well take it before updating the step length, whereas if it's worse than the upper bound then we should try again from the same starting point after doing so, making sure that we reset the last value of $$k_i$$, as done in program 3.

This clearly takes significantly fewer steps to estimate $$y\left(x_n\right)$$ than the Runge-Kutta method, although the graph doesn't make it especially clear how accurate it is. Program 4 therefore prints the final errors and number of steps for progressively doubled initial numbers of steps.

Program 4: Approximation Errors

Reassuringly this consistently converges upon an error that is within an order of magnitude of the bounds after more or less the same number of actual steps given the initially proposed numbers.

### Implementation Details

In general this approach is known as the Fehlberg method[5] and is implemented by ak.fehlbergODE in listing 1.

Listing 1: ak.fehlbergODE
ak.fehlbergODE = function(f, e0, e1, order, a, b0, b1, c, steps) {
var tc = ak.nativeType(c);
var ts = ak.nativeType(steps);
var s0, s1, dx, alb, k, cyclic;

if(tc===ak.NUMBER_T && ts===ak.UNDEFINED_T) {
steps = c;
ts = ak.NUMBER_T;
tc = ak.UNDEFINED_T;
}

e0 = Math.abs(e0);
e1 = Math.abs(e1);
steps = ts===ak.UNDEFINED_T ? 1000000 : ak.floor(Math.abs(steps));

checkArgTypes(f, e0, e1, order, a, b0, b1, c, steps, tc);
checkTableau(a, b0, b1, c, tc);

if(tc===ak.UNDEFINED_T) c = makeNodes(a);
s0 = b0.reduce(function(s,x){return s+x;}, 0);
if(!isFinite(s0) || s0===0) {
throw new Error('invalid total step weight in ak.fehlbergODE');
}
s1 = b1.reduce(function(s,x){return s+x;}, 0);
if(!isFinite(s1) || s1===0) {
throw new Error('invalid total error weight in ak.fehlbergODE');
}
dx = Math.pow(e0*e1, 1/(2*order));
if(dx===0) throw new Error('invalid step length in ak.fehlbergODE');

alb = makeLB(a);
a = a.map(function(r){return r.slice(0);});
b0 = b0.slice(0);
b1 = b1.slice(0);
if(tc!==ak.UNDEFINED_T) c = c.slice(0);
k = new Array(a.length+1);
cyclic = isCyclic(a, b0, c, s0);

return function(x0, x1, y0) {
var n;
if(ak.nativeType(x0)!==ak.NUMBER_T || !isFinite(x0)
|| ak.nativeType(x1)!==ak.NUMBER_T || !isFinite(x1)) {
throw new Error('invalid interval in ak.fehlbergODE');
}
n = ak.ceil(Math.abs(x1-x0)/dx);
if(n>ak.INT_MAX) throw new Error('too many steps in ak.fehlbergODE');
return ak.nativeType(y0)===ak.NUMBER_T
? numberFehlbergODE(f, n, x0, x1, y0, e0, e1, order, a, b0, b1, c,
s0, s1, steps, alb, k, cyclic)
: generalFehlbergODE(f, n, x0, x1, y0, e0, e1, order, a, b0, b1, c,
s0, s1, steps, alb, k, cyclic);
};
};


This follows much the same approach as our ak.rungeKuttaODE function, first checking that the arguments are valid before setting the nodes to the sums of the rows of coefficients if they're undefined and calculating the sums of the weights, which we shall use to normalise them by dividing the weighted sums of the approximations by them, and finally returning a function to make the approximation. It differs from it in that the lower and upper error bounds are specified rather than the step length, which is set to their geometric mean. It also requires the order of the approximation to be given and has an optional maximum number of steps, defaulting to one million, since the error bounds do not guarantee how many there will be.

Listing 2 gives the initial argument check, which ensures that f is a function, that the error bounds e0 and e1 are non-zero numbers, that the order is a strictly positive integer and that the elements of the Butcher tableau are arrays of the correct sizes.

Listing 2: Checking The Arguments
function checkArgTypes(f, e0, e1, order, a, b0, b1, c, steps, tc) {
if(ak.nativeType(f)!==ak.FUNCTION_T) {
throw new Error('invalid function in ak.fehlbergODE');
}
if(isNaN(e0) || e0===0) {
throw new Error('invalid lower error threshold in ak.fehlbergODE');
}
if(isNaN(e1) || e1<=e0) {
throw new Error('invalid upper error threshold in ak.fehlbergODE');
}
if(ak.nativeType(order)!==ak.NUMBER_T || order!==ak.floor(order) || order<1) {
throw new Error('invalid order in ak.fehlbergODE');
}
if(ak.nativeType(a)!==ak.ARRAY_T) {
throw new Error('invalid matrix in ak.fehlbergODE');
}
if(ak.nativeType(b0)!==ak.ARRAY_T) {
throw new Error('invalid step weights in ak.fehlbergODE');
}
if(b0.length!==a.length+1) {
throw new Error('matrix/step weights size mismatch in ak.fehlbergODE');
}
if(ak.nativeType(b1)!==ak.ARRAY_T) {
throw new Error('invalid error weights in ak.fehlbergODE');
}
if(b1.length!==a.length+1) {
throw new Error('matrix/error weights size mismatch in ak.fehlbergODE');
}

if(tc!==ak.UNDEFINED_T) {
if(tc!==ak.ARRAY_T) throw new Error('invalid nodes in ak.fehlbergODE');
if(c.length!==a.length) {
throw new Error('matrix/nodes size mismatch in ak.fehlbergODE');
}
}

if(isNaN(steps)) throw new Error('invalid maximum steps in ak.fehlbergODE');
}


Checking that the tableau is valid involves testing that the rows of the matrix have the correct number of elements, and that they, the weights and, if defined, the nodes are valid numbers, as shown by listing 3.

Listing 3: Checking The Tableau
function checkTableau(a, b0, b1, c, tc) {
var ai, i, j;

for(i=0;i<a.length;++i) {
ai = a[i];

if(ak.nativeType(b0[i])!==ak.NUMBER_T || !isFinite(b0[i])) {
throw new Error('invalid step weight in ak.fehlbergODE');
}
if(ak.nativeType(b1[i])!==ak.NUMBER_T || !isFinite(b1[i])) {
throw new Error('invalid error weight in ak.fehlbergODE');
}
if(ak.nativeType(ai)!==ak.ARRAY_T || ai.length!==i+1) {
throw new Error('invalid matrix row in ak.fehlbergODE');
}
for(j=0;j<=i;++j) {
if(ak.nativeType(ai[j])!==ak.NUMBER_T || !isFinite(ai[j])) {
throw new Error('invalid matrix element in ak.fehlbergODE');
}
}
if(tc!==ak.UNDEFINED_T && (ak.nativeType(c[i])!==ak.NUMBER_T
|| !isFinite(c[i]) || c[i]===0)) {
throw new Error('invalid node in ak.fehlbergODE');
}
}
if(ak.nativeType(b0[i])!==ak.NUMBER_T || !isFinite(b0[i])) {
throw new Error('invalid step weight in ak.fehlbergODE');
}
if(ak.nativeType(b1[i])!==ak.NUMBER_T || !isFinite(b1[i])) {
throw new Error('invalid error weight in ak.fehlbergODE');
}
}


Importantly, this does not check that the tableau defines a valid approximation algorithm since that would be extremely difficult. Instead we shall rely upon the user to supply appropriate values.

If the nodes weren't supplied then the makeNodes function defined in listing 4 sets them to the sums of the rows of the tableau's matrix.

Listing 4: Making The Nodes
function makeNodes(a) {
var c = new Array(a.length);
var i;

for(i=0;i<a.length;++i) {
c[i] = a[i].reduce(function(s,x){return s+x;}, 0);
if(!isFinite(c[i]) || c[i]===0) {
throw new Error('invalid node in ak.fehlbergODE');
}
}
return c;
}


Since Butcher tableaux frequently have leading zeros in their matrices, the makeLB function defined in listing 5 calculates the indices of the first non-zero elements in each of their rows so that we can skip past them during calculations.

Listing 5: The First Non-Zero Matrix Row Elements
function makeLB(a) {
var alb = new Array(a.length);
var ai, i, j;

for(i=0;i<a.length;++i) {
ai = a[i];

j = 0;
while(j<=i && ai[j]===0) ++j;
alb[i] = j;
}
return alb;
}


To determine whether the last value of $$k_i$$ in one step is equal to the first value in the next, the isCyclic function defined in listing 6 checks that last node is equal to one and that the first row of the weights is equal to the last row of coefficients, taking rounding error into account in both cases since the node may have been set to the sum of the coefficients and the weights are relative whilst the coefficients are not.

Listing 6: Checking For A Cycle
function isCyclic(a, b0, c, s0) {
var eps = (b0.length+1)*ak.EPSILON;
var ai = a[a.length-1];
var j = 0;
if(Math.abs(c[c.length-1]-1)>ai.length*ak.EPSILON) return false;
while(j<ai.length && Math.abs(b0[j]-ai[j]*s0)<=eps*Math.abs(b0[j])) ++j;
return j===ai.length && b0[j]===0;
}


Listing 7 implements the general Fehlberg method for scalar $$y$$, using the same step length adaptation as program 3 and exploiting the fact that the first $$k_i$$ is equal to the last in the previous step when possible.

Listing 7: The Scalar Implementation
function numberFehlbergODE(f, n, x0, x1, y0, e0, e1, order,
a, b0, b1, c, s0, s1, steps, alb, k, cyclic) {
var kn = k.length;
var dx = (x1-x0)/n;
var i = 0;
var j, l, dy0, dy1, aj, kj, d;

if(cyclic) k[kn-1] = f(x0, y0);

while(i<n) {
if(--steps<0) throw new Error('too many steps in ak.fehlbergODE');
k[0] = cyclic ? k[kn-1] : f(x0, y0);
dy0 = b0[0]*k[0];
dy1 = b1[0]*k[0];
for(j=1;j<kn;++j) {
aj = a[j-1];
l = alb[j-1];

kj = y0;
if(l!==j) {
kj += k[l]*aj[l]*dx;
while(++l<j) kj += k[l]*aj[l]*dx;
}
kj = f(x0+c[j-1]*dx, kj);
dy0 += b0[j]*kj;
dy1 += b1[j]*kj;
k[j] = kj;
}
dy0 *= dx/s0;
dy1 *= dx/s1;
d = Math.abs(dy0-dy1)/(1+Math.abs(y0));

if(d<=e1) {
x0 += dx;
y0 += dy0;
}
else if(cyclic) {
k[kn-1] = k[0];
}

if(d<e0 || d>e1) {
dx *= Math.pow((e0/d)*(e1/d), 1/(2*order));
n = ak.ceil(Math.abs(x1-x0)/dx);
if(n>ak.INT_MAX) throw new Error('too many steps in ak.fehlbergODE');
dx = (x1-x0)/n;
i = -1;
}

++i;
}
return y0;
}


Listing 8 extends this to arbitrary arithmetic types using our overloaded arithmetic operators allowing us, amongst other things, to use ak.vector objects to approximate the solutions of higher order ODEs.

Listing 8: The General Implementation
function generalFehlbergODE(f, n, x0, x1, y0, e0, e1, order,
a, b0, b1, c, s0, s1, steps, alb, k, cyclic) {
var kn = k.length;
var dx = (x1-x0)/n;
var i = 0;
var j, l, dy0, dy1, aj, kj, d;

if(cyclic) k[kn-1] = f(x0, y0);

while(i<n) {
if(--steps<0) throw new Error('too many steps in ak.fehlbergODE');
k[0] = cyclic ? k[kn-1] : f(x0, y0);
dy0 = ak.mul(b0[0], k[0]);
dy1 = ak.mul(b1[0], k[0]);
for(j=1;j<kn;++j) {
aj = a[j-1];
l = alb[j-1];

kj = y0;
if(l!==j) {
while(++l<j) if(aj[l]!==0) kj = ak.add(kj, ak.mul(k[l], aj[l]*dx));
}
kj = f(x0+c[j-1]*dx, kj);
if(b0[j]!==0) dy0 = ak.add(dy0, ak.mul(b0[j], kj));
if(b1[j]!==0) dy1 = ak.add(dy1, ak.mul(b1[j], kj));
k[j] = kj;
}
dy0 = ak.mul(dy0, dx/s0);
dy1 = ak.mul(dy1, dx/s1);
d = ak.dist(dy0, dy1)/(1+ak.abs(y0));

if(d<=e1) {
x0 += dx;
}
else if(cyclic) {
k[kn-1] = k[0];
}

if(d<e0 || d>e1) {
dx *= Math.pow((e0/d)*(e1/d), 1/(2*order));
n = ak.ceil(Math.abs(x1-x0)/dx);
if(n>ak.INT_MAX) throw new Error('too many steps in ak.fehlbergODE');
dx = (x1-x0)/n;
i = -1;
}

++i;
}
return y0;
}


As before we shan't really want to use ak.fehlbergODE directly but should instead prefer to use methods that are defined with Butcher tableaux that are known to be valid, such as those defined in listing 9.

Listing 9: Standard Runge-Kutta-Fehlberg Algorithms
ak.heunRKF2ODE = function(f, e0, e1, steps) {
return ak.fehlbergODE(f, e0, e1, 2, [[1]], [1,1], [1, 0], undefined, steps);
};

ak.bogackiShampineRKF3ODE = function(f, e0, e1, steps) {
return ak.fehlbergODE(f, e0, e1, 3,
[[1/2],[0,3/4],[2/9,1/3,4/9]],
[2,3,4,0], [7,6,8,3], undefined, steps);
};

ak.fehlbergRKF5ODE = function(f, e0, e1, steps) {
return ak.fehlbergODE(f, e0, e1, 5,
[[1/4],
[3/32,9/32],
[1932/2197,-7200/2197,7296/2197],
[439/216,-8,3680/513,-845/4104],
[-8/27,2,-3544/2565,1859/4104,-11/40]],
[33440,0,146432,142805,-50787,10260],
[2375,0,11264,10985,-4104,0],
undefined, steps);
};

ak.cashKarpRKF5ODE = function(f, e0, e1, steps) {
return ak.fehlbergODE(f, e0, e1, 5,
[[1/5],
[3/40,9/40],
[3/10,-9/10,6/5],
[-11/54,5/2,-70/27,35/27],
[1631/55296,175/512,575/13824,44275/110592,253/4096]],
[9361,0,38500,20125,0,27648],
[39550,0,148600,94675,7479,96768],
undefined, steps);
};

ak.dormandPrinceRKF5ODE = function(f, e0, e1, steps) {
return ak.fehlbergODE(f, e0, e1, 5,
[[1/5],
[3/40,9/40],
[44/45,-56/15,32/9],
[19372/6561,-25360/2187,64448/6561,-212/729],
[9017/3168,-355/33,46732/5247,49/176,-5103/18656],
[35/384,0,500/1113,125/192,-2187/6784,11/84]],
[12985,0,64000,92750,-45927,18656,0],
[1921409,0,9690880,13122270,-5802111,1902912,534240],
undefined, steps);
};


Of these, ak.dormandPrinceRKF5ODE is currently considered to be the preferred method since its tableau has been designed to minimise the errors in the higher order Runge-Kutta method so that we may be more confident in using it to approximate the solution of the ODE even though it should more appropriately be used to estimate the error.
Program 5 demonstrates its use for our example ODE, printing the normalised errors of $$y_i$$ for various values of $$x_i$$.

Program 5: Using ak.dormandPrinceRKF5ODE

Despite all of our hard work it should be noted that if the ODE is not sufficiently well behaved then Runge-Kutta and Fehlberg methods will fail to converge, at least in the forms that we have considered. We shall address this in future posts but until then you can find the implementations of these functions in FehlbergODE.js

### References

[1] Out Of The Ordinary, www.thusspakeak.com, 2021.

[2] Finding The Middle Ground, www.thusspakeak.com, 2021.

[3] A Kutta Above The Rest, www.thusspakeak.com, 2021.

[4] Butcher, J., Coefficients for the study of Runge-Kutta integration processes, Journal of the Australian Mathematical Society, Vol. 3, No. 2, 1963.

[5] Fehlberg, E., Low-order classical Runge-Kutta formulas with step size control and their application to some heat transfer problems, NASA Technical Report 315, 1969.