a.k. (Movable Type 5.2.13 feed, updated 2021-04-28)

The Middle Way (2022-05-06)

A few years ago we spent some time implementing a number of the sorting, searching and set manipulation algorithms from the standard C++ library in JavaScript. Since the latter doesn't support the former's abstraction of container access via iterators we were compelled to restrict ourselves to using native `Array` objects following the conventions of its methods, such as `slice` and `sort`.

In this post we shall take a look at an algorithm for finding the centrally ranked element, or median, of an array, which is strongly related to the `ak.nthElement` function, and then at a particular use for it.
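By way of illustration, a selection algorithm in this spirit might be sketched as follows; `nthElement` and `median` are hypothetical names here, not the `ak` library's implementations, and the partitioning scheme is simply one plausible choice.

```javascript
// Hypothetical sketch: find the n-th smallest element of an array by
// partial sorting, in the spirit of, but not identical to, ak.nthElement.
function nthElement(a, n) {
  a = a.slice();                 // work on a copy, as Array methods convention suggests
  let lo = 0, hi = a.length - 1;
  while (lo < hi) {
    const pivot = a[n];
    let i = lo, j = hi;
    while (i <= j) {             // Hoare-style partition about the pivot
      while (a[i] < pivot) ++i;
      while (a[j] > pivot) --j;
      if (i <= j) { [a[i], a[j]] = [a[j], a[i]]; ++i; --j; }
    }
    if (j < n) lo = i;           // keep only the side containing rank n
    if (n < i) hi = j;
  }
  return a[n];
}

// The median of an odd-length array is its centrally ranked element.
function median(a) {
  return nthElement(a, Math.floor(a.length / 2));
}
```

Note that only the half of the array containing the sought rank is partitioned further, which is what makes selection cheaper than a full sort.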
Testing Our Students (2022-04-08)

Last time we saw how we can use the chi-squared distribution to test whether a sample of values is consistent with pre-supposed expectations. A couple of months ago we took a look at Student's t-distribution which we can use to test whether a set of observations of a normally distributed random variable are consistent with its having a given mean when its variance is unknown.

Time For A Chi Test (2022-03-04)

A few months ago we explored the chi-squared distribution which describes the properties of sums of squares of standard normally distributed random variables, being those that have means of zero and standard deviations of one.

Whilst I'm very much of the opinion that statistical distributions are worth describing in their own right, the chi-squared distribution plays a pivotal role in testing whether or not the categories into which a set of observations of some variable quantity fall are consistent with assumptions about the expected numbers in each category, which we shall take a look at in this post.
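The statistic at the heart of such a test is simple enough to sketch directly; this is Pearson's chi-squared statistic rather than the post's implementation, and the names are my own.

```javascript
// A sketch of Pearson's chi-squared statistic: the sum over categories
// of (observed - expected)^2 / expected, which under the null
// hypothesis approximately follows a chi-squared distribution.
function chiSquaredStat(observed, expected) {
  if (observed.length !== expected.length) throw new Error('length mismatch');
  let stat = 0;
  for (let i = 0; i < observed.length; ++i) {
    const d = observed[i] - expected[i];
    stat += d * d / expected[i];
  }
  return stat;
}
```

Large values of the statistic relative to the chi-squared distribution's quantiles cast doubt upon the assumed category probabilities.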
A Jolly Student's Tea Party (2022-02-04)

Last time we took a look at the chi-squared distribution which describes the behaviour of sums of squares of standard normally distributed random variables, having means of zero and standard deviations of one.

Tangentially related is Student's t-distribution which governs the deviation of means of sets of independent observations of a normally distributed random variable from its known true mean, which we shall examine in this post.
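That deviation is conventionally measured with the t statistic, which might be sketched as follows; the helper names are illustrative rather than the post's.

```javascript
// A sketch of the one-sample t statistic: the deviation of the sample
// mean from a hypothesised mean mu, scaled by the sample standard
// error, distributed as Student's t with n-1 degrees of freedom.
function tStatistic(sample, mu) {
  const n = sample.length;
  const mean = sample.reduce((s, x) => s + x, 0) / n;
  const ss = sample.reduce((s, x) => s + (x - mean) * (x - mean), 0);
  const sd = Math.sqrt(ss / (n - 1));        // unbiased sample standard deviation
  return (mean - mu) / (sd / Math.sqrt(n));  // standard error in the denominator
}
```

Dividing by the sample, rather than the true, standard deviation is precisely what distinguishes the t-distribution from the normal.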
Chi Chi Again (2022-01-07)

Several years ago we saw that, under some relatively easily met assumptions, the averages of independent observations of a random variable tend toward the normal distribution. Derived from that is the chi-squared distribution which describes the behaviour of sums of squares of independent standard normal random variables, having means of zero and standard deviations of one.

In this post we shall see how it is related to the gamma distribution and implement its various functions in terms of those of the latter.
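The relationship in question is that the chi-squared distribution with k degrees of freedom is a gamma distribution with shape k/2 and scale two, which might be sketched as follows; to keep the example self-contained it is restricted to even k, where the gamma function reduces to a factorial, and the names are not the `ak` library's.

```javascript
// A sketch of the chi-squared PDF expressed through the gamma
// distribution with shape k/2 and scale 2, restricted here to even
// degrees of freedom k so that gamma(k/2) is a simple factorial.
function gammaPDF(shape, scale) {
  let g = 1;                                   // gamma(shape) for integer shape
  for (let i = 2; i < shape; ++i) g *= i;
  return x => Math.pow(x, shape - 1) * Math.exp(-x / scale)
            / (g * Math.pow(scale, shape));
}

function chiSquaredPDF(k) {
  if (k % 2 !== 0) throw new Error('sketch supports even k only');
  return gammaPDF(k / 2, 2);
}
```

For example, with two degrees of freedom the density collapses to that of the exponential distribution with mean two.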
Adapt Or Try (2021-12-03)

Over the last few months we have been looking at how we might approximate the solutions to ordinary differential equations, or ODEs, which define the derivative of one variable with respect to another with a function of them both. Firstly with the first order Euler method, then with the second order midpoint method and finally with the generalised Runge-Kutta method.

Unfortunately all of these approaches require the step length to be fixed and specified in advance, ignoring any information that we might use to adjust it during the iteration in order to better trade off the efficiency and accuracy of the approximation. In this post we shall try to automatically modify the step lengths to yield an optimal, or at least reasonable, balance.
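One illustrative scheme, which is not necessarily the post's, is step doubling: compare one full Euler step with two half steps and use their difference as a local error estimate, shrinking the step when it exceeds a tolerance and stretching it when it comes in well under.

```javascript
// An illustrative adaptive scheme: take one Euler step of length h and
// two of length h/2, use their difference to estimate the local error
// and grow or shrink h to keep that estimate near a tolerance eps.
function eulerStep(f, x, y, h) { return y + h * f(x, y); }

function adaptiveEuler(f, x0, y0, x1, eps) {
  let x = x0, y = y0, h = (x1 - x0) / 100;
  while (x1 - x > 1e-12) {
    if (x + h > x1) h = x1 - x;
    const full = eulerStep(f, x, y, h);
    const half = eulerStep(f, x + h / 2, eulerStep(f, x, y, h / 2), h / 2);
    const err = Math.abs(full - half);
    if (err > eps && h > 1e-10) { h /= 2; continue; } // reject and retry shorter
    x += h; y = half;                                 // accept the two half steps
    if (err < eps / 4) h *= 2;                        // error budget allows longer steps
  }
  return y;
}
```

Solving dy/dx = y from y(0) = 1 to x = 1 should then land close to Euler's number without our having chosen a step length by hand.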
A Kutta Above The Rest (2021-11-05)

We have recently been looking at ordinary differential equations, or ODEs, which define the derivative of one variable with respect to another as a function of them both, so that we cannot solve them with plain integration. Whilst there are a number of tricks for solving such equations if they have very specific forms, we typically have to resort to approximation algorithms such as the Euler method, with first order errors, and the midpoint method, with second order errors.

In this post we shall see that these are both examples of a general class of algorithms that can be accurate to still greater orders.
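The best known member of that class is the classic fourth order Runge-Kutta scheme, sketched here in its own right rather than as the post's generalised implementation.

```javascript
// A sketch of the classic fourth order Runge-Kutta method for
// dy/dx = f(x, y): four derivative evaluations per step, combined
// with weights 1, 2, 2, 1 to cancel errors up to fourth order.
function rk4(f, x0, y0, x1, steps) {
  const h = (x1 - x0) / steps;
  let x = x0, y = y0;
  for (let i = 0; i < steps; ++i) {
    const k1 = f(x, y);
    const k2 = f(x + h / 2, y + h * k1 / 2);
    const k3 = f(x + h / 2, y + h * k2 / 2);
    const k4 = f(x + h, y + h * k3);
    y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6;
    x += h;
  }
  return y;
}
```

With only one hundred steps the approximation of dy/dx = y over a unit interval agrees with the exponential to far more digits than the Euler or midpoint methods could manage.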
Finding The Middle Ground (2021-10-01)

Last time we saw how we can use Euler's method to approximate the solutions of ordinary differential equations, or ODEs, which define the derivative of one variable with respect to another as a function of them both, so that they cannot be solved by direct integration. Specifically, it uses Taylor's theorem to estimate the change in the first variable that results from a small step in the second, iteratively accumulating the results for steps of a constant length to approximate the value of the former at some particular value of the latter.

Unfortunately it isn't very accurate, yielding an accumulated error proportional to the step length, and so this time we shall take a look at a way to improve it.
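The improvement in question is the midpoint method, which might be sketched as follows; a sketch of the idea rather than the post's implementation.

```javascript
// A sketch of the midpoint method: use an Euler half step to estimate
// the derivative at the middle of the interval, then take the full
// step with that estimate, for second rather than first order errors.
function midpoint(f, x0, y0, x1, steps) {
  const h = (x1 - x0) / steps;
  let x = x0, y = y0;
  for (let i = 0; i < steps; ++i) {
    const k = f(x + h / 2, y + h / 2 * f(x, y));
    y += h * k;
    x += h;
  }
  return y;
}
```

The extra derivative evaluation per step buys an accumulated error proportional to the square of the step length rather than the step length itself.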
Out Of The Ordinary (2021-09-03)

Several years ago we saw how to use the trapezium rule to approximate integrals. This works by dividing the interval of integration into a set of equally spaced values, evaluating the function being integrated, or integrand, at each of them and calculating the area under the curve formed by connecting adjacent points with straight lines to form trapeziums.

This was an improvement over an even more rudimentary scheme which instead placed rectangles spanning adjacent values with heights equal to the values of the function at their midpoints to approximate the area. Whilst there really wasn't much point in implementing this since it offers no advantage over the trapezium rule, it is a reasonable first approach to approximating the solutions to another type of problem involving calculus: ordinary differential equations, or ODEs.
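Applied to an ODE, that first approach is Euler's method, which might be sketched as follows; names are illustrative rather than the post's.

```javascript
// A minimal sketch of Euler's method: repeatedly step y by h times the
// derivative f(x, y), accumulating an error proportional to h.
function euler(f, x0, y0, x1, steps) {
  const h = (x1 - x0) / steps;
  let x = x0, y = y0;
  for (let i = 0; i < steps; ++i) {
    y += h * f(x, y);
    x += h;
  }
  return y;
}
```

For example, applying it to dy/dx = y from y(0) = 1 over a unit interval approximates Euler's number, with an error shrinking roughly in proportion to the step length.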
Will They Blend? (2021-08-06)

Last time we saw how we can create new random variables from sets of random variables with given probabilities of observation. To make an observation of such a random variable we randomly select one of its components, according to their probabilities, and make an observation of it. Furthermore, their associated probability density functions, or PDFs, cumulative distribution functions, or CDFs, and characteristic functions, or CFs, are simply sums of the component functions weighted by their probabilities of observation.

Now there is nothing about such distributions, known as mixture distributions, that requires that the components are univariate. Given that copulas are simply multivariate distributions with standard uniformly distributed marginals, being the distributions of each element considered independently of the others, we can use the same technique to create new copulas too.
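The observation scheme is the same whatever the components are, and might be sketched as follows; `mixtureRnd` is a hypothetical name, and the optional `uniform` argument simply makes the component selection testable.

```javascript
// A sketch of drawing from a mixture: pick a component at random in
// proportion to its weight, then make an observation of it. The same
// scheme applies whether the components are univariate distributions
// or copulas returning vectors of uniforms.
function mixtureRnd(weights, rnds, uniform) {
  uniform = uniform || Math.random;
  return () => {
    let u = uniform() * weights.reduce((s, w) => s + w, 0);
    for (let i = 0; i < weights.length; ++i) {
      u -= weights[i];
      if (u <= 0) return rnds[i]();    // component i was selected
    }
    return rnds[rnds.length - 1]();    // guard against rounding
  };
}
```

Supplying component random number generators that return vectors of marginally uniform elements yields an observation scheme for a mixture copula.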
Mixing It Up (2021-07-02)

Last year we took a look at basis function interpolation which fits a weighted sum of n independent functions, known as basis functions, through observations of an arbitrary function's values at a set of n points in order to approximate it at unobserved points. In particular, we saw that symmetric probability density functions, or PDFs, make reasonable basis functions for approximating both univariate and multivariate functions.

It is quite tempting, therefore, to use weighted sums of PDFs to construct new PDFs and in this post we shall see how we can use a simple probabilistic argument to do so.
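Such a weighted sum might be sketched as follows, assuming that the weights are non-negative and sum to one so that the result is itself a density; the names are illustrative rather than the post's.

```javascript
// A sketch of a mixture density: a weighted sum of component PDFs
// whose weights are the probabilities of observing each component.
function normalPDF(mu, sigma) {
  return x => Math.exp(-0.5 * Math.pow((x - mu) / sigma, 2))
            / (sigma * Math.sqrt(2 * Math.PI));
}

function mixturePDF(weights, pdfs) {
  return x => weights.reduce((s, w, i) => s + w * pdfs[i](x), 0);
}
```

Since the weights sum to one and each component integrates to one, the mixture integrates to one too, which is easily confirmed numerically.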
A PR Exercise (2021-06-04)

In the last few posts we've been looking at the BFGS quasi-Newton algorithm for minimising multivariate functions. This uses iteratively updated approximations of the Hessian matrix of second partial derivatives in order to choose directions in which to search for univariate minima, saving the expense of calculating it explicitly. A particularly useful property of the algorithm is that if the line search satisfies the Wolfe conditions then the positive definiteness of the Hessian is preserved, meaning that the implied locally quadratic approximation of the function must have a minimum.

Unfortunately for large numbers of dimensions the calculation of the approximation will still be relatively expensive and will require a significant amount of memory to store, and so in this post we shall take a look at an algorithm that only uses the vector of first partial derivatives.
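The standard way to avoid forming the matrix is the limited-memory two-loop recursion, sketched here for a single stored pair of step and gradient-change vectors; this is a sketch of the general idea, not the post's implementation, and the names are my own.

```javascript
// A sketch of the L-BFGS two-loop recursion: given recent step vectors
// ss and gradient-change vectors ys, transform the current gradient g
// into an approximate Newton step without ever forming the Hessian.
function dot(u, v) { return u.reduce((t, x, i) => t + x * v[i], 0); }

function twoLoop(g, ss, ys) {
  let q = g.slice();
  const a = [];
  for (let i = ss.length - 1; i >= 0; --i) {       // first loop: newest to oldest
    const rho = 1 / dot(ys[i], ss[i]);
    a[i] = rho * dot(ss[i], q);
    q = q.map((x, j) => x - a[i] * ys[i][j]);
  }
  const k = ss.length - 1;                         // scale by an estimate of the
  const gamma = k >= 0 ? dot(ss[k], ys[k]) / dot(ys[k], ys[k]) : 1;
  let z = q.map(x => gamma * x);                   // inverse Hessian's magnitude
  for (let i = 0; i < ss.length; ++i) {            // second loop: oldest to newest
    const rho = 1 / dot(ys[i], ss[i]);
    const b = rho * dot(ys[i], z);
    z = z.map((x, j) => x + (a[i] - b) * ss[i][j]);
  }
  return z;                                        // search direction is -z
}
```

A quick sanity check is the secant property: applied to the latest gradient change the implied inverse Hessian must recover the latest step.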
Bring Out The Big Flipping GunS (2021-05-07)

Last month we took a look at quasi-Newton multivariate function minimisation algorithms which use approximations of the Hessian matrix of second partial derivatives to choose line search directions. We demonstrated that the BFGS rule for updating the Hessian after each line search maintains its positive definiteness if they conform to the Wolfe conditions, ensuring that the locally quadratic approximation of the function defined by its value, the vector of first partial derivatives and the Hessian has a minimum.

Now that we've got the theoretical details out of the way it's time to get on with the implementation.
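The update rule itself is compact enough to sketch directly, with plain arrays standing in for the library's matrix types; the names are illustrative rather than the post's.

```javascript
// A sketch of the BFGS Hessian update for step s and gradient change y:
//   B' = B - (B s)(B s)^T / (s^T B s) + y y^T / (y^T s)
// which preserves positive definiteness whenever y^T s > 0, as the
// Wolfe conditions guarantee.
function matVec(m, v) { return m.map(row => row.reduce((s, x, j) => s + x * v[j], 0)); }
function dot(u, v) { return u.reduce((s, x, i) => s + x * v[i], 0); }

function bfgsUpdate(b, s, y) {
  const bs = matVec(b, s);
  const sbs = dot(s, bs);
  const ys = dot(y, s);
  return b.map((row, i) => row.map((x, j) =>
    x - bs[i] * bs[j] / sbs + y[i] * y[j] / ys));
}
```

A defining property of the update is the secant equation: the new matrix must map the step onto the observed change in gradient.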
Big Friendly GiantS (2021-04-02)

In the previous post we saw how we could perform a univariate line search for a point that satisfies the Wolfe conditions, meaning that it is reasonably close to a minimum and takes a lot less work to find than the minimum itself. Line searches are used in a class of multivariate minimisation algorithms which iteratively choose directions in which to proceed, in particular those that use approximations of the Hessian matrix of second partial derivatives of a function to do so, similarly to how the Levenberg-Marquardt multivariate inversion algorithm uses a diagonal matrix in place of the sum of the products of its Hessian matrices for each element and the error in that element's current value. In this post we shall take a look at one of them.

Wolfe It Down (2021-03-05)

Last time we saw how we could efficiently invert a vector valued multivariate function with the Levenberg-Marquardt algorithm, which replaces the sum of its second derivatives with respect to each element in its result, multiplied by the difference from those of its target value, with a diagonal matrix. Similarly there are minimisation algorithms that use approximations of the Hessian matrix of second partial derivatives to estimate directions in which the value of the function will decrease.

Before we take a look at them, however, we'll need a way to step toward minima in such directions, known as a line search, and in this post we shall see how we might reasonably do so.
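The conditions that such a search must satisfy might be sketched as follows for a univariate slice of the function, with the search direction taken to be positive; the names and the treatment are illustrative rather than the post's.

```javascript
// A sketch of the Wolfe conditions for a step of length a from point x
// along the positive direction: f and df are the function and its
// derivative, c1 and c2 the conventional constants with 0 < c1 < c2 < 1.
function satisfiesWolfe(f, df, x, a, c1, c2) {
  const armijo = f(x + a) <= f(x) + c1 * a * df(x); // sufficient decrease
  const curvature = df(x + a) >= c2 * df(x);        // slope has flattened enough
  return armijo && curvature;
}
```

For f(x) = x squared starting from x = -1, a unit step to the minimum passes both conditions, a tiny step fails the curvature condition and an overshoot to x = 1 fails the sufficient decrease condition.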