# Testing Our Students

Last time[1] we saw how we can use the chi-squared distribution to test whether a sample of values is consistent with pre-supposed expectations. A couple of months ago[2] we took a look at Student's t-distribution, which we can use to test whether a set of observations of a normally distributed random variable is consistent with its having a given mean when its variance is unknown.

### The One Sample Student's T-Test

Recall that Student's t-distribution with $$n-1$$ degrees of freedom describes the behaviour of the random variable
$t = \frac{\mu^\prime - \mu}{n^{-\frac12} \times s}$
where
\begin{align*} \mu^\prime &= \frac{1}{n} \times \sum_{i=0}^{n-1} Z_i\\ s^2 &= \frac{1}{n-1} \times \sum_{i=0}^{n-1} \left(Z_i-\mu^\prime\right)^2 \end{align*}
$$\sum$$ is the summation sign and the $$Z_i$$ are independent observations of a normally distributed random variable with a mean of $$\mu$$ and an unknown standard deviation of $$\sigma$$ for which $$s$$ is the best estimate that can be inferred from the sample.
This random variable represents the difference between the sample mean $$\mu^\prime$$ of a set of $$n$$ observations of the normal random variable and its true mean $$\mu$$, measured in units of the estimated standard deviation of that sample mean. Given its cumulative distribution function, or CDF, $$F_t$$ we trivially have
\begin{align*} \Pr\left(t \leqslant \left|t^\prime\right|\right) &= F_t\left(\left|t^\prime\right|\right)\\ \Pr\left(t \leqslant -\left|t^\prime\right|\right) &= F_t\left(-\left|t^\prime\right|\right) \end{align*}
where $$t^\prime$$ is the difference for a specific set of observations $$Z_i^\prime$$, and consequently
\begin{align*} \Pr\left(|t| \leqslant \left|t^\prime\right|\right) &= F_t\left(\left|t^\prime\right|\right) - F_t\left(-\left|t^\prime\right|\right)\\ \Pr\left(|t| > \left|t^\prime\right|\right) &= 1 - \left(F_t\left(\left|t^\prime\right|\right) - F_t\left(-\left|t^\prime\right|\right)\right) \end{align*}
We may therefore use the latter, being the probability of observing a difference at least as large in magnitude as $$t^\prime$$ when the true mean is $$\mu$$, as a measure of confidence that those observations are consistent with a mean of $$\mu$$. Note that by the central limit theorem we may also use it as an approximate measure for any distribution with a finite and well-defined mean and variance, provided that $$n$$ isn't too small.
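As a rough, self-contained illustration of this measure, the following sketch computes $$t^\prime$$ for a small made-up sample and approximates $$\Pr\left(|t| > \left|t^\prime\right|\right)$$ by integrating the density of Student's t-distribution with Simpson's rule. The function names `gammaHalf`, `tPDF`, `twoSidedP` and `tStatOne` are hypothetical, the numerical integration is merely a stand-in for a proper CDF implementation, and the gamma function recursion assumes an integer number of degrees of freedom no greater than a few hundred.

```javascript
// Gamma function at x = twiceX/2 for a positive integer twiceX, built up
// from Gamma(1/2) = sqrt(pi) or Gamma(1) = 1 using Gamma(x+1) = x*Gamma(x).
function gammaHalf(twiceX) {
  var even = twiceX % 2 === 0;
  var x = even ? 1 : 0.5;
  var g = even ? 1 : Math.sqrt(Math.PI);
  while (2 * x < twiceX) { g *= x; x += 1; }
  return g;
}

// Density of Student's t-distribution with nu degrees of freedom.
function tPDF(x, nu) {
  var c = gammaHalf(nu + 1) / (Math.sqrt(nu * Math.PI) * gammaHalf(nu));
  return c * Math.pow(1 + x * x / nu, -(nu + 1) / 2);
}

// Pr(|t| > |t'|) = 1 - (F(|t'|) - F(-|t'|)), with the bracketed term
// approximated by Simpson's rule; the density is symmetric about zero so
// we integrate over [0, |t'|] and double the result.
function twoSidedP(tObserved, nu) {
  var a = Math.abs(tObserved);
  var steps = 2000; // must be even for Simpson's rule
  var h = a / steps;
  var sum = tPDF(0, nu) + tPDF(a, nu);
  var i;
  for (i = 1; i < steps; ++i) sum += (i % 2 ? 4 : 2) * tPDF(i * h, nu);
  return 1 - 2 * sum * h / 3;
}

// The one sample statistic t' for a hypothesised mean mu.
function tStatOne(sample, mu) {
  var n = sample.length;
  var mean = sample.reduce(function (a, x) { return a + x; }, 0) / n;
  var ss = sample.reduce(function (a, x) {
    return a + (x - mean) * (x - mean);
  }, 0);
  return (mean - mu) / Math.sqrt(ss / (n - 1) / n);
}

var observations = [4.8, 5.3, 5.1, 4.6, 5.4]; // made-up values
[5, 6].forEach(function (mu) {
  var tPrime = tStatOne(observations, mu);
  console.log(mu, tPrime, twoSidedP(tPrime, observations.length - 1));
});
```

The hypothesised mean of five lies close to the sample mean and so should yield a large probability, whereas that of six lies far from it and should yield a small one.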

### Two Sample Student's T-Tests

If we have two samples containing $$n_1$$ and $$n_2$$ observations with sample means and variances of $$\mu_1^\prime$$, $$s^2_1$$, $$\mu_2^\prime$$ and $$s^2_2$$ respectively then, provided that their true variances are equal, we may test whether or not their true means are equal with the Student's t-statistic
$t = \frac{\mu_1^\prime - \mu_2^\prime}{\left(n_1^{-1} + n_2^{-1}\right)^\frac12 \times s}$
having $$n_1+n_2-2$$ degrees of freedom, where
$s^2 = \frac{\left(n_1-1\right) \times s_1^2 + \left(n_2-1\right) \times s_2^2}{n_1+n_2-2}$
as demonstrated by derivation 1.

Derivation 1: The Variance Of The Means' Difference
Firstly, note that the means and variances of sums of independent normally distributed variates accumulate additively, and that the variance of a difference between them is consequently also the sum of their variances.
Secondly, the terms
\begin{align*} &\left(n_1-1\right) \times s_1^2\\ &\left(n_2-1\right) \times s_2^2 \end{align*}
are the sums of the squared differences of the values in each sample from their sample means and so their sum is the total squared deviation. Since each sample mean accounts for one degree of freedom, the combined sample variance $$s^2$$ is recovered by dividing the sum by $$n_1+n_2-2$$.
Finally, the sample variances of the sample means are given by
\begin{align*} &n_1^{-1} \times s^2\\ &n_2^{-1} \times s^2 \end{align*}
and so the sample variance of their difference is the sum of these terms.

Once again we may exploit the central limit theorem to apply this result to non-normally distributed samples provided that the previously stated conditions are met.
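To make the pooled statistic concrete, here is a minimal sketch that evaluates it for a pair of made-up samples. The helper names `meanVar` and `pooledT` are hypothetical and, for the sake of brevity, no validation of the arguments is performed.

```javascript
// Sample mean and unbiased sample variance of an array of numbers.
function meanVar(sample) {
  var n = sample.length;
  var mean = sample.reduce(function (a, x) { return a + x; }, 0) / n;
  var ss = sample.reduce(function (a, x) {
    return a + (x - mean) * (x - mean);
  }, 0);
  return { n: n, mean: mean, variance: ss / (n - 1) };
}

// Two sample statistic under the assumption of equal true variances.
function pooledT(sample1, sample2) {
  var a = meanVar(sample1);
  var b = meanVar(sample2);
  var dof = a.n + b.n - 2;
  // pooled variance: total squared deviation over the degrees of freedom
  var s2 = ((a.n - 1) * a.variance + (b.n - 1) * b.variance) / dof;
  var t = (a.mean - b.mean) / Math.sqrt((1 / a.n + 1 / b.n) * s2);
  return { t: t, dof: dof, pooledVariance: s2 };
}

var pooled = pooledT([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]);
console.log(pooled.t, pooled.dof, pooled.pooledVariance);
```

Both samples have a variance of 2.5, so the pooled variance is also 2.5 and, working by hand, the statistic equals $$-2 / \sqrt{0.4 \times 2.5} = -2$$ with eight degrees of freedom.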

A much simpler two sample test is of whether or not the differences between the corresponding elements of two samples of equal size, distributed as the random variable
$X = X_1 - X_2$
have a given mean. To perform it we can simply apply the one sample test to those differences. This will be exact if both $$X_1$$ and $$X_2$$ are normally distributed, since their difference will then also be normally distributed, and approximate if they are not, by the same appeal to the central limit theorem made above.
In particular, if we test against a mean of zero then we can determine whether or not the mean of some property of the members of a population was significantly affected by an event that occurred between a first and a second measurement.
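The reduction of the paired test to the one sample test can be sketched as follows, assuming a hypothetical function `pairedT` and made-up measurements taken before and after some event.

```javascript
// Paired statistic: the one sample t-statistic of the element-wise
// differences between the second and the first measurements.
function pairedT(before, after, mu) {
  if (before.length !== after.length) {
    throw new Error('sample size mismatch');
  }
  var diffs = after.map(function (x, i) { return x - before[i]; });
  var n = diffs.length;
  var mean = diffs.reduce(function (a, x) { return a + x; }, 0) / n;
  var ss = diffs.reduce(function (a, x) {
    return a + (x - mean) * (x - mean);
  }, 0);
  return (mean - mu) / Math.sqrt(ss / (n - 1) / n);
}

var before = [10, 12, 9, 11]; // made-up first measurements
var after = [12, 15, 10, 13]; // made-up second measurements
console.log(pairedT(before, after, 0));
```

Here the differences are 2, 3, 1 and 2, having a mean of 2 and a sample variance of 2/3, so that by hand the statistic for a hypothesised mean difference of zero is $$2 / \sqrt{\frac23 / 4} = \sqrt{24}$$ with three degrees of freedom.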

### The Sample Variance

Noting that
\begin{align*} \sum_{i=0}^{n-1} \left(X_i - \mu^\prime\right)^2 &= \sum_{i=0}^{n-1} \left(X_i^2 - 2 \times X_i \times \mu^\prime + \mu^{\prime^2}\right)\\ &= \left(\sum_{i=0}^{n-1} X_i^2\right) - 2 \times n \times \mu^{\prime^2} + n \times \mu^{\prime^2} = \left(\sum_{i=0}^{n-1} X_i^2\right) - n \times \mu^{\prime^2} \end{align*}
we can represent the sample variance as
$s^2 = \frac{1}{n-1} \times \sum_{i=0}^{n-1} \left(X_i - \mu^\prime\right)^2 = \left(\frac{1}{n-1} \times \sum_{i=0}^{n-1} X_i^2\right) - \left(\frac{n}{n-1} \times \mu^{\prime^2}\right)$
Now it is entirely possible that rounding errors will result in a negative value for $$s^2$$, but only if the sample variance is extremely small in comparison to the square of the sample mean. In that case it is not unreasonable to set it to zero, as is done by the meanDev function defined in listing 1.

Listing 1: The Sample Mean And Variance
function meanDev(sample, name) {
  var s1 = 0;
  var s2 = 0;
  var n, i, xi;

  // check the type before reading the length so that non-arrays,
  // including undefined, raise our own error
  if(ak.nativeType(sample)!==ak.ARRAY_T) {
    throw new Error('invalid samples in ak.'+name);
  }
  n = sample.length;
  if(n<2) throw new Error('too few samples in ak.'+name);

  // accumulate the sum and the sum of squares in a single pass
  for(i=0;i<n;++i) {
    xi = sample[i];
    if(ak.nativeType(xi)!==ak.NUMBER_T || !isFinite(xi)) {
      throw new Error('invalid sample in ak.'+name);
    }
    s1 += xi;
    s2 += xi*xi;
  }
  s1 /= n;
  s2 /= n-1;
  s2 -= s1*s1*n/(n-1);

  // clamp negative variances resulting from rounding errors to zero
  return {m: s1, s2: Math.max(s2, 0)};
}
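To get a feel for the rounding errors that the clamp to zero guards against, the following sketch compares the single pass formula with a direct two pass calculation for three made-up values whose mean is vastly greater than their spread. The function names `onePassVariance` and `twoPassVariance` are hypothetical and are not part of the ak library.

```javascript
// Single pass formula: (sum of squares)/(n-1) - n/(n-1) * mean^2.
function onePassVariance(sample) {
  var n = sample.length;
  var s1 = 0;
  var s2 = 0;
  var i;
  for (i = 0; i < n; ++i) {
    s1 += sample[i];
    s2 += sample[i] * sample[i];
  }
  s1 /= n;
  return s2 / (n - 1) - s1 * s1 * n / (n - 1);
}

// Two pass formula: compute the mean first, then sum squared deviations.
function twoPassVariance(sample) {
  var n = sample.length;
  var mean = sample.reduce(function (a, x) { return a + x; }, 0) / n;
  var ss = sample.reduce(function (a, x) {
    return a + (x - mean) * (x - mean);
  }, 0);
  return ss / (n - 1);
}

// The squares are of the order of 1e18, well beyond the 2^53 limit of
// exactly representable integers, so their low order digits are lost.
var shifted = [1e9 + 1, 1e9 + 2, 1e9 + 3];
console.log(twoPassVariance(shifted));                // the true variance is 1
console.log(onePassVariance(shifted));                // corrupted by cancellation
console.log(Math.max(onePassVariance(shifted), 0));   // clamped as in listing 1
```

The single pass formula trades this loss of accuracy in extreme cases for only needing to iterate over the sample once, which is a reasonable bargain provided that the sample variance isn't tiny in comparison to the square of the sample mean.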


Listing 2 uses this to implement the one sample Student's t-statistic and test.

Listing 2: The One Sample T-Test
ak.studentsTStatOne = function(sample, mu) {
  var ms;

  if(ak.nativeType(mu)!==ak.NUMBER_T || !isFinite(mu)) {
    throw new Error('invalid mu in ak.studentsTStatOne');
  }

  ms = meanDev(sample, 'studentsTStatOne');
  // return zero for identical means, avoiding NaN when the variance is also zero
  return ms.m===mu ? 0 : Math.abs(ms.m-mu)*Math.sqrt(sample.length/ms.s2);
};

ak.studentsTTestOne = function(sample, mu) {
  var s = ak.studentsTStatOne(sample, mu);
  var t = ak.studentsTCDF(sample.length-1);
  // the probability that |t| exceeds the observed statistic
  return 1-(t(s)-t(-s));
};


Program 1 demonstrates the application of the test to a sample of five normally distributed values, firstly using the correct mean and secondly using an incorrect mean.

Program 1: Using ak.studentsTTestOne

Note that since the number of samples is relatively small the test for the correct mean can easily be thrown off by an outlier.

Next, the two sample statistic and test are implemented in listing 3.

Listing 3: The Two Sample T-Test
ak.studentsTStatTwo = function(sample1, sample2) {
  var ms1, ms2, n1, n2, s2;

  ms1 = meanDev(sample1, 'studentsTStatTwo');
  ms2 = meanDev(sample2, 'studentsTStatTwo');

  // the test assumes equal true variances, so reject samples whose
  // variances differ by more than a factor of four
  if(ms1.s2>4*ms2.s2 || ms2.s2>4*ms1.s2) {
    throw new Error('incompatible deviations in ak.studentsTStatTwo');
  }
  if(ms1.m===ms2.m) return 0;

  n1 = sample1.length;
  n2 = sample2.length;
  // the pooled variance multiplied by 1/n1 + 1/n2
  s2 = (1/n1 + 1/n2)*((n1-1)*ms1.s2 + (n2-1)*ms2.s2)/(n1+n2-2);
  return Math.abs(ms1.m-ms2.m)/Math.sqrt(s2);
};

ak.studentsTTestTwo = function(sample1, sample2) {
  var s = ak.studentsTStatTwo(sample1, sample2);
  var t = ak.studentsTCDF(sample1.length+sample2.length-2);
  return 1-(t(s)-t(-s));
};


This test is demonstrated by program 2 which also first compares a pair of samples with the same mean and then a pair with different means.

Program 2: Using ak.studentsTTestTwo

Lastly, listing 4 gives the implementation of the paired statistic and test.

Listing 4: The Paired Sample T-Test
ak.studentsTStatPaired = function(sample1, sample2, mu) {
  var n, i, x1i, x2i;

  if(ak.nativeType(sample1)!==ak.ARRAY_T) {
    throw new Error('invalid initial samples in ak.studentsTStatPaired');
  }
  if(ak.nativeType(sample2)!==ak.ARRAY_T) {
    throw new Error('invalid final samples in ak.studentsTStatPaired');
  }
  n = sample1.length;

  if(sample2.length!==n) {
    throw new Error('sample size mismatch in ak.studentsTStatPaired');
  }
  // copy the final samples so that the caller's array isn't modified
  sample2 = sample2.slice(0);

  for(i=0;i<n;++i) {
    x1i = sample1[i];
    x2i = sample2[i];

    if(ak.nativeType(x1i)!==ak.NUMBER_T || !isFinite(x1i)) {
      throw new Error('invalid initial sample in ak.studentsTStatPaired');
    }
    if(ak.nativeType(x2i)!==ak.NUMBER_T || !isFinite(x2i)) {
      throw new Error('invalid final sample in ak.studentsTStatPaired');
    }
    // replace each final sample with its difference from the initial one
    sample2[i] -= x1i;
  }
  return ak.studentsTStatOne(sample2, mu);
};

ak.studentsTTestPaired = function(sample1, sample2, mu) {
  var s = ak.studentsTStatPaired(sample1, sample2, mu);
  var t = ak.studentsTCDF(sample1.length-1);
  return 1-(t(s)-t(-s));
};


Program 3 applies this test to a pair of samples with the same mean and then to a pair with different means.

Program 3: Using ak.studentsTTestPaired

Finally, all of these functions can be found in StudentsTTest.js.

### References

[1] Time For A Chi Test, www.thusspakeak.com, 2022.

[2] A Jolly Student's Tea Party, www.thusspakeak.com, 2022.