If The Photo Fits...

| 0 Comments

Figure 1: Spooky Visions
In the last post[1] I describe how we might apply principal component analysis[2] to facial images[3] to create the rather spooky looking eigenfaces[4] that capture the most significant differences between them and their somewhat surprisingly attractive averages, as illustrated by figure 1 in which the eight most significant eigenfaces surround the averages of the male and female images.

We got as far as implementing helper functions to load bitmaps into ImageData objects and convert them to intensity matrices, deferring the implementation of the algorithm to actually generate eigenfaces, and a supposed application using them, to this post.

My Face, Your Face, All Our Faces From Eigenfaces

Recall that the eigensystem of a matrix \(\mathbf{M}\) is the collection of vectors \(\mathbf{v}_i\) and numbers \(\lambda_i\) satisfying
\[ \mathbf{M} \times \mathbf{v}_i = \lambda_i \times \mathbf{v}_i \]
and that a principal component analysis involves computing the eigensystem of the covariance matrix \(\mathbf{\Sigma}\) of a sample of \(n\) vectors \(\mathbf{w}_k\), having elements
\[ \mathbf{\Sigma}_{ij} = \frac{1}{n} \sum_{k=1}^n \left(\mathbf{w}_k - \mathbf{\bar{w}}\right)_i \times \left(\mathbf{w}_k - \mathbf{\bar{w}}\right)_j \]
where \(\sum\) is the summation sign and \(\mathbf{\bar{w}}\) is the mean of the sample.
If we define a matrix \(\mathbf{W}\) whose rows are equal to the vectors \(\mathbf{w}_k - \mathbf{\bar{w}}\) then we can define \(\mathbf{\Sigma}\) with
\[ \begin{align*} \mathbf{W}_{ki} &= \left(\mathbf{w}_k - \mathbf{\bar{w}}\right)_i\\ \mathbf{\Sigma} &= \frac{1}{n} \left(\mathbf{W}^\mathrm{T} \times \mathbf{W}\right) \end{align*} \]
where \(\mathbf{W}^\mathrm{T}\) is the transpose of \(\mathbf{W}\) in which its rows and columns are swapped.
To compute the eigenfaces we simply convert the images into vectors by copying their pixel intensities row-wise from the top-left to the bottom-right and perform a principal component analysis upon them. Now, since I chose to convert images to matrices rather than vectors with ak.imageDataToMatrix, we shall need to implicitly row-wise convert them to vectors during the construction of \(\mathbf{W}\)
\[ \mathbf{W}_{k \; (i \times c + j)} = \left(\mathbf{F}_k - \mathbf{\bar{F}}\right)_{ij}\\ \]
where \(\mathbf{F}_k\) is the intensity matrix of the \(k^\mathrm{th}\) face, each of which has \(c\) columns, and \(\mathbf{\bar{F}}\) is their mean.

Our implementation of the eigenface generation algorithm, ak.eigenfaces, is given in listing 1.

Listing 1: ak.eigenfaces
ak.EIGENFACES_T = 'ak.eigenfaces';

function Eigenfaces(){}
Eigenfaces.prototype = {TYPE: ak.EIGENFACES_T, valueOf: function(){return ak.NaN;}};

ak.eigenfaces = function() {
 var ef    = new Eigenfaces();
 var state = {m:undefined, f:[], w:[]};
 var arg0  = arguments[0];

 constructors[ak.nativeType(arg0)](state, arg0, arguments);

 ef.mean   = function()  {return state.m;};
 ef.faces  = function()  {return state.f.length;};
 ef.face   = function(i) {return state.f[Number(i)];};
 ef.weight = function(i) {return state.w[Number(i)];};

 ef.encode = function(face) {return encode(face, state.m, state.f);};
 ef.decode = function(code) {return decode(code, state.m, state.f);};

 return Object.freeze(ef);
};

var constructors = {};

This follows our typical scheme for classes; initialisation of the state is deferred to a constructors object and methods are provided for property access and encoding and decoding of face images, with the latter simply forwarding to free functions to do the actual work. Note that the faces property returns the number of eigenfaces, the face property returns the ith eigenface and the weight property a measure of its contribution to the sample as a whole.

Constructing ak.eigenfaces Objects

As you may have guessed from the names of the property access methods, the m and f members of the state object will contain the mean of the facial images and an array of their eigenfaces respectively. Given that the purpose of eigenfaces is to enable the approximation of facial images with a relatively small set of them, the principal constructor takes as its first argument the number of eigenfaces that we wish to use and the set of images from which we'll generate them, as illustrated in listing 2.

Listing 2: ak.eigenfaces Construction
constructors[ak.NUMBER_T] = function(state, n, args) {
 var arg1 = args[1];

 state.f.length = n;
 state.w.length = n;
 if(state.f.length<2) throw new Error('too few eigenfaces in ak.eigenfaces');
 constructors[ak.NUMBER_T][ak.nativeType(arg1)](state, arg1, args);
};

constructors[ak.NUMBER_T][ak.ARRAY_T] = function(state, faces, args) {
 var arg2 = args[2];

 if(faces.length<state.f.length) throw new Error('too few faces in ak.eigenfaces');
 constructors[ak.NUMBER_T][ak.ARRAY_T][ak.nativeType(arg2)](state, faces, arg2);
};

constructors[ak.NUMBER_T][ak.ARRAY_T][ak.UNDEFINED_T] = function(state, faces) {
 constructors[ak.NUMBER_T][ak.ARRAY_T][ak.NUMBER_T](state, faces, ak.EPSILON);
};

constructors[ak.NUMBER_T][ak.ARRAY_T][ak.NUMBER_T] = function(state, faces, eps) {
 var i, m, s, t, e, r, rows, cols;

 faces = faces.slice(0);
 for(i=0;i<faces.length;++i) {
  if(ak.type(faces[i])!==ak.MATRIX_T) faces[i] = ak.imageDataToMatrix(faces[i]);
 }

 m = sampleMean(faces);
 s = sampleMatrix(faces, m);
 t = ak.transpose(s);
 
 e = ak.jacobiDecomposition(ak.mul(s, t), eps);
 s = ak.transpose(ak.mul(t, e.v())).toArray();
 r = componentRank(e.lambda());

 rows = m.rows();
 cols = m.cols();

 state.m = m;
 for(i=0;i<state.f.length;++i) {
  m = ak.matrix(rows, cols, s[r[i]]);
  state.f[i] = ak.div(m, ak.abs(m));
  state.w[i] = Math.sqrt(Math.max(e.lambda().at(r[i]), 0));
 }
};

Note that we have an optional third argument for a numerical convergence threshold for the PCA that defaults to ak.EPSILON in constructors[ak.NUMBER_T][ak.ARRAY_T][ak.UNDEFINED_T].
Now clearly constructors[ak.NUMBER_T][ak.ARRAY_T][ak.NUMBER_T] is where the action is and so bears some explanation. The faces array is expected to contain intensity matrices and/or ImageData objects, so the first thing we do is convert any of its elements that are not matrices with ak.imageDataToMatrix. We then compute the sample mean with the helper function sampleMean and the matrix whose rows are the differences between their vector representations and its with sampleMatrix. Finally, we use our trick of transforming the eigenvectors \(\mathbf{v}_i^\prime\) of
\[ \mathbf{\Sigma}^\prime = \mathbf{W} \times \mathbf{W}^\mathrm{T} \]
into the eigenvectors \(\mathbf{v}_i\) of the actual covariance matrix
\[ \mathbf{\Sigma} = \mathbf{W}^\mathrm{T} \times \mathbf{W} \]
with
\[ \mathbf{v}_i = \frac{\mathbf{W}^\mathrm{T} \times \mathbf{v}_i^\prime}{\left|\mathbf{W}^\mathrm{T} \times \mathbf{v}_i^\prime\right|} \]
as described in the previous post. Here we do this by computing the eigensystem of \(\mathbf{\Sigma}^\prime\) with

e = ak.jacobiDecomposition(ak.mul(s, t), eps);

and the matrix whose columns are \(\mathbf{W}^\mathrm{T} \times \mathbf{v}_i^\prime\) with

ak.mul(t, e.v())

We then transpose this and convert it into an array, stored in s, whose elements are arrays equal to \(\mathbf{W}^\mathrm{T} \times \mathbf{v}_i^\prime\). The helper function componentRank gives the indices of largest to the smallest eigenvalue which we use to recover the most significant eigenvectors and convert them to intensity matrices in the final loop with

m = ak.matrix(rows, cols, s[r[i]]);
state.f[i] = ak.div(m, ak.abs(m));

Note that we normalise the eigenfaces by dividing the resulting intensity matrices by their absolute values rather than constructing them from normalised eigenvectors. The weights are set to the square roots of their associated eigenvalues which, you will recall, are the variances of the sample in their directions

state.w[i] = Math.sqrt(Math.max(e.lambda().at(r[i]), 0));

Now the variances should be non-negative, but just in case numerical error nudges them below zero we replace negative values with zero before taking the square root.

The helper functions are relatively straightforward, as illustrated by listing 3

Listing 3: sampleMean, sampleMatrix And componentRank
function sampleMean(faces) {
 var n = faces.length;
 var m = faces[0];
 var i;

 for(i=1;i<n;++i) m = ak.add(m, faces[i]);
 return ak.div(m, n);
}

function sampleMatrix(faces, mean) {
 var rows = mean.rows();
 var cols = mean.cols();
 var n = faces.length;
 var s = new Array(n*rows*cols);
 var off = 0;
 var i, face, r, c;

 for(i=0;i<n;++i) {
  face = faces[i];
  for(r=0;r<rows;++r) for(c=0;c<cols;++c) s[off++] = face.at(r, c)-mean.at(r, c);
 }
 return ak.matrix(n, rows*cols, s);
}

function componentRank(eigenvalues) {
 var l = eigenvalues.toArray();
 var n = l.length;
 var r = new Array(n);
 var i;

 for(i=0;i<n;++i) r[i] = i;
 r.sort(function(a,b){return l[b]-l[a];});
 return r;
}

The only slightly subtle point in these is that in sampleMatrix we calculate \(\mathbf{W}\) by filling the array s with its elements in row-wise order with

s[off++] = face.at(r, c)-mean.at(r, c);

and then using it to construct an ak.matrix with the correct dimensions.

Encoding And Decoding Faces

Recall that to encode a face into eigenfaces we simply transform it into a vector \(\mathbf{w}\) and calculate the product of its difference from the mean vector \(\mathbf{\bar{w}}\) with the eigenvectors \(\mathbf{v}_j\)
\[ y_j = \left(\mathbf{w} - \mathbf{\bar{w}}\right) \times \mathbf{v}_j \]
Once again, we shall compute these vector terms implicitly from the relevant intensity matrices, as illustrated in listing 4.

Listing 4: encode
function encode(face, m, f) {
 var rows = m.rows();
 var cols = m.cols();
 var n = f.length;
 var code = new Array(n);
 var i, eface, s, r, c;
   
 if(ak.type(face)!==ak.MATRIX_T) face = ak.imageDataToMatrix(face);
 face = ak.sub(face, m);

 for(i=0;i<n;++i) {
  eface = f[i];

  s = 0;
  for(r=0;r<rows;++r) for(c=0;c<cols;++c) s += face.at(r, c) * eface.at(r, c);
  code[i] = s;
 }
 return ak.vector(code);
}

As we did in the constructor, we assume that the face argument is either an intensity matrix or an ImageData object and convert the latter to the former. The implicit vector product is computed by iterating over the rows and columns of the matrices an adding up the products of their elements with

s += face.at(r, c) * eface.at(r, c);

Decoding an encoded face is even simpler; we simply multiply the intensity matrices of the eigenfaces by the elements in the code associated with them and add the results to the mean face's intensity matrix, as shown in listing 5.

Listing 5: decode
function decode(code, m, f) {
 var n = f.length;
 var i;

 if(ak.type(code)!==ak.VECTOR_T || code.dims()!==n) {
  throw new Error('invalid code in ak.eigenfaces.decode');
 }

 for(i=0;i<n;++i) m = ak.add(m, ak.mul(f[i], code.at(i)));
 return m;
}

Note that all of these functions can be found in Eigenfaces.js.

Eigenfaces In Action

I shall depart from my usual practice of demonstrating algorithms with programs that you can edit to get a feel for how they behave and instead present a graphical application that uses eigenfaces. Specifically, it is a rather nifty aid to facial recall that was proposed as an alternative to photofit for witnesses trying to construct facial images from memory[5].

Application 1: Photo Recall
The way it works is to randomly generate a set of eigenface codes with some fractions of their associated weights, both positive and negative, representing faces that are similar to the average. These are decoded into images, from which the witness selects the one that most closely resembles the face that they are trying to recall. This then replaces the average as a first approximation and a new set of codes that deviate from its code by slightly smaller random fractions of the weights are generated.
The process is repeated until a code that approximates the face in question is discovered. Note that this is essentially a blindfolded hill climber multivariate minimisation algorithm, such as we covered a few posts ago[6], albeit one in which the witness takes the role of the objective function.
Note that this isn't using eigenfaces to compress facial images but is rather using their associated weights to identify the best ways to modify them, for some definition of best.

An implementation is given in application 1, which you can start by clicking on either the male or female average face. After a short delay, during which the eigenfaces are being generated, you will be presented with eight new faces to choose from, with the average in the centre.
Figure 2: Westley And Buttercup
If one of the eight faces resembles the one you have in mind, click on it and it will be moved to the centre and surrounded with eight new faces that deviate from it by a slightly smaller amount. If not, click on the centre face and you'll get a new set to choose from that deviate from it by the same amount.

Unfortunately, you'll find that the faces quickly accumulate a fair bit of noise since the sample of facial images is both relatively small and crudely normalised. It does help a little if you squint, which you might try with my attempt at finding the faces of the principal characters of the greatest movie ever made, given in figure 2.

Now admittedly these aren't particularly accurate portraits but, given the deficiencies of the facial image sample noted above, I think that they're still a pretty damned good demonstration of how effective eigenfaces can be for constructing facial images!

References

[1] Who Are You Calling Average?, www.thusspakeak.com, 2014.

[2] The Principals Of The Things, www.thusspakeak.com, 2014.

[3] Kennedy, K. M., Hope, K. & Raz, N., Lifespan Adult Faces: Norms for Age, Familiarity, Memorability, Mood, and Picture Quality, Experimental Aging Research, Vol. 35, No. 2, pp. 268-275, 2009.

[4] Sirovich, L. & Kirby, M., Low-Dimensional Procedure for the Characterization of Human Faces, Journal of the Optical Society of America A, Vol. 4, No. 3, pp. 519-524, 1987.

[5] Rosenthal, Y., de Jager, G. & Greene, J., A Computerized Face Recall System Using Eigenfaces. Proceedings of the Eighth Annual South African Workshop on Pattern Recognition, pp. 53–57, Rhodes University, Grahamstown, 1997.

[6] It's All Downhill From Here, www.thusspakeak.com, 2014.

Leave a comment