Faces are normalised so that the eyes are 60 pixels apart. This distance is chosen to preserve detail while reducing computation. An area of 150 x 180 pixels, centred on the mid-point between the eyes, is used for recognition. This area roughly minimises the amount of background included in the description.
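This normalisation can be sketched as follows. The function names and the nearest-neighbour rescaling are illustrative stand-ins, not the paper's actual implementation; only the target inter-eye distance (60 pixels) and crop size (150 x 180) come from the text.

```python
import numpy as np

EYE_DIST = 60            # target inter-eye distance (pixels), from the text
CROP_W, CROP_H = 150, 180  # recognition area, from the text

def _rescale(img, scale):
    # Nearest-neighbour rescale; a stand-in for proper interpolation.
    h, w = img.shape
    new_h, new_w = max(1, int(h * scale)), max(1, int(w * scale))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[rows[:, None], cols]

def normalise_face(img, left_eye, right_eye):
    """img: 2-D grey-level array; eyes given as (row, col) positions."""
    (ly, lx), (ry, rx) = left_eye, right_eye
    scale = EYE_DIST / np.hypot(ry - ly, rx - lx)
    scaled = _rescale(img, scale)
    # The eye mid-point moves with the rescaling.
    cy = int(round((ly + ry) / 2 * scale))
    cx = int(round((lx + rx) / 2 * scale))
    # Pad with edge values so the crop never runs off the image.
    padded = np.pad(scaled, ((CROP_H, CROP_H), (CROP_W, CROP_W)), mode="edge")
    cy, cx = cy + CROP_H, cx + CROP_W
    return padded[cy - CROP_H // 2: cy + CROP_H // 2,
                  cx - CROP_W // 2: cx + CROP_W // 2]
```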
A first-derivative operator with a diameter of four pixels is used to derive an orientation image, which records the direction of the gradient vector at each pixel.
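A sketch of the orientation image computation. As an assumption, the paper's four-pixel-diameter derivative operator is replaced here by NumPy's central-difference gradient, which has a smaller support but yields the same kind of orientation and magnitude images.

```python
import numpy as np

def orientation_image(img):
    """Gradient direction (radians) and magnitude at every pixel.

    np.gradient is a stand-in (an assumption) for the paper's
    four-pixel first-derivative operator.
    """
    gy, gx = np.gradient(img.astype(float))   # d/drow, d/dcol
    return np.arctan2(gy, gx), np.hypot(gx, gy)
```

For a horizontal intensity ramp the gradient points along the x axis, so the orientation is zero everywhere away from the image border.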
The recognition area is sub-divided into regions called windows. A grid of n by n windows is used, so each window is rectangular (Figure 1). Such a grid is said to have dimension n.
A histogram is produced for each window, recording the orientation profile of the underlying sub-image. The orientation range is divided into 32 divisions, and the tally for a division is incremented by one for each gradient vector whose orientation falls within it. If the gradient magnitude at a pixel is zero, there is no directional information and no increment is made. Thus a feature vector of 32n^2 components is produced for the face.
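The windowing and per-window histograms can be sketched as below. This is a minimal sketch: the orient and mag arrays are assumed to come from the orientation operator described earlier, and the equal splitting of the recognition area into the grid is an assumption about the layout.

```python
import numpy as np

def feature_vector(orient, mag, n, bins=32):
    """One 32-bin orientation histogram per window of an n x n grid.

    Pixels with zero gradient magnitude carry no directional information
    and are skipped, as in the text. Returns 32 * n * n components.
    """
    h, w = orient.shape
    row_edges = np.linspace(0, h, n + 1).astype(int)
    col_edges = np.linspace(0, w, n + 1).astype(int)
    feats = []
    for i in range(n):
        for j in range(n):
            o = orient[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
            m = mag[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
            valid = o[m > 0]                     # discard zero-magnitude pixels
            hist, _ = np.histogram(valid, bins=bins, range=(-np.pi, np.pi))
            feats.append(hist)
    return np.concatenate(feats).astype(float)
```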
A standard nearest-neighbour recognition method based on this feature vector is used to find the training face that best matches a test face: the test face is classified according to the training face to which it is closest, in terms of least Euclidean distance.
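A minimal sketch of the nearest-neighbour rule (array and function names are illustrative):

```python
import numpy as np

def classify(test_vec, train_vecs, train_labels):
    """Return the label of the training vector nearest to test_vec.

    train_vecs: shape (num_train, d). Minimising squared Euclidean
    distance is equivalent to minimising Euclidean distance.
    """
    d2 = np.sum((train_vecs - test_vec) ** 2, axis=1)
    return train_labels[int(np.argmin(d2))]
```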
Suppose the test face images are obscured to some extent. When a test image is compared to a training image, the 32 components for each of the n^2 windows are employed in a sum-of-squares match. Consider the match on a window-by-window basis. If it is assumed that obscured windows in the test image are more likely to produce worse matches, omitting the poorer-matching windows might enhance recognition performance. The following technique is therefore explored: for each test-to-training image comparison, the matches for the n^2 windows are sorted in ascending order and the first x are summed. A range of values of x is employed.
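Under the assumption that the feature vector is laid out window by window (32 consecutive components per window), the best-x matching scheme might be sketched as:

```python
import numpy as np

def partial_distance(test_vec, train_vec, n, x, bins=32):
    """Sum the x best (smallest) of the n*n window-level
    sum-of-squares matches, discarding the worst windows.
    """
    diffs = (test_vec - train_vec).reshape(n * n, bins)
    window_scores = np.sum(diffs ** 2, axis=1)   # one match score per window
    return float(np.sum(np.sort(window_scores)[:x]))
```

With x = n^2 this reduces to the ordinary sum-of-squares distance; smaller x ignores the worst-matching windows, which are hoped to be the obscured ones.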