The USC Andrew and Erna Viterbi School of Engineering
USC Signal and Image Processing Institute
USC Ming Hsieh Department of Electrical Engineering
University of Southern California

Technical Report USC-SIPI-419

“Biologically Inspired Overcomplete Representation, Feature Extraction and Object Classification”

by Pankaj Mishra

August 2011

A key to solving the multiclass object recognition problem is to extract a set of features that accurately and uniquely captures the salient characteristics of different objects. We show that complementary kinds of feature sets, e.g., those based on local, mid-level, and global characteristics, can be combined to significantly improve recognition accuracy over that obtained using individual feature sets or subcombinations of them. First, we extract a set of local features based on a modified HMAX model, a hierarchical computational framework inspired by the mammalian visual cortex. One of our modifications uses natural-stimuli adapted filters in place of Gabor filters. Overcomplete sets of basis functions based on sparseness maximization criteria have been reported to closely mimic the mammalian primary visual cortex (V1), in the sense that the resulting basis functions are typically localized, oriented, and bandpass, as are filters in V1. These overcomplete basis functions allow a smooth transition of coefficients and a high degree of specificity to image statistics. Using these natural-stimuli adapted filters with the HMAX model increases its biological plausibility. The resulting features are largely scale, translation, and rotation invariant. Second, we extract contextual information using modified Gist and spatial pyramid based features. Third, to capture larger contours and edges, we extract features based on the Gestalt principle of continuity in visual perception. We combine these feature sets using confidence measures derived from posterior probabilities of discriminative models. Each posterior probability in our case is based on support vector machine (SVM) decision boundaries, in part because SVMs have been shown to do well on large datasets. Different combinations of confidence measures are explored. The most significant improvements are gained using non-trainable fusion techniques.
We demonstrate significant improvement in object recognition performance (over individual feature sets) using the publicly available Caltech-101 and 17-species Oxford Flowers datasets. The progressive addition of feature sets always resulted in a performance improvement, though the incremental gains varied.
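
To make the idea of non-trainable fusion concrete, the following is a minimal sketch of how per-feature-set class posteriors might be combined with fixed rules. It is an illustration only, not the method used in the report. The function names `fuse_posteriors` and `predict`, the choice of sum, product, and max rules, and the example probabilities are all assumptions. The posteriors are assumed to be already available, for instance from one classifier per feature set.

```python
from math import prod


def fuse_posteriors(posteriors, rule="sum"):
    """Combine class posteriors from several feature sets with a fixed rule.

    posteriors[k][c] is the (assumed given) posterior of class c according
    to the classifier trained on feature set k. No fusion weights are
    learned, which is what makes the rule non-trainable.
    Returns fused scores normalized to sum to 1.
    """
    if not posteriors:
        raise ValueError("need at least one posterior vector")
    n_classes = len(posteriors[0])
    if any(len(p) != n_classes for p in posteriors):
        raise ValueError("all posterior vectors must have the same length")

    # Regroup so that each tuple holds every feature set's score for one class.
    per_class = list(zip(*posteriors))

    if rule == "sum":
        scores = [sum(col) for col in per_class]
    elif rule == "product":
        scores = [prod(col) for col in per_class]
    elif rule == "max":
        scores = [max(col) for col in per_class]
    else:
        raise ValueError(f"unknown fusion rule: {rule!r}")

    total = sum(scores)
    if total == 0:
        # Every class was vetoed (possible under the product rule).
        return [1.0 / n_classes] * n_classes
    return [s / total for s in scores]


def predict(posteriors, rule="sum"):
    """Return the index of the class with the highest fused score."""
    fused = fuse_posteriors(posteriors, rule)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical posteriors over three classes from three feature sets.
local_features = [0.6, 0.3, 0.1]
contextual_features = [0.3, 0.5, 0.2]
contour_features = [0.5, 0.3, 0.2]
all_sets = [local_features, contextual_features, contour_features]

for rule in ("sum", "product", "max"):
    print(rule, predict(all_sets, rule), fuse_posteriors(all_sets, rule))
```

In this sketch the contextual feature set alone would pick the second class, while the fused decision follows the other two sets. The product rule lets any single feature set veto a class by assigning it a posterior of zero, whereas the sum rule is more tolerant of one poorly calibrated classifier.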

To download the report in PDF format, click here: USC-SIPI-419.pdf (1.1Mb)