The USC Andrew and Erna Viterbi School of Engineering | USC Signal and Image Processing Institute | USC Ming Hsieh Department of Electrical and Computer Engineering | University of Southern California

Technical Report USC-SIPI-178

“Hierarchical Representation, Matching, and Search for Some Computer Vision Problems”

by Vadlamannati Venkateswar

August 1991

Representation, matching, and search are central to the task of scene interpretation. Representation refers to how the scene and models are organized for subsequent interpretation tasks, such as recognition. A hierarchical representation scheme for scenes and models allows for robust interpretation strategies that exploit the different levels of the hierarchy. In this scheme, features in the scene are composed into more complex objects based on physical and geometric properties, and the models are similarly decomposed into simpler features. Such hierarchies are concisely represented using frames. For matching, we support a hierarchical strategy in which features at the highest level of the hierarchy are matched first and those at the lowest level last. Higher-level features are fewer in number and have more structure, so they are easier to match. These matches then constrain the matches at lower levels. The ambiguities involved in feature grouping and matching give rise to a search space. As an alternative to relaxation and tree search techniques, we advocate a search scheme based on an Assumption-based Truth Maintenance System (ATMS) for exploring this search space. The ATMS supports simultaneous search in multiple contexts, enforces binary or higher-order constraints, assists in symbolic uncertainty reasoning, and carries out the belief revisions necessitated by incremental additions, deletions, and confirmations of feature and match hypotheses.
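The top-down matching order described above can be illustrated with a minimal sketch. It is not the report's implementation: the Frame class, the length attribute, the tolerance, and the compatible test are all hypothetical stand-ins chosen for illustration. The point it shows is that lower-level features are compared only inside parents that have already been matched.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical frame: one node in a feature hierarchy."""
    name: str
    kind: str                    # e.g. "surface" or "line"
    length: float = 0.0          # toy attribute used by the similarity test
    parts: list = field(default_factory=list)


def compatible(a, b, tol=0.2):
    """Toy test (an assumption): same kind and relative length difference <= tol."""
    if a.kind != b.kind:
        return False
    scale = max(a.length, b.length, 1e-9)
    return abs(a.length - b.length) / scale <= tol


def match_hierarchy(scene_nodes, model_nodes):
    """Match higher-level frames first; recurse into parts only for matched parents."""
    matches = []
    for s in scene_nodes:
        for m in model_nodes:
            if compatible(s, m):
                matches.append((s.name, m.name))
                # The parent match restricts which lower-level pairs are tried.
                matches.extend(match_hierarchy(s.parts, m.parts))
    return matches


scene = [
    Frame("S1", "surface", 10.0, [Frame("a", "line", 4.0), Frame("b", "line", 6.0)]),
    Frame("S2", "surface", 30.0, [Frame("c", "line", 4.0)]),
]
model = [
    Frame("M1", "surface", 10.5, [Frame("p", "line", 4.1), Frame("q", "line", 6.2)]),
]
result = match_hierarchy(scene, model)
```

In this toy scene, line c is never compared with the model lines, even though it resembles p, because its parent S2 does not match M1. When several pairs are compatible, the function returns all of them as competing hypotheses rather than choosing one.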

We design three applications based on these general principles: building detection, stereo matching, and motion correspondence. An appropriate representation for these tasks is a hierarchy consisting of lines, vertices, edges, and surfaces. In building detection, a dynamic search scheme based on an ATMS assists in integrating bottom-up and top-down information in the search for roofs. In stereo matching, a hierarchical matching strategy, which is also used for motion correspondence, yields both point and line matches. This framework can be applied to other problems, such as object recognition from 2D or 3D data using 3D models and multi-sensor fusion.
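As a rough illustration of how an ATMS-style search keeps several interpretations alive at once, the sketch below treats match hypotheses as assumptions and records mutually inconsistent sets as "nogoods". It is a deliberately simplified stand-in, not a full ATMS (there are no justifications or label propagation), and the hypothesis names and function names are assumptions made for this example.

```python
from itertools import combinations


def is_consistent(env, nogoods):
    """An environment is consistent if it contains no nogood as a subset."""
    return not any(ng <= env for ng in nogoods)


def maximal_consistent_environments(assumptions, nogoods):
    """Enumerate maximal sets of assumptions that violate no nogood.

    Brute force over subsets, largest first; only suitable for tiny examples.
    """
    assumptions = list(assumptions)
    found = []
    for size in range(len(assumptions), 0, -1):
        for combo in combinations(assumptions, size):
            env = frozenset(combo)
            if any(env <= big for big in found):
                continue  # already covered by a larger consistent environment
            if is_consistent(env, nogoods):
                found.append(env)
    return found


# Hypothetical match hypotheses: scene feature -> model feature.
hypotheses = ["a-p", "a-q", "b-q"]
# Binary constraints: a feature may take part in at most one match.
nogoods = [frozenset({"a-p", "a-q"}), frozenset({"a-q", "b-q"})]
contexts = maximal_consistent_environments(hypotheses, nogoods)

# Belief revision: hypothesis "a-p" is later refuted, so it becomes a nogood itself.
revised = maximal_consistent_environments(hypotheses, nogoods + [frozenset({"a-p"})])
```

Each returned environment corresponds to one context that the search can pursue. Adding a nogood removes every context that depends on the refuted hypothesis without restarting the search from scratch, which is the behaviour the belief-revision step is meant to suggest.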

The report is available for download in PDF format: USC-SIPI-178.pdf (8.2 MB)