“Perceptual Grouping and Segmentation Using Neural Networks”
by B.S. Manjunath
December 1991
Grouping and segmentation occur at all levels of the visual information processing hierarchy. Here we address their role during the early stages of vision, using neural networks as the computing paradigm. Neural networks offer a fresh perspective on many early vision problems: a parallel and distributed computing environment with the associated fault tolerance; a homogeneous architecture that may help integrate different visual cues and incorporate active interactions between different processing stages; and the ability to learn and self-organize in a continuously changing environment. In addition, they help bridge the gap between computer vision and human vision research.
We first investigate the use of neural networks from a parallel computing perspective. Most low-level vision problems, such as model-based texture segmentation, can be formulated in an optimization framework. In the context of texture segmentation, we show how models based on Markov random fields map naturally onto networks for optimization. We develop a stochastic learning system that combines the speed of deterministic relaxation algorithms with the sustained exploration of the search space characteristic of stochastic algorithms. We discuss extensions to unsupervised segmentation and the advantages of keeping parameter estimation separate from segmentation.
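As a rough illustration of this optimization view (a minimal sketch, not the formulation developed in the thesis), the code below casts segmentation as minimizing an MRF energy: a hypothetical per-pixel data term `unary`, assumed to hold label costs such as negative log-likelihoods under the Markov random field texture models mentioned above, plus a Potts-style smoothness prior. Each pixel is usually updated by a deterministic relaxation step, but occasionally by a Gibbs-style random move so that the search keeps exploring. The function name, the neighborhood structure, `beta`, and the exploration probability are all illustrative choices.

```python
import numpy as np

def segment(unary, beta=1.0, n_iter=50, p_random=0.1, rng=None):
    """Label an image by approximately minimizing a simple MRF energy.

    unary    : (H, W, K) array of per-pixel label costs (e.g., negative
               log-likelihoods under K texture models).
    beta     : weight of the Potts smoothness prior over 4-neighborhoods.
    p_random : probability of taking a stochastic (exploratory) move
               instead of the locally best label.
    """
    rng = rng or np.random.default_rng(0)
    H, W, K = unary.shape
    labels = unary.argmin(axis=2)          # deterministic initial guess

    for _ in range(n_iter):
        for i in range(H):
            for j in range(W):
                # Data term plus smoothness penalty for each candidate label.
                cost = unary[i, j].astype(float)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        cost += beta * (np.arange(K) != labels[ni, nj])
                if rng.random() < p_random:
                    # Stochastic exploration: sample from a Gibbs distribution.
                    p = np.exp(-(cost - cost.min()))
                    labels[i, j] = rng.choice(K, p=p / p.sum())
                else:
                    # Deterministic relaxation: take the locally best label.
                    labels[i, j] = cost.argmin()
    return labels
```

In the supervised setting the data term comes from texture models fitted beforehand; keeping that parameter estimation separate from the labeling step, as discussed above, means only `unary` changes when the models are re-estimated.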
While model-based approaches to segmentation incorporate knowledge about the textures to obtain fairly accurate results, they provide little insight into human texture perception. The second half of this research is biologically motivated: we model some of the early processing stages in vision. We study pre-attentive texture perception within the more general framework of boundary detection, and develop a unified approach to detecting intensity edges and texture edges as well as illusory contours. We clearly demonstrate the role of end-inhibition, modeled using local scale interactions, in perceiving texture boundaries and subjective contours. Visual illusions are a consequence of wired-in assumptions about the real world, which biological systems exploit in order to process the enormous amount of visual data in real time. We suggest that the role of end-inhibited cells in the detection of illusory contours is not accidental: end-inhibition provides a robust model for feature detection and representation, and could be used to represent shape information. We demonstrate this by developing a simple face recognition system in which object recognition is formulated as an inexact graph matching problem. Extensive experimental results illustrate the performance of the various algorithms.
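The sketch below is one assumption-laden way to render the idea of end-inhibition through local scale interactions, not the thesis's exact model: the magnitude response of a Gabor filter at a fine scale is inhibited by the response at a coarser scale, so the output survives mainly where fine-scale structure terminates, which is where texture boundaries and the inducers of subjective contours tend to lie. The kernel construction, the frequency-to-scale coupling, `gamma`, and the rectifying nonlinearity are illustrative choices.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(sigma, theta, freq, size=None):
    """Complex Gabor kernel at one scale and orientation."""
    size = size or int(6 * sigma) | 1          # odd support, roughly +/- 3 sigma
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.exp(2j * np.pi * freq * xr)

def scale_interaction(image, theta, sigma1=2.0, sigma2=4.0, gamma=0.9):
    """End-inhibition-like response from two Gabor scales.

    The fine-scale magnitude response is inhibited by the (weighted)
    coarse-scale response; the rectified difference is largest near
    line terminations and texture boundaries.
    """
    w1 = np.abs(fftconvolve(image, gabor_kernel(sigma1, theta, 1.0 / (4 * sigma1)), mode='same'))
    w2 = np.abs(fftconvolve(image, gabor_kernel(sigma2, theta, 1.0 / (4 * sigma2)), mode='same'))
    return np.maximum(w1 - gamma * w2, 0.0)    # half-wave rectification
```

Pooling such responses over several orientations and marking their local maxima gives one plausible way to read out intensity, texture, and illusory-contour boundaries from the same representation, in the spirit of the unified approach described above.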