“Image Segmentation by Clustering”
by Guy Barrett Coleman
July 1977
The segmentation of imagery into homogeneous regions using digital techniques has been a goal of researchers for the past several years. Pattern recognition approaches using mathematical models have achieved only partially satisfactory results. The large dimension of the pattern space and the quantity of data involved in the digital representation of images are in part responsible for the limited applicability of these approaches. Other shortcomings stem from the need for data with which to train the classifier.
Approaches based on linguistic models have also been tried, again with only partially satisfactory results. Their most serious shortcoming is poor performance in the presence of noise, a condition under which humans have learned to function effectively.
This dissertation describes a procedure for segmenting imagery using digital techniques and is based on the mathematical model. The classifier does not require training prototypes; that is, it operates in an "unsupervised" mode. The procedure is general in that the algorithm itself selects the features most useful for the particular image to be segmented. The algorithm operates without any human interaction.
The features used are based on brightness and texture in regions centered on every picture element in the image. To perform an elementary pre-classification of local regions, a filter based on the mode of the local area histogram is proposed and used in segmenting images.
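As an illustration of a filter based on the mode of the local area histogram, the following minimal sketch replaces each picture element with the most frequent gray level in a square window around it. The window shape, the `radius` parameter, the border handling, and the tie-breaking rule are assumptions made for the example; the abstract does not specify them.

```python
from collections import Counter


def mode_filter(image, radius=1):
    """Replace each pixel with the most frequent gray level in its window.

    `image` is a list of rows of integer gray levels. The square window of
    half-width `radius` is an assumption, not taken from the dissertation.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Local histogram over the window, clipped at the image border.
            hist = Counter(
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            )
            # Ties are broken toward the lowest gray level (arbitrary choice).
            top = max(hist.values())
            out[r][c] = min(v for v, n in hist.items() if n == top)
    return out
```

For example, a 3 x 3 image of gray level 5 with a single outlier of 9 at the center is returned as uniform 5, since the outlier is never the mode of any window.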
The basic procedure is a K-means clustering algorithm which converges to a local minimum in the average squared intra-cluster distance for a specified number of clusters. The algorithm iterates on the number of clusters, evaluating each clustering with a parameter of clustering quality. The parameter proposed is the product of between-cluster and within-cluster scatter measures; it achieves a maximum value that is postulated to correspond to an intrinsic number of clusters in the data.
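The iteration over the number of clusters can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the abstract does not define the scatter measures, so the sketch assumes they are the summed squared deviations of points from their cluster centers (within) and of cluster centers from the grand mean, weighted by cluster size (between). The function names and the deterministic initialization are likewise hypothetical.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(pts):
    """Component-wise mean of a non-empty list of feature tuples."""
    n = len(pts)
    return tuple(sum(coord) / n for coord in zip(*pts))


def kmeans(points, k, iters=100):
    """Plain K-means (Lloyd iteration); returns (centers, labels)."""
    # Deterministic start (assumption): k points spread through the input.
    step = len(points) / k
    centers = [points[int(i * step)] for i in range(k)]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sqdist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: local minimum reached
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous center
                centers[j] = mean(members)
    return centers, labels


def scatter_product(points, centers, labels):
    """Product of within- and between-cluster scatter (assumed definitions)."""
    grand = mean(points)
    within = sum(sqdist(p, centers[lab]) for p, lab in zip(points, labels))
    between = sum(
        labels.count(j) * sqdist(c, grand) for j, c in enumerate(centers)
    )
    return within * between


def best_k(points, k_max):
    """Try k = 1..k_max and return (k with the largest product, all scores)."""
    scores = {}
    for k in range(1, k_max + 1):
        centers, labels = kmeans(points, k)
        scores[k] = scatter_product(points, centers, labels)
    return max(scores, key=scores.get), scores
```

With the four one-dimensional points `[(0.0,), (1.0,), (10.0,), (11.0,)]` and `k_max=4`, the product is zero at k = 1 (no between-cluster scatter) and at k = 4 (no within-cluster scatter), and `best_k` selects k = 2. Under these assumed definitions the two scatters sum to a constant, so the peak need not coincide with the visually obvious cluster count on other data.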
It has been impossible in the past to compare different segmentations of the same image. A comparison measure based on the joint histogram of the two segmentations is proposed, and examples of its use are presented.
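To make the idea of a joint histogram of two segmentations concrete, the sketch below counts how often each pair of region labels occurs at the same picture element, then derives a simple agreement score from it. The score shown here is a hypothetical stand-in chosen for illustration; the abstract does not state the actual comparison measure, and the function names are assumptions.

```python
from collections import Counter


def joint_histogram(seg_a, seg_b):
    """Count co-occurrences of (label in seg_a, label in seg_b) per pixel."""
    return Counter(
        (a, b)
        for row_a, row_b in zip(seg_a, seg_b)
        for a, b in zip(row_a, row_b)
    )


def agreement(seg_a, seg_b):
    """Fraction of pixels lying in the dominant seg_b label of each seg_a region.

    Hypothetical measure: 1.0 when every region of seg_a maps onto a single
    label of seg_b, regardless of how the labels are numbered. It is not
    symmetric in its two arguments.
    """
    hist = joint_histogram(seg_a, seg_b)
    best = {}
    for (a, _b), n in hist.items():
        best[a] = max(best.get(a, 0), n)
    return sum(best.values()) / sum(hist.values())
```

Because the score depends only on the joint histogram, two segmentations that differ solely in label numbering, such as `[[0, 0], [1, 1]]` and `[[5, 5], [7, 7]]`, score 1.0.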
It is within the state of the art to adapt the segmentation procedure described herein to operate in hardware at television rates. A functional diagram of such a system is presented, and estimates of the required capabilities are given.