This paper addresses the problem of clustering images of objects seen from different viewpoints. That is, given an unlabelled set of images of n objects, we seek an unsupervised algorithm that can group the images into n disjoint subsets such that each subset only contains images of a single object. We formulate this clustering problem under a very broad geometric framework. The theme is the interplay between the geometry of appearance manifolds and the symmetry of the 2D affine group. Specifically, we identify three important notions for image clustering: the L2 distance metric of the image space, the local linear structure of the appearance manifolds, and the action of the 2D affine group in the image space. Based on these notions, we propose a new image clustering algorithm. In a broad outline, the algorithm uses the metric to determine a neighborhood structure in the image space for each input image. Using local linear structure, comparisons (affinities) between images are computed only among the neighbors. These local comparisons are agglomerated into an affinity matrix, and a spectral clustering algorithm is used to yield the final clustering result. The technical part of the algorithm is to make all of these compatible with the action of the 2D affine group. Using human face images and images from the COIL database, we demonstrate experimentally that our algorithm is effective in clustering images (according to ojbect identity) where there is a large range of pose variation.
|Number of pages||13|
|Journal||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Publication status||Published - 2004|
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science(all)