This paper presents an algorithm for modeling, tracking, and recognizing human faces in video sequences within one integrated framework. Conventional video-based face recognition systems typically consist of two independent components: a tracking module and a recognition module. In contrast, our algorithm tightly couples these two components within a single algorithmic architecture. This is accomplished through a novel appearance model that is shared simultaneously by both modules, despite their disparate requirements and functions. The complex nonlinear appearance manifold of each registered person is partitioned into a collection of submanifolds, each modeling the person's face appearances in nearby poses. Each submanifold is approximated by a low-dimensional linear subspace computed by principal component analysis on images sampled from training video sequences. The connectivity between the submanifolds is modeled as transition probabilities between pairs of submanifolds, and these probabilities are learned directly from training video sequences. The integrated task of tracking and recognition is formulated as a maximum a posteriori estimation problem. Within our framework, the tracking and recognition modules are complementary, and the capability and performance of each is enhanced by the other. Our approach contrasts sharply with more rigid conventional approaches in which the two modules work independently and in sequence. We report on a number of experiments and results that demonstrate the robustness, effectiveness, and stability of our algorithm.
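The pipeline described above can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: it assumes training frames are already vectorized and labeled by pose cluster, approximates each pose submanifold by a PCA subspace, estimates transition probabilities between submanifolds from the training sequence, and performs a simple recursive MAP filter (greedy, rather than a full Viterbi decode) in which the likelihood comes from subspace reconstruction error and the prior from the learned transitions. All function names and the noise parameter `sigma2` are illustrative assumptions.

```python
import numpy as np

def fit_submanifolds(frames, pose_labels, n_components=5):
    """Approximate each pose submanifold by a PCA subspace (mean + basis)."""
    models = {}
    for k in set(pose_labels):
        X = frames[pose_labels == k]           # images of one pose cluster
        mu = X.mean(axis=0)
        # PCA via SVD of the centered data matrix
        _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
        models[int(k)] = (mu, Vt[:n_components])
    return models

def learn_transitions(pose_labels, n_poses):
    """Count pose-to-pose transitions in a training sequence (with smoothing)."""
    T = np.ones((n_poses, n_poses))            # Laplace smoothing
    for a, b in zip(pose_labels[:-1], pose_labels[1:]):
        T[a, b] += 1
    return T / T.sum(axis=1, keepdims=True)    # row-stochastic transition matrix

def residual(x, model):
    """Squared reconstruction error of x against a PCA subspace."""
    mu, basis = model
    d = x - mu
    return np.sum(d ** 2) - np.sum((basis @ d) ** 2)

def map_track(frames, models, T, sigma2=1.0):
    """Greedy MAP filtering over pose submanifolds: Gaussian likelihood from
    subspace residuals, prior propagated through the transition matrix."""
    n = len(models)
    belief = np.full(n, 1.0 / n)               # uniform initial prior
    path = []
    for x in frames:
        like = np.array([np.exp(-residual(x, models[k]) / (2 * sigma2))
                         for k in range(n)])
        belief = like * (T.T @ belief)         # posterior ∝ likelihood × prior
        belief /= belief.sum()
        path.append(int(np.argmax(belief)))
    return path
```

In this sketch, recognition would reuse the same machinery: each registered person owns one set of submanifold models, and the identity posterior integrates the per-person residuals over the sequence, which is how the coupling between tracking and recognition described above can be realized.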