This paper presents a novel method to model and recognize human faces in video sequences. Each registered person is represented by a low-dimensional appearance manifold in the ambient image space. The complex nonlinear appearance manifold is expressed as a collection of subsets (named pose manifolds) and the connectivity among them. Each pose manifold is approximated by an affine plane. To construct this representation, exemplars are sampled from videos and clustered with a K-means algorithm; each cluster is represented as a plane computed through principal component analysis (PCA). The connectivity between the pose manifolds encodes the transition probability between images in each of the pose manifolds and is learned from training video sequences. A maximum a posteriori formulation is presented for face recognition in test video sequences by integrating the likelihood that the input image comes from a particular pose manifold and the transition probability to this pose manifold from the previous frame. To recognize faces with partial occlusion, we introduce a weight mask into the process. Extensive experiments demonstrate that the proposed algorithm outperforms existing frame-based face recognition methods with temporal voting schemes.
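The way the likelihood and the transition probability combine can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each pose manifold is already available as a mean vector plus an orthonormal basis (for example from K-means followed by PCA), it assumes an isotropic Gaussian likelihood on the distance to the plane, and it tracks only the pose manifold within one person's model. The function names, the `sigma` parameter, and the `smoothing` constant are hypothetical, and the occlusion weight mask is not covered.

```python
import math

def distance_to_plane(x, mean, basis):
    """Euclidean distance from x to the affine plane mean + span(basis).

    Assumes `basis` is a list of orthonormal vectors, such as the
    leading PCA directions of one cluster of exemplars.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    residual_sq = sum(c * c for c in centered)
    for b in basis:
        coeff = sum(c * bi for c, bi in zip(centered, b))
        residual_sq -= coeff * coeff  # remove the in-plane component
    return math.sqrt(max(residual_sq, 0.0))

def learn_transitions(labels, n_poses, smoothing=1e-3):
    """Estimate pose-to-pose transition probabilities from the cluster
    labels of consecutive training frames (row-normalised counts).

    `smoothing` is an assumed small constant that avoids zero rows.
    """
    counts = [[smoothing] * n_poses for _ in range(n_poses)]
    for prev, cur in zip(labels, labels[1:]):
        counts[prev][cur] += 1.0
    return [[c / sum(row) for c in row] for row in counts]

def update_posterior(prior, x, planes, transitions, sigma=1.0):
    """One recursive step: propagate the previous posterior through the
    transition matrix, weight by the image likelihood, and normalise.

    `planes` is a list of (mean, basis) pairs, one per pose manifold.
    """
    n = len(planes)
    predicted = [sum(prior[i] * transitions[i][j] for i in range(n))
                 for j in range(n)]
    likelihood = [
        math.exp(-distance_to_plane(x, mean, basis) ** 2 / (2 * sigma ** 2))
        for mean, basis in planes
    ]
    unnormalised = [p * l for p, l in zip(predicted, likelihood)]
    total = sum(unnormalised)
    if total == 0.0:  # all likelihoods underflowed; keep the prediction
        return predicted
    return [u / total for u in unnormalised]
```

Calling `update_posterior` once per frame, with the previous output as the next `prior`, gives a running estimate of the most probable pose manifold. Under the same assumptions, an identity decision could compare such scores across the models of all registered persons.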
|Journal||Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition|
|Publication status||Published - 2003 Sep 1|
|Event||2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Madison, WI, United States|
|Duration||2003 Jun 18 → 2003 Jun 20|
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition