Probabilistic Expression Recognition on Manifolds
Ya Chang, Changbo Hu, Matthew Turk
We present a probabilistic video-to-video facial expression recognition approach based on the manifold of facial expression. The manifold of facial expression lies in the high dimensional image space. In a low dimensional embedded space, images with different expressions can be clustered and classified by a probabilistic model learned on the manifold of expression.
We propose the concept of Manifold of Facial Expression based on the observation that the images of a subject’s facial expressions define a smooth manifold in the high dimensional image space, as illustrated in Fig. 2. Such a manifold representation provides a unified framework for facial expression analysis. To learn the structure of the manifold in the image space, we investigated two types of embeddings from a high dimensional space to a low dimensional space: locally linear embedding (LLE) and Lipschitz embedding. To reduce variation due to scale, illumination conditions, and face pose, we first apply Active Wavelet Networks to the image sequences for face registration and facial feature localization. Typical facial feature localization results for different expressions are shown in Fig. 3.
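The LLE step can be sketched with off-the-shelf tools. The snippet below is a minimal illustration, not the authors' implementation: the image resolution and the neighborhood size are assumptions, and random vectors stand in for the registered face images (the count of 478 matches Fig. 4).

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

# Stand-in for registered face images: 478 images at an assumed 40x40
# resolution, each flattened into a vector in the image space.
rng = np.random.default_rng(0)
images = rng.random((478, 40 * 40))

# Embed into 2-D with LLE, as in the visualization of Fig. 4.
# n_neighbors=12 is a hypothetical choice; the paper does not state it.
lle = LocallyLinearEmbedding(n_components=2, n_neighbors=12)
coords = lle.fit_transform(images)
print(coords.shape)  # (478, 2): one 2-D coordinate per image
```

On real expression sequences, plotting `coords` colored by expression label reproduces the kind of visualization shown in Fig. 4.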
Our experiments show that LLE is suitable for visualizing expression manifolds (Fig. 4). After applying Lipschitz embedding, the expression manifold can be approximately treated as a super-spherical surface in the embedding space (Figs. 1 and 5). The training image sequences are represented as “paths” on the expression manifold. The likelihood of one kind of facial expression is modeled as a mixture density with exemplars as mixture centers. After training the probabilistic model on the manifold of facial expression, we can perform facial expression recognition and prediction on a probe set with few restrictions: a sequence need not begin or end with a neutral expression, and transitions between expressions need not pass through a neutral expression.
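The exemplar-centered mixture likelihood can be illustrated with a toy sketch. This is an assumption-laden simplification of the model described above: isotropic Gaussian kernels with equal mixture weights, a hand-picked bandwidth, and invented 2-D exemplar coordinates for two expression classes.

```python
import numpy as np

# Each expression class keeps a set of exemplar points in the embedded
# space; the class likelihood of a probe point is a mixture density with
# those exemplars as mixture centers (here: equal-weight Gaussian kernels).
def class_likelihood(x, exemplars, sigma=0.5):
    """p(x | class) as an equally weighted Gaussian mixture over exemplars."""
    d2 = np.sum((exemplars - x) ** 2, axis=1)
    kernels = np.exp(-d2 / (2 * sigma ** 2))
    norm = (2 * np.pi * sigma ** 2) ** (exemplars.shape[1] / 2)
    return kernels.mean() / norm

# Toy 2-D embedding coordinates (illustrative only, not from the paper).
happy = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, -0.1]])
surprise = np.array([[3.0, 3.0], [3.1, 2.9]])
probe = np.array([0.1, 0.0])

scores = {"happy": class_likelihood(probe, happy),
          "surprise": class_likelihood(probe, surprise)}
print(max(scores, key=scores.get))  # → happy
```

Classifying a whole probe sequence amounts to accumulating such likelihoods along its path on the manifold, which is what removes the need for neutral-expression boundaries.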
Fig. 2: From “The Manifold Ways of Perception,” H. S. Seung and D. D. Lee, Science, 2000.
Fig. 3: Sample images with localized feature points.
Fig. 4: The first two LLE coordinates of 478 images, with representative images shown.
For manifolds derived from different subjects, we propose a nonlinear alignment algorithm that preserves the semantic correspondence of facial expressions across subjects on one generalized manifold. We also show that nonlinear alignment outperforms linear alignment in expression classification.
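The advantage of nonlinear over linear alignment can be demonstrated on synthetic data. The sketch below is a hypothetical stand-in for the paper's algorithm: given corresponding exemplar pairs from two subjects' embeddings, it fits (a) a least-squares affine map and (b) a Gaussian radial-basis-function map from one manifold to the other, and compares fit error. The deformation, kernel width, and ridge term are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.uniform(-1, 1, (40, 2))        # exemplars on subject A's manifold
B = np.sin(A) + 0.3 * A ** 2           # toy nonlinear deformation: subject B

# (a) Linear alignment: least-squares affine map A -> B.
A1 = np.hstack([A, np.ones((40, 1))])
W, *_ = np.linalg.lstsq(A1, B, rcond=None)
err_lin = np.linalg.norm(A1 @ W - B)

# (b) Nonlinear alignment: Gaussian RBF features with a small ridge term.
d2 = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
Phi = np.exp(-d2 / 0.5)
C = np.linalg.solve(Phi + 1e-6 * np.eye(40), B)
err_rbf = np.linalg.norm(Phi @ C - B)

print(err_rbf < err_lin)  # the nonlinear map fits the deformation better
```

The same qualitative gap, measured in downstream classification accuracy rather than fit error, is what the comparison in the paper reports.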
Fig. 5: Lipschitz embedding of 1,824 images of the female subject.
Fig. 6: Nonlinear alignment of two manifolds. Circles are from the female subject’s manifold; filled points are from the male subject’s manifold.