Face and hand tracking with skin color segmentation

Haiying Guan, Matthew Turk

Skin Color Segmentation


This project detects and tracks the face and hands by applying clustering techniques to skin color segmentation results.


Detecting and tracking the face and hands is important for gesture recognition and human-computer interaction. In this project, we first model the color histogram of each image with a Gaussian Mixture Model (GMM), then apply the Restrict EM algorithm to estimate the model parameters for skin color. Based on the resulting Bayesian segmentation, the mean-shift algorithm is used to track the face and hand locations. The project is also part of a collaborative project with MAT: Interface Device for Interactive Installation.

1.1 Skin Color Segmentation

1.1.1 Skin Color Modelling - Single Gaussian in H-S Space

Using the dataset of skin and non-skin pixels from the Cambridge Research Lab (Compaq), we model skin color in H-S space.

This dataset has 80,306,243 skin pixels and 861,142,189 non-skin pixels. Instead of working in the RGB color space, we convert the pixels to the HSV color space, project along the V direction onto the H-S plane, and obtain the H-S histogram of skin color.
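The conversion and projection described above can be sketched as follows. This is a minimal illustration, not the project's code: the function name `hs_histogram`, the bin count, and the use of Python's standard `colorsys` module are all assumptions made for the example.

```python
import colorsys

def hs_histogram(rgb_pixels, bins=32):
    """Accumulate a bins x bins histogram over (H, S), discarding V.

    rgb_pixels is an iterable of (r, g, b) tuples with values in 0..255.
    The bin count is an illustrative choice, not taken from the project.
    """
    hist = [[0] * bins for _ in range(bins)]
    for r, g, b in rgb_pixels:
        # colorsys works on floats in [0, 1] and returns h, s, v in [0, 1].
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a value of exactly 1.0 falls in the last bin.
        hi = min(int(h * bins), bins - 1)
        si = min(int(s * bins), bins - 1)
        hist[hi][si] += 1
    return hist
```

Dropping V is what "projecting along the V direction" amounts to here: pixels that differ only in brightness land in the same H-S bin.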

From the H-S histogram, we conclude that the skin color distribution clusters mainly in a small area of the chromatic color space. In the experiments, we therefore approximate skin color with a single Gaussian.
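As a rough sketch of what a single-Gaussian skin model in H-S space involves, the code below fits a 2-D mean and covariance to (H, S) samples and evaluates the resulting density. The function names are hypothetical, and the sketch assumes a non-degenerate covariance and ignores the fact that hue is circular, so it should be read as an illustration rather than the project's implementation.

```python
import math

def fit_gaussian_2d(points):
    """Return the mean and covariance of a list of (h, s) samples.

    The covariance is returned as (var_h, cov_hs, var_s).
    """
    n = len(points)
    mh = sum(p[0] for p in points) / n
    ms = sum(p[1] for p in points) / n
    var_h = sum((p[0] - mh) ** 2 for p in points) / n
    var_s = sum((p[1] - ms) ** 2 for p in points) / n
    cov_hs = sum((p[0] - mh) * (p[1] - ms) for p in points) / n
    return (mh, ms), (var_h, cov_hs, var_s)

def gaussian_pdf_2d(x, mean, cov):
    """Evaluate the 2-D Gaussian density at x = (h, s)."""
    var_h, cov_hs, var_s = cov
    det = var_h * var_s - cov_hs * cov_hs  # assumed to be > 0
    dh, ds = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    m2 = (var_s * dh * dh - 2.0 * cov_hs * dh * ds + var_h * ds * ds) / det
    return math.exp(-0.5 * m2) / (2.0 * math.pi * math.sqrt(det))
```

A pixel whose (H, S) value lies near the fitted mean receives a high density, which is the quantity a later classification step can compare against a background model.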

1.1.2 Color Histogram Modelling by Gaussian Mixture Model (GMM)

We model the color histogram of the input image with a GMM.
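For reference, the standard form of a Gaussian mixture density is written out below. The symbols K (number of components), \pi_k (mixing weights), \mu_k and \Sigma_k (component means and covariances) are notation introduced here for illustration; the original text does not state the number of components used.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \sum_{k=1}^{K} \pi_k = 1, \quad \pi_k \ge 0
```

Here x is the color of a pixel, and one of the K components is the one intended to represent skin color.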

1.1.3 Restrict EM Algorithm for Parameter Estimation

The Restrict EM algorithm is partly based on the paper "Segmenting Hands of Arbitrary Color" by Zhu, Yang and Waibel (FG 2000). The main idea is to keep the mean of the Gaussian that models skin color fixed during EM training. Rather than using the prior model P(Hand|x,y), this project segments skin color without any assumption about hand locations or the size of the skin region.
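The idea of holding one mean fixed can be illustrated with a small EM loop. The sketch below is one-dimensional for brevity (the project works on color, not scalars), and the function names, the variance floor, and the iteration count are assumptions made for the example; it is not the authors' implementation.

```python
import math

def normal_pdf(x, mean, var):
    """1-D Gaussian density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def restricted_em(data, means, variances, weights, fixed=0, iters=50):
    """EM for a 1-D Gaussian mixture in which component `fixed` keeps its mean.

    Weights and variances of every component are re-estimated as usual;
    only the mean of the fixed (skin) component is left untouched.
    """
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint) or 1e-300  # guard against underflow
            resp.append([p / total for p in joint])
        # M-step: standard updates, except the fixed mean is skipped.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component has no support; leave it unchanged
            weights[j] = nj / len(data)
            if j != fixed:
                means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor is an illustrative safeguard
    return means, variances, weights
```

In use, the fixed mean would come from the pre-trained skin color model, while the remaining components are free to adapt to the background colors of the current image.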

1.1.4 Segmentation Results

Using Bayes' rule, we segment skin-colored pixels from the background. We tested the segmentation in a range of different situations, and the algorithm worked well.
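A per-pixel Bayesian decision of this kind can be sketched as below. The function names, the prior of 0.5, and the decision threshold of 0.5 are illustrative assumptions; the original text does not specify the values used.

```python
def skin_posterior(p_x_given_skin, p_x_given_bg, prior_skin=0.5):
    """P(skin | x) from the two class likelihoods via Bayes' rule."""
    num = p_x_given_skin * prior_skin
    den = num + p_x_given_bg * (1.0 - prior_skin)
    return num / den if den > 0 else 0.0

def segment(likelihood_pairs, prior_skin=0.5, threshold=0.5):
    """Label each pixel True (skin) or False (background).

    likelihood_pairs holds one (p(x | skin), p(x | background)) tuple
    per pixel, for example evaluated from the fitted mixture components.
    """
    return [skin_posterior(ps, pb, prior_skin) > threshold
            for ps, pb in likelihood_pairs]
```

For instance, a pixel with likelihoods (0.8, 0.2) and equal priors has a skin posterior of about 0.8 and is labeled skin, while one with likelihoods (0.1, 0.9) is labeled background.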

1.2 Face Localization by Mean-Shift Algorithm

We assume that the face area is larger than the area of each hand. After pixel-based segmentation and morphological operations, the mean-shift algorithm is applied to find the face area. All skin-colored pixels within the face area are then removed, so the remaining pixels should contain the skin pixels of the two hands.
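The localization and removal steps can be illustrated with a flat-kernel mean shift over the coordinates of skin pixels. This is a simplified sketch: the circular window, the radius, the starting point, and the function names are assumptions for the example, and the morphological clean-up is not shown.

```python
import math

def mean_shift(points, start, radius, max_iter=100, tol=1e-3):
    """Move a circular window to the local centroid of the points it covers."""
    cx, cy = start
    for _ in range(max_iter):
        inside = [(x, y) for x, y in points
                  if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2]
        if not inside:
            break  # window covers no skin pixels; stop where we are
        nx = sum(x for x, _ in inside) / len(inside)
        ny = sum(y for _, y in inside) / len(inside)
        shift = math.hypot(nx - cx, ny - cy)
        cx, cy = nx, ny
        if shift < tol:
            break  # converged
    return cx, cy

def remove_region(points, center, radius):
    """Drop all points inside the circular region around center."""
    cx, cy = center
    return [(x, y) for x, y in points
            if (x - cx) ** 2 + (y - cy) ** 2 > radius ** 2]
```

Once the window has settled on the largest skin region, `remove_region` leaves only the skin pixels outside it, which under the stated assumption belong to the hands.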

1.3 Hand Clustering by K-means and Tracking by Mean-Shift Algorithm

First, we group the remaining skin-colored pixels into two clusters with the k-means algorithm. Then, using the mean-shift algorithm, we find the locations of the two hands and track them. The results are shown below.
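The clustering step can be sketched with a two-cluster k-means over pixel coordinates. The initialization used here (the first point and the point farthest from it) and the function names are assumptions made for the example; the original text does not say how the clusters were initialized.

```python
def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def centroid(points):
    """Mean position of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def kmeans_two(points, iters=20):
    """Split points into two clusters; returns (centers, clusters)."""
    c0 = points[0]
    c1 = max(points, key=lambda p: dist2(p, c0))  # illustrative initialization
    centers = [c0, c1]
    clusters = [[], []]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        clusters = [[], []]
        for p in points:
            idx = 0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            clusters[idx].append(p)
        # Update step: recompute centers, keeping the old one if a cluster is empty.
        new_centers = [centroid(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers, clusters
```

The two returned centers are natural starting points for two mean-shift windows, one per hand, which can then be updated frame by frame.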

1.4 Interactive Installation

The project is also part of a collaborative project with MAT: Interface Device for Interactive Installation. The user can animate a graphic cylinder (its size, shape, and rotation) using the locations of the face and hands.