Activity recognition using dynamic Bayesian networks


Justin Muncaster, Yunqian Ma


We present a method for classifying activities based on the trajectory of feature measurements. We propose a d-level hierarchically organized graphical model with roots in the hierarchical hidden Markov model. We propose to start with the minimum constraints needed to enforce hierarchy and to add dependencies in a way that keeps the number of model parameters small. We also propose to use deterministic annealing to partition the feature space and automatically discover the low-level states in our model. We apply our algorithm to real-world data to demonstrate its effectiveness. Our tests show qualitatively that our method can recognize activities based on features from a noisy tracker and can robustly classify trajectories that are abnormal with respect to the training data.


1.1 Motivation

As the cost of sensors such as cameras falls, the amount of available data has increased substantially. Frequently, one wishes to classify streams of data based on the activity that is occurring. For example, surveillance applications may want to separate normal activities from abnormal activities for purposes of data retrieval after a crime has occurred. Defining activities usually relies on labeled training data. Labeling training data is a tedious task, and thus a technique that minimizes the amount of labeling would be beneficial for practical use.

1.2 Introduction

In this research we propose a technique that relies on providing a single label for the activity present in a sequence of video, as opposed to labeling the relevant features in the video. Given a trajectory of feature measurements and a label for a high-level event, our system first clusters the features to discover the lower-level events. We suppose that a high-level event is defined by a sequence of low-level events. In a given high-level event, the tracked object will "walk" from one low-level cluster to another. We then compute the probability of a given trajectory through low-level events given a particular high-level event to determine the high-level activity.
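The idea of scoring a walk through low-level clusters under each high-level event can be illustrated with a deliberately simplified sketch. It assumes hard cluster assignments and a flat first-order Markov chain per activity, which is less general than the hierarchical model described in this work; the function names, the two-cluster models, and all probabilities are hypothetical.

```python
import math

def sequence_log_likelihood(states, initial, transition):
    """Log-probability of a low-level state sequence under a first-order Markov chain.

    Assumes every probability that is looked up is strictly positive.
    """
    logp = math.log(initial[states[0]])
    for prev, cur in zip(states, states[1:]):
        logp += math.log(transition[prev][cur])
    return logp

def classify(states, models):
    """Return the activity whose chain assigns the sequence the highest log-likelihood."""
    return max(models, key=lambda name: sequence_log_likelihood(states, *models[name]))

# Hypothetical models: activity -> (initial distribution, transition matrix).
models = {
    "entering": ([0.9, 0.1], [[0.8, 0.2], [0.1, 0.9]]),
    "leaving": ([0.1, 0.9], [[0.9, 0.1], [0.2, 0.8]]),
}

walk = [0, 0, 1, 1, 1]  # cluster index visited at each frame
best = classify(walk, models)
```

With equal priors over activities, picking the largest likelihood is the same as picking the most probable activity; unequal priors would add a log-prior term to each score.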


Figure 1. Clustering of low-level states.


1.3 Our method

During the training phase we use the deterministic annealing technique developed by Rose for clustering. This essentially discretizes the feature space into a finite number of states, where each data point has fuzzy membership in each cluster. Next, we use this result to initialize a hierarchically constrained dynamic Bayesian network akin to Murphy's representation of a hierarchical hidden Markov model. We also constrain the lowest level of the hierarchy to follow a Coxian phase distribution in order to robustly model low-level activities of varying duration.
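As a rough illustration of the clustering step, the following is a minimal sketch of deterministic annealing with a fixed number of clusters. It keeps only the core loop (Gibbs memberships at a temperature, membership-weighted centroid updates, gradual cooling) and omits the cluster-splitting and mass-constrained parts of the full technique. The function names, cooling schedule, and toy data are assumptions made for the example.

```python
import math

def soft_assign(point, centers, temperature):
    """Fuzzy memberships p(j | x) proportional to exp(-||x - y_j||^2 / T)."""
    d2 = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
    nearest = min(d2)  # shift exponents so the largest weight is exp(0)
    weights = [math.exp(-(d - nearest) / temperature) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]

def deterministic_annealing(points, centers, t_start=10.0, t_end=0.01,
                            cooling=0.8, inner_iters=20):
    """Alternate soft assignment and centroid updates while lowering the temperature."""
    centers = [list(c) for c in centers]
    dims = len(points[0])
    t = t_start
    while t > t_end:
        for _ in range(inner_iters):
            memberships = [soft_assign(p, centers, t) for p in points]
            for j in range(len(centers)):
                mass = sum(m[j] for m in memberships)
                if mass > 0.0:
                    centers[j] = [
                        sum(m[j] * p[k] for m, p in zip(memberships, points)) / mass
                        for k in range(dims)
                    ]
        t *= cooling
    # Memberships at the final (low) temperature are nearly hard assignments.
    memberships = [soft_assign(p, centers, t) for p in points]
    return centers, memberships

# Toy 2-D features forming two groups (illustrative data only).
points = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, memberships = deterministic_annealing(points, [(1.0, 1.0), (4.0, 4.0)])
```

Stopping the schedule at a higher final temperature keeps the memberships fuzzy, which is the form used when each data point belongs partially to several clusters.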

Figure 2. Pictorial representation of low-level HMM from clustering. This constitutes one level of the d-level hierarchical dynamic Bayesian network.
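The Coxian duration constraint on the lowest level can be pictured with a small discrete-time sketch. It assumes one common parameterization: phases are visited left to right, and at each step the process stays in the current phase, advances to the next phase, or exits. The parameterization, function name, and numbers are assumptions for illustration and may differ from the form used in the model itself.

```python
def coxian_duration_pmf(stay, advance, max_len):
    """P(duration = d) for d = 1..max_len under a discrete-time Coxian chain.

    In phase i the process stays with probability stay[i], moves to phase i + 1
    with probability advance[i], and otherwise exits, ending the low-level state.
    The last phase cannot advance, so advance[-1] must be 0.
    """
    n = len(stay)
    occupancy = [1.0] + [0.0] * (n - 1)  # always start in phase 0
    pmf = []
    for _ in range(max_len):
        # Probability of exiting at this step, summed over phases.
        pmf.append(sum(occ * (1.0 - stay[i] - advance[i])
                       for i, occ in enumerate(occupancy)))
        nxt = [0.0] * n
        for i, occ in enumerate(occupancy):
            nxt[i] += occ * stay[i]
            if i + 1 < n:
                nxt[i + 1] += occ * advance[i]
        occupancy = nxt
    return pmf

# Hypothetical two-phase example.
pmf = coxian_duration_pmf(stay=[0.5, 0.6], advance=[0.4, 0.0], max_len=200)
```

Compared with a single self-loop, whose duration is geometric, a handful of phases gives a more flexible duration distribution at the cost of only a few extra parameters.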

Finally, we use each feature trajectory in the training data to learn the relevant parameters, i.e., the transition probabilities and observation likelihoods. Because the states are not directly observed, we estimate these parameters using the EM algorithm.
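To make the EM step concrete, here is a minimal sketch of one Baum-Welch iteration for a flat hidden Markov model with discrete observations and a single training sequence. The hierarchical network described above has more structure than this, so the sketch only shows the general shape of the update; all names and numbers are hypothetical, and all probabilities are assumed to be strictly positive.

```python
import math

def forward_backward(obs, pi, A, B):
    """Scaled forward and backward passes; scale[t] = P(obs[t] | obs[:t])."""
    n, T = len(pi), len(obs)
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[i] * B[i][obs[0]] for i in range(n)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                   for j in range(n)]
        c = sum(row)
        scale.append(c)
        alpha.append([v / c for v in row])
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   / scale[t + 1] for i in range(n)]
    return alpha, beta, scale

def em_step(obs, pi, A, B):
    """One EM update; also returns the log-likelihood of the input parameters."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta, scale = forward_backward(obs, pi, A, B)
    # E-step: state posteriors (gamma) and summed transition posteriors (xi).
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    xi = [[0.0] * n for _ in range(n)]
    for t in range(T - 1):
        for i in range(n):
            for j in range(n):
                xi[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                             * beta[t + 1][j] / scale[t + 1])
    # M-step: normalized expected counts.
    new_pi = list(gamma[0])
    new_A = [[xi[i][j] / sum(xi[i]) for j in range(n)] for i in range(n)]
    occupancy = [sum(gamma[t][i] for t in range(T)) for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) / occupancy[i]
              for k in range(m)] for i in range(n)]
    return new_pi, new_A, new_B, sum(math.log(c) for c in scale)

# Hypothetical starting point, e.g. derived from the clustering result.
obs = [0, 0, 1, 1, 0, 1, 1, 1, 0, 0]
params = ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.8, 0.2], [0.3, 0.7]])
logliks = []
for _ in range(5):
    *params, loglik = em_step(obs, *params)
    logliks.append(loglik)
```

The recorded log-likelihoods should not decrease from one iteration to the next, which is a convenient sanity check on any EM implementation.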

1.4 Experimental Results

We tested our algorithm using video clips of a shopping center in Portugal that we found in [18]. We identify three high-level activities: entering the shop, leaving the shop, and passing the shop. For each video we also ran a multiple hypothesis tracker to track people in the image plane, providing us with features. We randomly set aside 8 tracks for training (3 entering, 3 leaving, 2 passing) and 6 tracks for testing. We labeled each frame in the training data with the appropriate high-level event.





Figure 3. Results for entering the shop (a), leaving the shop (b), and passing the shop (c). The x-axis is the frame number and the y-axis shows the probability distribution over the three types of events given all of the measurements up until that frame number. In (d) we show the probability distribution over low-level events for the passing trajectory. Changes in low-level activity correspond to jumps in probability for the high-level activity.
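The per-frame distribution over events described in the caption can be sketched as a filtering recursion. The sketch below assumes one flat hidden Markov model per high-level event and a prior over events, and updates P(event | measurements so far) frame by frame; the model structure, names, and probabilities are hypothetical and simpler than the hierarchical network used in this work. It also assumes every observation has nonzero probability under every model.

```python
def filtered_event_posterior(obs, models, prior):
    """Return one dict per frame t holding P(event | obs[0..t]) for each event."""
    alphas = {}              # normalized forward vector per event
    weights = dict(prior)    # proportional to prior * P(obs so far | event)
    history = []
    for t, o in enumerate(obs):
        for name, (pi, A, B) in models.items():
            n = len(pi)
            if t == 0:
                row = [pi[i] * B[i][o] for i in range(n)]
            else:
                prev = alphas[name]
                row = [sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                       for j in range(n)]
            c = sum(row)  # P(o_t | earlier observations, event)
            alphas[name] = [v / c for v in row]
            weights[name] *= c
        total = sum(weights.values())
        weights = {name: w / total for name, w in weights.items()}
        history.append(dict(weights))
    return history

# Hypothetical models: event -> (initial distribution, transitions, emissions).
emissions = [[0.9, 0.1], [0.2, 0.8]]
models = {
    "entering": ([0.9, 0.1], [[0.8, 0.2], [0.1, 0.9]], emissions),
    "leaving": ([0.1, 0.9], [[0.9, 0.1], [0.2, 0.8]], emissions),
}
prior = {"entering": 0.5, "leaving": 0.5}
history = filtered_event_posterior([0, 0, 0, 1, 1, 1], models, prior)
```

Plotting each event's entry of `history` against the frame index gives curves of the kind shown in panels (a) to (c).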

The results in Figure 3 are qualitative in nature but suggest that our technique recognizes activities well. We have also passed abnormal trajectories through this model and obtained good classifications of the activities in the video. In future work we plan to obtain quantitative results and to examine how to explicitly distinguish abnormal activity from normal activity.


J. Muncaster and Y. Ma, Activity recognition using dynamic Bayesian networks with automatic state selection, submitted to Workshop on Motion and Video Computing, Austin, TX, 2007.

J. Muncaster, Classification of abnormal activities in video, Graduate Student Research Conference, UCSB, 2006.