Evaluating and Improving Visual Tracking
Steffen Gauglitz • Tobias Höllerer • Matthew Turk
Visual tracking is a core component for a variety of applications, such as visual navigation of autonomous vehicles and Augmented Reality. We are working on evaluating and improving different algorithms needed for visual tracking.
Datasets & Ground Truth
To evaluate tracking methodically, we need large datasets of relevant data along with ground-truth information against which candidate algorithms can be measured.
While there are many existing image/video datasets, most of them have limited validity for visual tracking (e.g., single images instead of videos, or no motion blur). The first part of our work was therefore to design an appropriate setup to collect this data. Our setup was first presented at ISMAR 2009.
Currently, our dataset consists of 96 video streams featuring 6 different textures and 16 different camera paths, with a total of 6889 frames, including ground-truth information for each frame. We separated the motions (e.g., rotation only, then zoom only, etc.) to allow for a more detailed analysis: we want to answer not only how often tracking breaks, but also why, and under which conditions, it breaks. Our dataset has been published as part of our IJCV article and is available for download.
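With per-frame ground truth, a tracker's output can be scored frame by frame. As a minimal sketch of one possible metric (the function name and the choice of metric are illustrative, not taken from our papers), the following compares an estimated planar homography against the ground-truth one by warping the four frame corners with each and averaging the corner displacement:

```python
import numpy as np

def corner_error(H_est, H_gt, w, h):
    """Mean distance between the four frame corners warped by the
    estimated and the ground-truth homography (illustrative metric)."""
    # Homogeneous coordinates of the four frame corners, shape (3, 4).
    corners = np.array([[0, 0, 1], [w, 0, 1], [w, h, 1], [0, h, 1]],
                       dtype=float).T

    def warp(H):
        p = H @ corners
        return (p[:2] / p[2]).T  # de-homogenize -> (4, 2) points

    return float(np.linalg.norm(warp(H_est) - warp(H_gt), axis=1).mean())

# Identical homographies yield zero error:
print(corner_error(np.eye(3), np.eye(3), 640, 480))  # 0.0
```

Computing such an error per frame, separately for each motion type, is what makes the "why does it break?" analysis possible.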
Testing Existing Algorithms
With the dataset described above, we can evaluate a variety of existing tracking algorithms and individual components of these algorithms. In particular, we extensively evaluated and analyzed keypoint detectors and feature descriptors.
We evaluated 6 popular keypoint detectors, 5 popular feature descriptors/classifiers, and finally all 30 detector–descriptor combinations. This work has been published in the International Journal of Computer Vision.
The results can be used to analyze strengths & weaknesses of these algorithms, choose an appropriate algorithm for a particular application, or stimulate ideas for improvements (cf. next section).
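At the core of any such detector–descriptor evaluation is descriptor matching between frames. As a self-contained sketch (pure NumPy, independent of any particular detector or descriptor from our study), the following applies the standard nearest-neighbor matching with Lowe's ratio test, which rejects ambiguous matches:

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """For each descriptor in desc_a, find its two nearest neighbors in
    desc_b (Euclidean distance) and keep the match only if the best
    distance is clearly better than the runner-up (Lowe's ratio test).
    Returns a list of (index_in_a, index_in_b) pairs."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j, k = np.argsort(dists)[:2]   # nearest and second-nearest
        if dists[j] < ratio * dists[k]:
            matches.append((i, int(j)))
    return matches
```

Counting how many such matches survive, and how many of them agree with the ground-truth motion, yields per-frame precision/recall figures from which detector and descriptor strengths can be compared.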
Improving Spatial Distribution of Keypoints
We show that improving the spatial distribution of keypoints increases the robustness of visual tracking, and we propose a novel algorithm that can compute such a distribution significantly faster than previous methods.
Keypoints have to fulfill two conflicting criteria: they have to be "strong features" (e.g., corners with high contrast), but they should also be well-distributed across the image, since this improves the robustness of various algorithms that make use of them. Our algorithm efficiently selects, from a larger set of detected points, a subset that fulfills both criteria, and it does so significantly faster than existing methods – in particular, in O(n log n) time instead of O(n^2) time.
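To make the selection problem concrete, here is a much simpler grid-bucketing baseline – not the O(n log n) suppression algorithm from our paper – that illustrates the trade-off: it keeps only the strongest keypoints per grid cell, so weak-but-isolated points survive while dense clusters of strong points are thinned out:

```python
def select_distributed(keypoints, grid, per_cell):
    """Greedy grid bucketing (illustrative baseline, not the paper's
    algorithm): keep at most `per_cell` strongest keypoints per cell.
    `keypoints` is a list of (x, y, strength) tuples;
    `grid` is (cols, rows, image_width, image_height)."""
    cols, rows, w, h = grid
    cells = {}
    # Visit keypoints strongest-first so each cell keeps its best points.
    for x, y, s in sorted(keypoints, key=lambda p: -p[2]):
        cell = (min(int(x * cols / w), cols - 1),
                min(int(y * rows / h), rows - 1))
        bucket = cells.setdefault(cell, [])
        if len(bucket) < per_cell:
            bucket.append((x, y, s))
    return [kp for bucket in cells.values() for kp in bucket]
```

A fixed grid like this is fast but insensitive to the local geometry of the point set; adaptive suppression methods avoid that limitation at higher cost, which is the gap our O(n log n) algorithm addresses.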
This work has been accepted to and will be presented at ICIP 2011.
Improving Keypoint Orientation Assignments
Keypoint detection, description and matching has proven to be a powerful paradigm for a variety of applications in computer vision. In many frameworks, this paradigm includes an orientation assignment step to make the overall process invariant to in-plane rotation. While this approach seems to work well and is widely accepted, the orientation assignment is frequently presented as a mere “add-on” to a descriptor, and little work has been devoted to the orientation assignment algorithms themselves.
In a paper presented at BMVC 2011, we proposed two novel, very efficient algorithms for orientation assignment (one for the case in which a single dominant orientation is sought, and one capable of detecting multiple dominant orientations), and presented a detailed quantitative evaluation and analysis of our two algorithms as well as four competing algorithms. Our results entail observations about the orientation assignment problem in general as well as observations about individual algorithms.
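For readers unfamiliar with the orientation assignment step, the classic approach (a SIFT-style gradient-orientation histogram, shown here as a sketch – this is the widely used baseline, not one of the algorithms from our paper) accumulates gradient magnitudes into angular bins and returns every bin close to the global peak, which naturally yields one or several dominant orientations:

```python
import math

def dominant_orientations(gradients, bins=36, second_peak=0.8):
    """SIFT-style orientation assignment (illustrative baseline):
    accumulate gradient magnitudes into `bins` angular bins and return
    the center angle of every bin whose mass is within `second_peak`
    of the maximum. `gradients` is a list of (dx, dy) pairs sampled
    from the patch around a keypoint."""
    hist = [0.0] * bins
    for dx, dy in gradients:
        angle = math.atan2(dy, dx) % (2 * math.pi)
        mag = math.hypot(dx, dy)
        hist[int(angle / (2 * math.pi) * bins) % bins] += mag
    peak = max(hist)
    # Bin centers of all bins close to the peak (may be more than one).
    return [(b + 0.5) * 2 * math.pi / bins
            for b, v in enumerate(hist) if v >= second_peak * peak]
```

Rotating the descriptor's sampling pattern by each returned angle is what makes the subsequent matching invariant to in-plane rotation.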