Live Tracking and Mapping from Both General and Rotation-Only Camera Motion
Steffen Gauglitz • Chris Sweeney • Jonathan Ventura • Matthew Turk • Tobias Höllerer
Over the past decade, there has been a tremendous amount of work on real-time monocular vision-based tracking and mapping (T&M) systems, that is, systems that simultaneously determine the position and/or orientation of the camera with respect to a previously unknown environment and create a model of this environment. Aside from other applications, T&M is an important enabling technology for Augmented Reality (AR) in unprepared environments.
An important characteristic of a T&M system is the type of camera motion and the geometry of the environment that it supports. Simultaneous Localization and Mapping (SLAM) systems can deal with environments of arbitrary geometry and any camera motion that induces parallax (referred to as general camera motions). However, with few exceptions, they do not support rotation-only camera motion: Since SLAM systems are designed primarily to handle a traveling camera, their mapping is, intrinsically, built upon triangulation of features. Therefore, most SLAM systems need to be initialized with a distinct “traveling” movement of the camera for each newly observed part of the environment. This restriction is a major limitation for their use in AR, where the model building is assumed to be done in the background and ideally transparent to the user, who should not be required to move a certain way in order to make the system work. Moreover, rotation-only “looking around” is a very natural motion and may occur in many AR applications.
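The reason rotation-only motion defeats triangulation can be checked numerically. The following is a minimal sketch, not part of the system described here: it places two points at different depths along the same viewing ray of a first camera and measures how far apart they land in a second view. The helper names (`project`, `rotation_y`, `parallax`) and the intrinsics in `K` are illustrative assumptions.

```python
import numpy as np

def project(K, R, t, X):
    """Project 3D points X (N x 3, first-camera frame) into a second view with pose [R | t]."""
    x_cam = X @ R.T + t
    x_img = x_cam @ K.T
    return x_img[:, :2] / x_img[:, 2:3]

def rotation_y(angle_rad):
    """Rotation matrix about the camera's y axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def parallax(K, R, t):
    """Pixel distance, in the second view, between two points that share
    one viewing ray of the first view but lie at depths 2 and 10."""
    ray = np.array([0.1, -0.05, 1.0])
    X = np.stack([2.0 * ray, 10.0 * ray])
    p = project(K, R, t, X)
    return float(np.linalg.norm(p[0] - p[1]))

# Assumed pinhole intrinsics (arbitrary illustrative values).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = rotation_y(np.deg2rad(10.0))

rotation_only = parallax(K, R, np.zeros(3))            # camera only rotates
general = parallax(K, R, np.array([0.3, 0.0, 0.0]))    # camera also translates
```

With rotation only, both points project to the same pixel (up to floating-point error), so the second view carries no information about depth. Once the camera also translates, the two projections separate, and that separation is the parallax that triangulation relies on.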
We present an approach to real-time tracking and mapping that supports any type of camera motion in 3D environments, that is, general (parallax-inducing) as well as rotation-only (degenerate) motions. Our approach effectively generalizes both a panorama mapping and tracking system and a keyframe-based SLAM system, behaving like one or the other depending on the camera movement. It switches seamlessly between the two modes, so it can track and map through arbitrary sequences of general and rotation-only camera movements and create combinations of 3D structure and panoramic maps, fully automatically and in real time, as shown in the example above.
Below are results for three different input video sequences (the input video is shown in the top left corner). In each case, the algorithm automatically determines which kind of model (3D structure or panorama) is more suitable.
Video 1: combination of parallax-inducing and rotation-only movement
Video 2: rotation-only movement
In this case, the algorithm loses track a few times due to quick and jerky camera movement, but immediately starts 'stitching' a new partial panorama, and automatically combines the pieces later (see paper for details).
Video 3: parallax-inducing movement