Evaluating Display Types for AR Selection and Annotation

Jason Wither, Stephen DiVerdi, Tobias Höllerer


Traditionally, augmented reality (AR) applications have primarily used head mounted displays (HMDs) for visual output. Most designers want the AR experience to be a seamless and mode-less integration between real and virtual worlds. Many argue that this can best be facilitated using HMDs, which augment the user’s visual field directly and persistently. No further user-controlled tools are necessary to visualize the virtual layer. This is particularly useful for hands-free operation and interaction with dynamic virtual content.

Recently, there have also been an increasing number of AR applications that use hand held displays such as ultra-mobile computers, PDAs, or cell phones as the primary display. The reason most often cited for using hand held displays is user acceptance. Proponents of this technology claim that for AR to be broadly adopted in the short term, it will need to run on devices similar to those that are already ubiquitous.

Little is known about the relationship between display type and user performance in AR. In this project, we describe a user study that examines how display choice affects selection and annotation in an AR environment. These general tasks can be broken into two conceptual parts. First, the user must search for and locate the object they wish to select or annotate, and second, they must move some sort of cursor or selection device to that location. One difference between selection and annotation is that placing an annotation often requires assigning a distance as well as a direction vector. In the past, we have looked at techniques for creating annotations at a distance, including techniques for determining the distance to the object being annotated, but for this study we are only concerned with selection or annotation on the image plane, and assume that there is sufficient world model information for finding the 3D location of objects using techniques like ray casting.
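As a minimal sketch of how image-plane selection can be resolved to a 3D object via ray casting: the 2D cursor position is unprojected through an assumed pinhole camera into a viewing ray, which is then intersected with proxy geometry from the world model (here, bounding spheres standing in for objects). This is an illustration under those assumptions, not the implementation used in the study.

```python
import numpy as np

def screen_point_to_ray(u, v, fx, fy, cx, cy):
    """Unproject a 2D image-plane point into a unit viewing ray in camera
    space. Assumes a pinhole camera: fx, fy are focal lengths in pixels,
    (cx, cy) is the principal point."""
    d = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    return d / np.linalg.norm(d)

def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance along the ray (assumed unit-length direction) to
    the nearest intersection with a bounding sphere, or None on a miss.
    Spheres are a stand-in for objects in the world model."""
    oc = origin - center
    b = 2.0 * np.dot(direction, oc)
    c = np.dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0:
        return None  # ray misses the sphere entirely
    t = (-b - np.sqrt(disc)) / 2.0
    return t if t >= 0 else None  # reject intersections behind the camera

# Example: a cursor at the principal point looks straight down the camera's
# z-axis and hits an object 5 m away with a 1 m bounding sphere at t = 4 m.
ray = screen_point_to_ray(320, 240, fx=500, fy=500, cx=320, cy=240)
t = ray_sphere_hit(np.zeros(3), ray, np.array([0.0, 0.0, 5.0]), 1.0)
```

Selecting among several candidate objects would then reduce to keeping the hit with the smallest positive t.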

This project compares two different tasks, selection and visual search, among three representative display types. The selection task tested how easily people could move the display to center it on target objects. For the search task, we had users look for items among both real and virtual objects in order to determine whether display choice has different effects depending on the amount of virtual content present in the AR scene. The first display device we used was a head mounted display, and the second was a hand held display, which we used in two different configurations. In the first configuration, users held the hand held display at approximately waist height and looked down at it, as they would with a tablet computer, with the camera pointed directly forward. In the second configuration, users held the hand held display like a magic lens, with the camera pointing behind the display; they held it up at head height and looked "through" it.

In our original study we found several significant results. For certain annotation and selection tasks, a magic lens may be more suitable than an HMD: it was faster for the cursor movement portion of the study and performed no worse than the other displays during the visual search portion. For the visual search portion, we were surprised to find no significant difference in task performance among the displays, in either the virtual or the real case. In spite of that, users had strong feelings that they had performed better with particular displays. They favored the hand held displays, which they could look away from, when searching for real objects, and the HMD when searching for virtual objects. These results suggest that other factors caused the task performance to be so similar. Perhaps the users' high task attention overwhelmed the smaller visual differences among displays.

Ongoing Work

We are planning a follow-up study for this project to further clarify our results. We hope that including a secondary distraction task will break users' focus on the main task, simulating a user who is switching between different tasks within an application. We also plan to require users to conduct the visual search task over a larger area, making it necessary for them to move the display device far more than they did in the original study.


Jason Wither, Stephen DiVerdi, and Tobias Höllerer
Evaluating Display Types for AR Selection and Annotation
In Proc. International Symposium on Mixed and Augmented Reality, November 2007