Multiflash Imaging and Applications
Daniel Vaquero, Rogerio Feris, Matthew Turk, Ramesh Raskar
UCSB Four Eyes Lab, in collaboration with Mitsubishi Electric Research Labs and IBM Research
Overview
We propose a novel imaging framework, based on a camera with multiple flashes strategically positioned to cast shadows along depth discontinuities. This simple and inexpensive modification of the capture setup allows for efficient and accurate shape extraction. The usefulness of our techniques is then demonstrated in a variety of applications in computer vision, graphics, digital photography, and human-computer interaction. Examples include methods based on depth edges for non-photorealistic rendering, hand gesture recognition, improving 3D reconstruction, specular reflection reduction, and medical imaging.
Motivation
Sharp discontinuities in depth, or depth edges, are directly related to the 3D scene geometry and provide extremely important low-level features for image understanding, since they tend to outline the boundaries of objects in the scene. In fact, they comprise one of the four components in the well-known 2 1/2-D sketch of Marr's computational vision model. Reliable detection of depth edges clearly facilitates segmentation, establishes depth-order relations, and provides valuable features for visual recognition, tracking, and 3D reconstruction. It can also be used for camera control (to help reveal new surfaces) and for non-photorealistic rendering.
Traditional edge detection techniques based on intensities (such as the Canny edge detector) are limited in their ability to reveal scene structure. Low-contrast regions often pose difficulties to these methods, and it is hard to distinguish between discontinuities due to textured regions (texture edges) and depth edges. For example, consider a scene containing a hand in front of a flat background with an aerial photograph printed on it. The Canny edge operator responds to the texture discontinuities in the printed photograph, while a result containing only the depth edges is much more informative with respect to the geometry of the scene.
An immediate way to detect depth discontinuities is to obtain a complete 3D map of the scene, and then search for sharp discontinuities in the depth values. However, the majority of 3D reconstruction methods produce inaccurate results near depth discontinuities, due to occlusions and the violation of smoothness constraints. Recently, steady progress has been made in discontinuity-preserving stereo matching, mainly with global optimization algorithms based on belief propagation or graph cuts. However, these methods fail to capture depth edges associated with sufficiently small changes in depth. Moreover, obtaining clean, non-jagged contours along shape boundaries is still a challenging problem even for methods that rely on more expensive hardware.
In this work, we propose a technique that directly detects depth edges, bypassing the 3D reconstruction step. The method is based on a camera with multiple flashes, strategically positioned to cast shadows along depth discontinuities. The technique combines the shadow information from multiple pictures taken using flashes at different positions to compute the depth edges.
Depth Edge Detection
We now illustrate the basic idea of the depth edge detection algorithm. Suppose that we have a camera with four flashes (above, below, left, right). Four pictures are taken, each under the illumination of a single flash. Notice that the shadows change from picture to picture. If the flash is to the left, the shadows are cast to the right of the objects; if the flash is above the camera, the shadows are below the objects, and so on. We then compute a max image by taking the pixelwise maximum of the four captured images, which approximates a shadow-free image of the scene.
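The max-image step can be sketched in a few lines of NumPy. This is only a minimal illustration, not the implementation used in this work: the function name `max_image` is hypothetical, and the sketch assumes the captures are already aligned grayscale arrays of equal size.

```python
import numpy as np

def max_image(flash_images):
    """Pixelwise maximum over aligned, same-sized grayscale images.

    Hypothetical helper for illustration; a pixel that is shadowed under
    one flash is usually lit under another, so the maximum approximates
    a shadow-free image.
    """
    stack = np.stack(
        [np.asarray(im, dtype=np.float64) for im in flash_images], axis=0
    )
    return stack.max(axis=0)

# Toy 1x3 "scene": each of the first three shots has one shadowed pixel.
shots = [
    np.array([[0.9, 0.1, 0.8]]),
    np.array([[0.2, 0.9, 0.8]]),
    np.array([[0.9, 0.9, 0.1]]),
    np.array([[0.9, 0.9, 0.8]]),
]
shadow_free = max_image(shots)
```

In this toy input every pixel is lit in at least one shot, so `shadow_free` contains no shadowed values. With real photographs the result is only an approximation, as stated above.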
The next step consists of dividing each of the captured images by the max image, resulting in four ratio images. The shadow regions are accentuated in these images, allowing for accurate detection. The light epipolar rays are then traversed in each of the ratio images, and large jumps in intensity are marked as depth edges. For example, if the flash is to the left, the image is traversed from left to right; if the flash is below, the image is traversed from the bottom to the top. The results for each of the images are combined to form the final result.
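The ratio-image and traversal steps might be sketched as follows. Several things here are assumptions made for illustration rather than details taken from this page: the names `depth_edges`, `STEP`, `threshold`, and `eps` are hypothetical; the traversal is simplified to axis-aligned image rows and columns; and a "large jump" is interpreted as a sharp drop from a lit pixel to a shadowed one along the traversal direction.

```python
import numpy as np

# Traversal step (row, column) per flash position. Axis-aligned traversal
# is a simplifying assumption made for this sketch.
STEP = {
    "left": (0, 1),     # flash on the left: traverse left to right
    "right": (0, -1),   # flash on the right: traverse right to left
    "top": (1, 0),      # flash above: traverse top to bottom
    "bottom": (-1, 0),  # flash below: traverse bottom to top
}

def depth_edges(images, threshold=0.5, eps=1e-6):
    """images: dict mapping a flash position to an aligned grayscale array."""
    arrays = {k: np.asarray(v, dtype=np.float64) for k, v in images.items()}
    i_max = np.stack(list(arrays.values()), axis=0).max(axis=0)  # max image
    edges = np.zeros(i_max.shape, dtype=bool)
    for pos, img in arrays.items():
        ratio = img / (i_max + eps)  # shadows become dark, lit areas near 1
        dr, dc = STEP[pos]
        # prev holds the pixel visited just before each pixel in the traversal.
        prev = np.roll(ratio, shift=(dr, dc), axis=(0, 1))
        drop = prev - ratio  # positive where intensity falls (lit -> shadow)
        # Zero the border row/column that np.roll wrapped around.
        if dr == 1:
            drop[0, :] = 0
        if dr == -1:
            drop[-1, :] = 0
        if dc == 1:
            drop[:, 0] = 0
        if dc == -1:
            drop[:, -1] = 0
        edges |= drop > threshold  # combine the results of all flashes
    return edges

# Toy scene: an object at rows 1-3, columns 2-4 casts a one-pixel shadow
# on the side opposite to each flash.
toy = {k: np.ones((5, 7)) for k in STEP}
toy["left"][1:4, 5] = 0.1    # shadow to the right of the object
toy["right"][1:4, 1] = 0.1   # shadow to the left of the object
toy["top"][4, 2:5] = 0.1     # shadow below the object
toy["bottom"][0, 2:5] = 0.1  # shadow above the object
edges = depth_edges(toy)
```

In this sketch the marked pixels are the first shadowed pixels met along each traversal, so the detected contour hugs the object boundary and texture inside the object is ignored. The fixed `threshold` is an arbitrary choice for the toy data.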
Here is the result of applying our technique to a scene containing a bone. Notice the high accuracy in the detection of depth edges, even in the presence of self-occlusions.
Applications
Non-photorealistic rendering
(the images on the left are from the real scenes, and their corresponding artistic renderings are on the right)
Medical imaging through a modified laparoscope
Recognition of hand gestures
(notice the difference between the Canny edges, in the middle, and the edges obtained with our technique, on the right)
Discontinuity Preserving Stereo
Specular Reflection Reduction
(on the left: scene with specularities; on the right: our result)
Ongoing Research
Our recent work has focused on acquiring a better theoretical understanding of the possibilities and limitations of multiflash imaging, and on addressing the problem of depth edge detection in dynamic scenes.
Publications
D. A. Vaquero, R. S. Feris, M. Turk, and R. Raskar.
R. S. Feris, R. Raskar, L. Chen, K. Tan and M. Turk.
R. S. Feris, R. Raskar and M. Turk.
R. S. Feris.
R. S. Feris, M. Turk, R. Raskar and K. Tan.
R. S. Feris, M. Turk, R. Raskar, K. Tan and G. Ohashi.
R. S. Feris, L. Chen, M. Turk, R. Raskar and K. Tan.
R. Raskar, K. Tan, R. S. Feris, J. Kobler, J. Yu and M. Turk.
R. S. Feris, R. Raskar, K. Tan and M. Turk.
R. S. Feris, M. Turk, R. Raskar, K. Tan and G. Ohashi.
R. Raskar, K. Tan, R. S. Feris, J. Yu and M. Turk.
K. Tan, J. Kobler, R. S. Feris, P. Dietz and R. Raskar.
Videos
Depth Edge Detection with Multi-Flash Imaging
Data Sets
Sample images can be downloaded from the NPR Camera Homepage (look for "Source and Images"). Multiflash stereo datasets are available from the Multiflash Stereo web site.
Further Information
NPR Camera Homepage
Acknowledgement
This material is based upon work supported by the National Science Foundation under Grant No. 0535293.