We propose a novel imaging framework based on a camera with multiple flashes strategically positioned to cast shadows along depth discontinuities. This simple and inexpensive modification of the capture setup allows for efficient and accurate shape extraction. The usefulness of our techniques is then demonstrated in a variety of applications in computer vision, graphics, digital photography, and human-computer interaction. Examples include methods based on depth edges for non-photorealistic rendering, hand gesture recognition, improving 3D reconstruction, specular reflection reduction, and medical imaging.
Sharp discontinuities in depth, or depth edges, are directly related to the 3D scene geometry and provide extremely important low-level features for image understanding, since they tend to outline the boundaries of objects in the scene. In fact, they comprise one of the four components in the well-known 2 1/2-D sketch of Marr's computational vision model. Reliable detection of depth edges clearly facilitates segmentation, establishes depth-order relations, and provides valuable features for visual recognition, tracking, and 3D reconstruction. It can also be used for camera control (to help reveal new surfaces) and non-photorealistic rendering.
Traditional edge detection techniques based on intensities (such as the Canny edge detector) are limited in their ability to reveal scene structure. Low-contrast regions often pose difficulties for these methods, and it is hard to distinguish discontinuities due to textured regions (texture edges) from depth edges. For example, consider a picture of a scene containing a hand in front of a flat background with an aerial photograph printed on it. The Canny edge operator captures texture discontinuities, while the result containing only the depth edges is much more informative with respect to the geometry of the scene.
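As a toy illustration (not the method proposed in this work), a simple horizontal intensity step, standing in for an intensity-based operator such as Canny, fires on a flat, textured surface even though the scene contains no depth discontinuity at all:

```python
import numpy as np

# Toy scene: a perfectly flat surface with a striped texture printed
# on it; there are no depth discontinuities anywhere in the image.
img = np.zeros((8, 8))
img[:, 2:4] = 1.0
img[:, 6:8] = 1.0

# Absolute horizontal intensity step, a crude stand-in for
# intensity-based edge detectors such as Canny.
response = np.abs(np.diff(img, axis=1))

# The detector fires at every texture stripe boundary even though the
# scene geometry is completely flat: intensity alone cannot separate
# texture edges from depth edges.
assert response.max() == 1.0
```

This is exactly the ambiguity the multi-flash approach sidesteps by probing the geometry with shadows rather than relying on intensity alone.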
An immediate way to detect depth discontinuities is to obtain a complete 3D map of the scene, and then search for sharp discontinuities in the depth values. However, the majority of 3D reconstruction methods produce inaccurate results near depth discontinuities, due to occlusions and the violation of smoothness constraints. Recently, steady progress has been made in discontinuity preserving stereo matching, mainly with global optimization algorithms based on belief propagation or graph cuts. However, these methods fail to capture depth edges associated with sufficiently small changes in depth. Moreover, obtaining clean, non-jagged contours along shape boundaries is still a challenging problem even for methods that rely on more expensive hardware.
In this work, we propose a technique that directly detects depth edges, bypassing the 3D reconstruction step. The method is based on a camera with multiple flashes, strategically positioned to cast shadows along depth discontinuities. The technique combines the shadow information from multiple pictures taken using flashes at different positions to compute the depth edges.
Depth Edge Detection
Now we illustrate the basic idea of the depth edge detection algorithm. Suppose that we have a camera with four flashes (above, below, left, right). Four pictures are taken, each under the illumination of a single flash. Notice that the shadows change from picture to picture. If the flash is to the left, the shadows are cast to the right of the objects; if the flash is above the camera, the shadows are below the objects, and so on. We then compute a max image by taking the pixelwise maximum of the four captured images, which approximates a shadow-free image of the scene.
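The max-image step can be sketched in a few lines of NumPy (the random arrays here are placeholders for the four flash photographs):

```python
import numpy as np

# Placeholder data standing in for the four photographs, each taken
# under a single flash (left, right, top, bottom).
rng = np.random.default_rng(0)
flash_images = [rng.random((480, 640)) for _ in range(4)]

# Pixelwise maximum across the flash images. A point shadowed under
# one flash is usually lit by another, so the max image approximates
# a shadow-free picture of the scene.
max_image = np.maximum.reduce(flash_images)
```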
The next step consists of dividing each of the captured images by the max image, resulting in four ratio images. The shadow regions are accentuated in these images, allowing for accurate detection. The light epipolar rays are then traversed in each of the ratio images, and large jumps in intensity are marked as depth edges. For example, if the flash is to the left, the image is traversed from left to right; if the flash is below, the image is traversed from bottom to top. The results for each of the images are combined to form the final result.
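A minimal sketch of the ratio-image step for a single (left-positioned) flash, assuming grayscale floating-point images and a hypothetical jump threshold:

```python
import numpy as np

def depth_edges_left_flash(flash_img, max_img, threshold=0.5):
    """Sketch of depth edge detection for a left-positioned flash.

    The ratio image accentuates shadows (values near 0 in shadow,
    near 1 elsewhere). With the flash to the left, shadows fall to
    the right of objects, so traversing each row left to right, a
    sharp drop in the ratio marks a depth edge. The threshold value
    here is illustrative, not from the original work.
    """
    eps = 1e-6
    ratio = flash_img / (max_img + eps)
    # Step between neighboring pixels along the epipolar ray.
    step = np.diff(ratio, axis=1)
    edges = np.zeros_like(ratio, dtype=bool)
    # A large negative step (lit -> shadow) indicates a depth edge.
    edges[:, 1:] = step < -threshold
    return edges

# Toy scene: an object edge at column 4 casts a shadow to its right.
max_img = np.ones((3, 8))
flash_img = np.ones((3, 8))
flash_img[:, 4:6] = 0.1   # shadow region under the left flash
edges = depth_edges_left_flash(flash_img, max_img)
# edges is True exactly at column 4, the depth discontinuity.
```

In the full method, the analogous traversal is repeated for the right, top, and bottom flashes (in their respective directions), and the per-flash edge maps are combined.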
Here is the result of applying our technique to a scene containing a bone. Notice the high accuracy of the depth edge detection, even in the presence of self-occlusions.
(the images on the left are from the real scenes, and their corresponding artistic renderings are on the right)
Medical imaging through a modified laparoscope
Recognition of hand gestures
(notice the difference between the Canny edges - in the middle - and the edges obtained with our technique - on the right)
Discontinuity Preserving Stereo
Global Multiflash Stereo
Specular Reflection Reduction
(on the left: scene with specularities; on the right: our result)
Our recent work has focused on acquiring a better theoretical understanding of the possibilities and limitations of multiflash imaging, and on addressing the problem of depth edge detection in dynamic scenes.
D. A. Vaquero, R. Raskar, R. S. Feris, and M. Turk.
A Projector-Camera Setup for Geometry-Invariant Frequency Demultiplexing.
IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'09), Miami, Florida, June 2009.
D. A. Vaquero, R. S. Feris, M. Turk, and R. Raskar.
Characterizing the Shadow Space of Camera-Light Pairs.
IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'08), Anchorage, Alaska, June 2008.
R. S. Feris, R. Raskar, L. Chen, K. Tan, and M. Turk.
Multi-Flash Stereopsis: Depth Edge Preserving Stereo with Small Baseline Illumination.
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 30, no. 1, pp. 147-159, 2008.
R. S. Feris, R. Raskar, and M. Turk.
Dealing with Multi-scale Depth Changes and Motion in Depth Edge Detection.
Proceedings of SIBGRAPI'06 Brazilian Symposium on Computer Graphics and Image Processing, Manaus, Brazil, October 2006. IEEE Computer Society Press.
Awarded "One of the Best Image Processing and Computer Vision Papers".
R. S. Feris.
Detection and Analysis of Depth Discontinuities with Lighting and Viewpoint Variation.
PhD thesis, University of California, Santa Barbara, 2006.
R. S. Feris, M. Turk, R. Raskar, and K. Tan.
Specular Highlights Detection and Reduction with Multi-Flash Photography.
International Journal of the Brazilian Computer Society, vol. 1, no. 12, pp. 35-42, 2006.
R. S. Feris, M. Turk, R. Raskar, K. Tan, and G. Ohashi.
Recognition of Isolated Fingerspelling Gestures Using Depth Edges.
B. Kisacanin, V. Pavlovic, and T. Huang (eds.), Real-time Vision for Human-Computer Interaction, Springer-Verlag, 2005 - Book Chapter.
R. S. Feris, L. Chen, M. Turk, R. Raskar, and K. Tan.
Discontinuity Preserving Stereo with Small Baseline Multi-Flash Illumination.
International Conference on Computer Vision (ICCV'05), Beijing, China, 2005.
R. Raskar, K. Tan, R. S. Feris, J. Kobler, J. Yu, and M. Turk.
Harnessing Real-World Depth Edges with Multi-Flash Imaging.
IEEE Computer Graphics and Applications (IEEE CG&A), vol. 25, no. 1, pp. 32-38, January 2005.
R. S. Feris, R. Raskar, K. Tan, and M. Turk.
Specular Reflection Reduction with Multi-Flash Imaging.
Proceedings of SIBGRAPI'04 Brazilian Symposium on Computer Graphics and Image Processing, Curitiba, Brazil, October 2004. IEEE Computer Society Press.
R. S. Feris, M. Turk, R. Raskar, K. Tan, and G. Ohashi.
Exploiting Depth Discontinuities for Vision-based Fingerspelling Recognition.
IEEE Workshop on Real-Time Vision for Human-Computer Interaction (in conjunction with CVPR'04), Washington DC, USA, June 2004.
R. Raskar, K. Tan, R. S. Feris, J. Yu, and M. Turk.
Non-photorealistic Camera: Depth Edge Detection and Stylized Rendering using Multi-Flash Imaging.
ACM Transactions on Graphics (SIGGRAPH 2004), vol. 23, no. 3, August 2004.
K. Tan, J. Kobler, R. S. Feris, P. Dietz, and R. Raskar.
Shape Enhanced Surgical Visualizations and Medical Illustrations with Multi-flash Imaging.
International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI'04), Rennes, France, 2004.
Depth Edge Detection with Multi-Flash Imaging
Characterizing the Shadow Space of Camera-Light Pairs
Sample images can be downloaded from the NPR Camera Homepage (look for "Source and Images").
Multiflash stereo datasets are available from the Multiflash Stereo web site.
NPR Camera Homepage
Rogerio Feris Homepage
Daniel Vaquero Homepage
This material is based upon work supported by the National Science Foundation under Grant No. 0535293.