Inferring scene depth from a single monocular image is a highly ill-posed problem in computer vision. This paper presents a new gradient-domain approach, called depth analogy, that uses analogy to synthesize a target depth field when a collection of RGB-D image pairs is given as training data. Specifically, the proposed method employs a non-parametric learning process that creates an analogous depth field by sampling reliable depth gradients through visual correspondences established on the training image pairs. Unlike existing data-driven approaches that directly select depth values from the training data, our framework transfers depth gradients as reconstruction cues, which are then integrated by Poisson reconstruction. The performance of most conventional approaches relies heavily on the training RGB-D data, and this dependency severely degrades the quality of the reconstructed depth maps when the desired depth distribution of an input image differs substantially from that of the training data, e.g., outdoor versus indoor scenes. Our key observation is that depth gradients are less sensitive to scene characteristics than absolute depth values and therefore provide better cues for depth recovery. Thus, our gradient-domain approach can support a great variety of training range datasets that involve substantial appearance and geometric variations. The experimental results demonstrate that our depth-gradient-domain approach outperforms existing data-driven approaches that work directly in the depth domain, even when only uncorrelated training datasets are available.
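The integration step mentioned above, recovering a depth field from a set of depth gradients, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `integrate_gradients`, the forward-difference convention, the `anchor` parameter, and the dense least-squares solver are all assumptions made for illustration. The sketch relies on the fact that the normal equations of this least-squares problem form a discrete Poisson equation, and it does not cover the gradient sampling or correspondence steps.

```python
import numpy as np

def integrate_gradients(gx, gy, anchor=0.0):
    """Least-squares (Poisson) integration of a gradient field (illustrative only).

    gx[y, x] approximates d[y, x+1] - d[y, x], shape (h, w-1)
    gy[y, x] approximates d[y+1, x] - d[y, x], shape (h-1, w)
    anchor fixes d[0, 0], because gradients determine depth only up to a constant.
    """
    h, w = gx.shape[0], gy.shape[1]
    idx = np.arange(h * w).reshape(h, w)      # pixel -> unknown index
    n_eq = gx.size + gy.size + 1              # one equation per gradient + anchor
    A = np.zeros((n_eq, h * w))               # dense matrix: small images only
    b = np.zeros(n_eq)

    # Horizontal constraints: d[y, x+1] - d[y, x] = gx[y, x]
    rows = np.arange(gx.size)
    A[rows, idx[:, 1:].ravel()] = 1.0
    A[rows, idx[:, :-1].ravel()] = -1.0
    b[:gx.size] = gx.ravel()

    # Vertical constraints: d[y+1, x] - d[y, x] = gy[y, x]
    rows = gx.size + np.arange(gy.size)
    A[rows, idx[1:, :].ravel()] = 1.0
    A[rows, idx[:-1, :].ravel()] = -1.0
    b[gx.size:gx.size + gy.size] = gy.ravel()

    # Anchor constraint removes the unknown additive constant.
    A[-1, idx[0, 0]] = 1.0
    b[-1] = anchor

    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d.reshape(h, w)

# Usage: gradients of a known synthetic depth map are integrated back.
rng = np.random.default_rng(0)
depth = rng.uniform(1.0, 5.0, size=(6, 7))
gx = np.diff(depth, axis=1)
gy = np.diff(depth, axis=0)
recovered = integrate_gradients(gx, gy, anchor=depth[0, 0])
```

With consistent gradients, as in this usage example, the recovered field matches the original up to numerical precision. Gradients transferred from different training images would generally be inconsistent, in which case the least-squares solution is the depth field whose gradients are closest to the transferred ones. A practical solver for full-resolution images would use a sparse or multigrid method rather than a dense matrix.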
All Science Journal Classification (ASJC) codes
- Computer Graphics and Computer-Aided Design