Predicting scene depth (or other geometric information) from a single monocular image is a challenging task. This paper addresses this challenging and essentially ill-posed problem by regression on samples for which the depth is known. We first retrieve semantically similar RGB and depth pairs from datasets using deep convolutional activation features, and show that this framework provides a richer foundation for depth estimation than existing hand-crafted representations. An initial estimate is then refined by block matching and robust patch regression, which assigns perceptually appropriate depth values to the input query in accordance with a data-driven depth prior. A final post-processing step aligns the depth map with RGB discontinuities, yielding visually plausible results. Experiments on the Make3D and NYU RGB-D datasets show competitive results compared to recent state-of-the-art methods.
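The retrieval step described in the abstract can be sketched as a nearest-neighbor search in deep feature space. The snippet below is a minimal illustration, not the paper's implementation: it assumes each image has already been encoded into a deep convolutional activation feature (e.g. a 4096-D fc-layer vector from a pretrained CNN), and the function name `retrieve_candidates` and the choice of Euclidean distance are hypothetical.

```python
import numpy as np

def retrieve_candidates(query_feat, train_feats, k=5):
    """Return indices of the k training images whose deep features
    are closest (Euclidean distance) to the query feature."""
    dists = np.linalg.norm(train_feats - query_feat, axis=1)
    return np.argsort(dists)[:k]

# Toy example: 10 training images with 4096-D features.
rng = np.random.default_rng(0)
train_feats = rng.standard_normal((10, 4096))
# Query that is a slightly perturbed copy of training image 3.
query_feat = train_feats[3] + 0.01 * rng.standard_normal(4096)
idx = retrieve_candidates(query_feat, train_feats, k=3)
```

The depth maps paired with the retrieved indices would then serve as the data-driven prior for the subsequent block-matching and patch-regression stages.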
|Journal||IS&T International Symposium on Electronic Imaging Science and Technology|
|Publication status||Published - 2016|
|Event||27th Annual Stereoscopic Displays and Applications Conference, SD&A 2016 - San Francisco, United States|
Duration: 2016 Feb 14 → 2016 Feb 18
Bibliographical note
Publisher Copyright:
© 2016 Society for Imaging Science and Technology.
All Science Journal Classification (ASJC) codes
- Computer Graphics and Computer-Aided Design
- Computer Science Applications
- Human-Computer Interaction
- Electrical and Electronic Engineering
- Atomic and Molecular Physics, and Optics