Learning depth from a single image using visual-depth words

Sunok Kim, Sunghwan Choi, Kwanghoon Sohn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Estimating depth from a single monocular image is a fundamental problem in computer vision. Traditional methods for such estimation usually require complicated and sometimes labor-intensive processing. In this paper, we propose a new perspective for this problem and suggest a new gradient-domain learning framework which is much simpler and more efficient. Inspired by the observation that there is substantial co-occurrence of image edges and depth discontinuities in natural scenes, we learn the relationship between local appearance features and corresponding depth gradients by making use of the K-means clustering algorithm within the image feature space. We then encode each cluster centroid with its associated depth gradients, which defines visual-depth words that model the image-depth relationship very well. This enables one to estimate the scene depth for an arbitrary image by simply selecting proper depth gradients from a compact dictionary of visual-depth words, followed by a Poisson surface reconstruction. Experimental results demonstrate that the proposed gradient-domain approach outperforms state-of-the-art methods both qualitatively and quantitatively and is generic over (unseen) scene categories which are not used for training.
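The pipeline summarized in the abstract can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the abstract does not specify the appearance features, the dictionary size, or the solver, so the function names (kmeans, learn_words, predict_grads, integrate_gradients), the use of the mean depth gradient as each word's payload, and the dense least-squares solve standing in for a real Poisson solver are all hypothetical choices made for the example.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid (squared Euclidean) for each row of X."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means in feature space; returns the k centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        labels = assign(X, centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def learn_words(features, depth_grads, k):
    """Pair each centroid with the mean depth gradient of its cluster.

    Assumption: the mean is used as the "associated depth gradient";
    the abstract does not say how the association is encoded.
    """
    centroids = kmeans(features, k)
    labels = assign(features, centroids)
    words = np.zeros((k, depth_grads.shape[1]))
    for j in range(k):
        members = depth_grads[labels == j]
        if len(members):
            words[j] = members.mean(axis=0)
    return centroids, words

def predict_grads(features, centroids, words):
    """Look up the depth gradient of the nearest visual-depth word."""
    return words[assign(features, centroids)]

def integrate_gradients(gx, gy):
    """Recover depth from forward-difference gradients by least squares.

    Solves D[i, j+1] - D[i, j] = gx[i, j] and D[i+1, j] - D[i, j] = gy[i, j]
    with D[0, 0] pinned to 0. The normal equations of this system are a
    discrete Poisson equation. Dense solve, so only suitable for tiny grids.
    """
    h, w = gx.shape
    idx = np.arange(h * w).reshape(h, w)
    src = np.concatenate([idx[:, :-1].ravel(), idx[:-1, :].ravel()])
    dst = np.concatenate([idx[:, 1:].ravel(), idx[1:, :].ravel()])
    m = len(src)
    A = np.zeros((m + 1, h * w))
    A[np.arange(m), src] = -1.0
    A[np.arange(m), dst] = 1.0
    A[m, 0] = 1.0  # depth is only defined up to a constant; fix D[0, 0] = 0
    b = np.concatenate([gx[:, :-1].ravel(), gy[:-1, :].ravel(), [0.0]])
    depth = np.linalg.lstsq(A, b, rcond=None)[0]
    return depth.reshape(h, w)
```

In use, learn_words would be called once on training pairs of per-pixel features and depth gradients, and predict_grads followed by integrate_gradients would be applied to each new image. At realistic image sizes a sparse or FFT-based Poisson solver would replace the dense least-squares step.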

Original language: English
Title of host publication: 2015 IEEE International Conference on Image Processing, ICIP 2015 - Proceedings
Publisher: IEEE Computer Society
Pages: 1895-1899
Number of pages: 5
ISBN (Electronic): 9781479983391
DOIs: https://doi.org/10.1109/ICIP.2015.7351130
Publication status: Published - 2015 Dec 9
Event: IEEE International Conference on Image Processing, ICIP 2015 - Quebec City, Canada
Duration: 2015 Sep 27 to 2015 Sep 30

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
Volume: 2015-December
ISSN (Print): 1522-4880

Other

Other: IEEE International Conference on Image Processing, ICIP 2015
Country: Canada
City: Quebec City
Period: 15/9/27 to 15/9/30

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing

Cite this

Kim, S., Choi, S., & Sohn, K. (2015). Learning depth from a single image using visual-depth words. In 2015 IEEE International Conference on Image Processing, ICIP 2015 - Proceedings (pp. 1895-1899). [7351130] (Proceedings - International Conference on Image Processing, ICIP; Vol. 2015-December). IEEE Computer Society. https://doi.org/10.1109/ICIP.2015.7351130