TY - GEN
T1 - A study on the effects of RGB-D database scale and quality on depth analogy performance
AU - Kim, Sunok
AU - Kim, Youngjung
AU - Sohn, Kwanghoon
N1 - Publisher Copyright:
© 2016 SPIE.
Copyright:
Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2016
Y1 - 2016
N2 - In the past few years, depth estimation from a single image has received increased attention due to its wide applicability in image and video understanding. Many approaches have been developed to estimate depth from a single image using various depth cues such as shading and motion. However, these approaches fail to estimate plausible depth maps when the input color image comes from a category not represented in the training images. To alleviate this problem, data-driven approaches have become popular, leveraging the discriminative power of a large-scale RGB-D database. These approaches assume that an appearance-depth correlation exists in natural scenes. However, this assumption becomes ambiguous when local image regions have similar appearance but different geometric placement within the scene. Recently, depth analogy (DA) was developed, which exploits the correlation between color images and depth gradients. DA addresses the depth ambiguity problem effectively and shows reliable performance. However, no experiments have been conducted to investigate the relationship between database scale and the quality of the estimated depth map. In this paper, we extensively examine the effects of database scale and quality on the performance of the DA method. To compare the quality of DA results, we collect a large-scale RGB-D database using Microsoft Kinect v1 and Kinect v2 in indoor environments and a ZED stereo camera in outdoor environments. Since the depth maps obtained by Kinect v2 are of higher quality than those of Kinect v1, the depth maps in the Kinect v2 database are more reliable. This indicates that a high-quality, large-scale RGB-D database is key to high-quality depth estimation. The experimental results show that a high-quality, large-scale training database leads to high-quality estimated depth maps in both indoor and outdoor scenes.
AB - In the past few years, depth estimation from a single image has received increased attention due to its wide applicability in image and video understanding. Many approaches have been developed to estimate depth from a single image using various depth cues such as shading and motion. However, these approaches fail to estimate plausible depth maps when the input color image comes from a category not represented in the training images. To alleviate this problem, data-driven approaches have become popular, leveraging the discriminative power of a large-scale RGB-D database. These approaches assume that an appearance-depth correlation exists in natural scenes. However, this assumption becomes ambiguous when local image regions have similar appearance but different geometric placement within the scene. Recently, depth analogy (DA) was developed, which exploits the correlation between color images and depth gradients. DA addresses the depth ambiguity problem effectively and shows reliable performance. However, no experiments have been conducted to investigate the relationship between database scale and the quality of the estimated depth map. In this paper, we extensively examine the effects of database scale and quality on the performance of the DA method. To compare the quality of DA results, we collect a large-scale RGB-D database using Microsoft Kinect v1 and Kinect v2 in indoor environments and a ZED stereo camera in outdoor environments. Since the depth maps obtained by Kinect v2 are of higher quality than those of Kinect v1, the depth maps in the Kinect v2 database are more reliable. This indicates that a high-quality, large-scale RGB-D database is key to high-quality depth estimation. The experimental results show that a high-quality, large-scale training database leads to high-quality estimated depth maps in both indoor and outdoor scenes.
UR - http://www.scopus.com/inward/record.url?scp=84982273825&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84982273825&partnerID=8YFLogxK
U2 - 10.1117/12.2229600
DO - 10.1117/12.2229600
M3 - Conference contribution
AN - SCOPUS:84982273825
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Three-Dimensional Imaging, Visualization, and Display 2016
A2 - Javidi, Bahram
A2 - Son, Jung-Young
PB - SPIE
T2 - Three-Dimensional Imaging, Visualization, and Display 2016 Conference
Y2 - 18 April 2016 through 20 April 2016
ER -