Automatic 2D-to-3D conversion using multi-scale deep neural network

Jiyoung Lee, Hyungjoo Jung, Youngjung Kim, Kwanghoon Sohn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present a multi-scale deep convolutional neural network (CNN) for automatic 2D-to-3D conversion. Traditional methods, which synthesize a virtual view from a reference view, consist of separate stages, i.e., depth (or disparity) estimation for the reference image followed by depth image-based rendering (DIBR) with the estimated depth. In contrast, we reformulate the view synthesis task as an image reconstruction problem with a spatial transformer module and directly produce stereo image pairs within a unified CNN framework, without ground-truth depth as supervision. We further propose a multi-scale deep architecture that captures large displacements between images at the coarse level and enhances detail at the fine level. Experimental results demonstrate the effectiveness of the proposed method over state-of-the-art approaches, both qualitatively and quantitatively, on the KITTI driving dataset.
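The core operation the abstract describes is differentiable horizontal warping: a spatial transformer samples the reference view at positions shifted by a predicted disparity map, and a photometric reconstruction loss between the warped view and the true second view replaces ground-truth depth supervision. A minimal NumPy sketch of that warping step and loss (illustrative only, not the paper's implementation; function names are assumptions):

```python
import numpy as np

def warp_with_disparity(img, disp):
    """Backward-warp a grayscale image horizontally by a per-pixel
    disparity map, with linear interpolation -- the sampling a spatial
    transformer performs for stereo view synthesis."""
    H, W = img.shape
    # For each target pixel x, sample the reference at x - disp(x).
    xs = np.arange(W)[None, :] - disp
    xs = np.clip(xs, 0.0, W - 1.0)   # clamp sampling at image borders
    x0 = np.floor(xs).astype(int)
    x1 = np.clip(x0 + 1, 0, W - 1)
    w = xs - x0                       # linear interpolation weight
    rows = np.arange(H)[:, None]
    return (1.0 - w) * img[rows, x0] + w * img[rows, x1]

def photometric_l1(pred, target):
    """Mean absolute reconstruction error: the self-supervised loss
    that stands in for ground-truth depth."""
    return float(np.mean(np.abs(pred - target)))
```

With a horizontal-gradient image and a constant 2-pixel disparity, the output reproduces the reference shifted by two columns; training would minimize `photometric_l1` between the warped reference and the other stereo view, at each scale of the coarse-to-fine architecture.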

Original language: English
Title of host publication: 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings
Publisher: IEEE Computer Society
Pages: 730-734
Number of pages: 5
ISBN (Electronic): 9781509021758
DOI: 10.1109/ICIP.2017.8296377
Publication status: Published - 2018 Feb 20
Event: 24th IEEE International Conference on Image Processing, ICIP 2017 - Beijing, China
Duration: 2017 Sep 17 - 2017 Sep 20

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
Volume: 2017-September
ISSN (Print): 1522-4880

Other

Name: 24th IEEE International Conference on Image Processing, ICIP 2017
Country: China
City: Beijing
Period: 17/9/17 - 17/9/20

Fingerprint

  • Neural networks
  • Image reconstruction
  • Deep neural networks

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing

Cite this

Lee, J., Jung, H., Kim, Y., & Sohn, K. (2018). Automatic 2D-to-3D conversion using multi-scale deep neural network. In 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings (pp. 730-734). (Proceedings - International Conference on Image Processing, ICIP; Vol. 2017-September). IEEE Computer Society. https://doi.org/10.1109/ICIP.2017.8296377