Rendering portraitures from monocular camera and beyond

Xiangyu Xu, Deqing Sun, Sifei Liu, Wenqi Ren, Yu Jin Zhang, Ming Hsuan Yang, Jian Sun

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Shallow Depth-of-Field (DoF) is a desirable effect in photography which renders artistic photos. Usually, it requires single-lens reflex cameras and certain photography skills to generate such effects. Recently, dual-lens on cellphones is used to estimate scene depth and simulate DoF effects for portrait shots. However, this technique cannot be applied to photos already taken and does not work well for whole-body scenes where the subject is at a distance from the cameras. In this work, we introduce an automatic system that achieves portrait DoF rendering for monocular cameras. Specifically, we first exploit Convolutional Neural Networks to estimate the relative depth and portrait segmentation maps from a single input image. Since these initial estimates from a single input are usually coarse and lack fine details, we further learn pixel affinities to refine the coarse estimation maps. With the refined estimation, we conduct depth and segmentation-aware blur rendering to the input image with a Conditional Random Field and image matting. In addition, we train a spatially-variant Recursive Neural Network to learn and accelerate this rendering process. We show that the proposed algorithm can effectively generate portraitures with realistic DoF effects using one single input. Experimental results also demonstrate that our depth and segmentation estimation modules perform favorably against the state-of-the-art methods both quantitatively and qualitatively.
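
The rendering step described in the abstract can be illustrated with a rough sketch. The snippet below is not the authors' method: it replaces the Conditional Random Field, image matting, and learned recursive network with a naive box blur whose radius grows with the distance between each pixel's relative depth and a chosen focal depth, while pixels inside the portrait mask stay sharp. The function name `render_shallow_dof` and the parameters `focus_depth` and `max_radius` are hypothetical, and the depth and segmentation maps are assumed to have been estimated already.

```python
def render_shallow_dof(image, depth, mask, focus_depth, max_radius=2):
    """Naive depth- and segmentation-aware blur (illustrative only).

    image: 2D list of grayscale intensities.
    depth: 2D list of relative depths in [0, 1] (assumed given).
    mask:  2D list, truthy where the pixel belongs to the portrait subject.
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                # Subject pixels are kept in focus.
                radius = 0
            else:
                # Blur radius grows with distance from the focal depth.
                radius = round(max_radius * abs(depth[y][x] - focus_depth))
            total, count = 0.0, 0
            # Average over a square window clipped to the image bounds.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            output[y][x] = total / count
    return output


# Tiny usage example: a bright subject pixel on a dark, distant background.
image = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
depth = [[1.0] * 3 for _ in range(3)]
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
blurred = render_shallow_dof(image, depth, mask, focus_depth=0.0, max_radius=1)
```

A gather-style box blur like this lets in-focus intensities bleed into the blurred background near the subject boundary, which is one reason a real system would need the refined maps and matting that the abstract mentions.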

Original language: English
Title of host publication: Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
Editors: Martial Hebert, Vittorio Ferrari, Cristian Sminchisescu, Yair Weiss
Publisher: Springer Verlag
Pages: 36-51
Number of pages: 16
ISBN (Print): 9783030012397
DOIs: https://doi.org/10.1007/978-3-030-01240-3_3
Publication status: Published - 2018 Jan 1
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: 2018 Sep 8 → 2018 Sep 14

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11213 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 15th European Conference on Computer Vision, ECCV 2018
Country: Germany
City: Munich
Period: 18/9/8 → 18/9/14

Fingerprint

Depth of Field
Rendering
Camera
Cameras
Photography
Segmentation
Camera lenses
Neural networks
Lens
MAP Estimation
Estimate
Neural Networks
Conditional Random Fields
Lenses
Pixels
Accelerate
Affine transformation
Pixel
Module
Experimental Results

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Xu, X., Sun, D., Liu, S., Ren, W., Zhang, Y. J., Yang, M. H., & Sun, J. (2018). Rendering portraitures from monocular camera and beyond. In M. Hebert, V. Ferrari, C. Sminchisescu, & Y. Weiss (Eds.), Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings (pp. 36-51). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11213 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-01240-3_3
Xu, Xiangyu ; Sun, Deqing ; Liu, Sifei ; Ren, Wenqi ; Zhang, Yu Jin ; Yang, Ming Hsuan ; Sun, Jian. / Rendering portraitures from monocular camera and beyond. Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. editor / Martial Hebert ; Vittorio Ferrari ; Cristian Sminchisescu ; Yair Weiss. Springer Verlag, 2018. pp. 36-51 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{6d683ca6fe4344638792ad1b85d49881,
title = "Rendering portraitures from monocular camera and beyond",
abstract = "Shallow Depth-of-Field (DoF) is a desirable effect in photography which renders artistic photos. Usually, it requires single-lens reflex cameras and certain photography skills to generate such effects. Recently, dual-lens on cellphones is used to estimate scene depth and simulate DoF effects for portrait shots. However, this technique cannot be applied to photos already taken and does not work well for whole-body scenes where the subject is at a distance from the cameras. In this work, we introduce an automatic system that achieves portrait DoF rendering for monocular cameras. Specifically, we first exploit Convolutional Neural Networks to estimate the relative depth and portrait segmentation maps from a single input image. Since these initial estimates from a single input are usually coarse and lack fine details, we further learn pixel affinities to refine the coarse estimation maps. With the refined estimation, we conduct depth and segmentation-aware blur rendering to the input image with a Conditional Random Field and image matting. In addition, we train a spatially-variant Recursive Neural Network to learn and accelerate this rendering process. We show that the proposed algorithm can effectively generate portraitures with realistic DoF effects using one single input. Experimental results also demonstrate that our depth and segmentation estimation modules perform favorably against the state-of-the-art methods both quantitatively and qualitatively.",
author = "Xiangyu Xu and Deqing Sun and Sifei Liu and Wenqi Ren and Zhang, {Yu Jin} and Yang, {Ming Hsuan} and Jian Sun",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-030-01240-3_3",
language = "English",
isbn = "9783030012397",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "36--51",
editor = "Martial Hebert and Vittorio Ferrari and Cristian Sminchisescu and Yair Weiss",
booktitle = "Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings",
address = "Germany",
}

Xu, X, Sun, D, Liu, S, Ren, W, Zhang, YJ, Yang, MH & Sun, J 2018, Rendering portraitures from monocular camera and beyond. in M Hebert, V Ferrari, C Sminchisescu & Y Weiss (eds), Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11213 LNCS, Springer Verlag, pp. 36-51, 15th European Conference on Computer Vision, ECCV 2018, Munich, Germany, 18/9/8. https://doi.org/10.1007/978-3-030-01240-3_3

Rendering portraitures from monocular camera and beyond. / Xu, Xiangyu; Sun, Deqing; Liu, Sifei; Ren, Wenqi; Zhang, Yu Jin; Yang, Ming Hsuan; Sun, Jian.

Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. ed. / Martial Hebert; Vittorio Ferrari; Cristian Sminchisescu; Yair Weiss. Springer Verlag, 2018. p. 36-51 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11213 LNCS).

TY - GEN
T1 - Rendering portraitures from monocular camera and beyond
AU - Xu, Xiangyu
AU - Sun, Deqing
AU - Liu, Sifei
AU - Ren, Wenqi
AU - Zhang, Yu Jin
AU - Yang, Ming Hsuan
AU - Sun, Jian
PY - 2018/1/1
Y1 - 2018/1/1

N2 - Shallow Depth-of-Field (DoF) is a desirable effect in photography which renders artistic photos. Usually, it requires single-lens reflex cameras and certain photography skills to generate such effects. Recently, dual-lens on cellphones is used to estimate scene depth and simulate DoF effects for portrait shots. However, this technique cannot be applied to photos already taken and does not work well for whole-body scenes where the subject is at a distance from the cameras. In this work, we introduce an automatic system that achieves portrait DoF rendering for monocular cameras. Specifically, we first exploit Convolutional Neural Networks to estimate the relative depth and portrait segmentation maps from a single input image. Since these initial estimates from a single input are usually coarse and lack fine details, we further learn pixel affinities to refine the coarse estimation maps. With the refined estimation, we conduct depth and segmentation-aware blur rendering to the input image with a Conditional Random Field and image matting. In addition, we train a spatially-variant Recursive Neural Network to learn and accelerate this rendering process. We show that the proposed algorithm can effectively generate portraitures with realistic DoF effects using one single input. Experimental results also demonstrate that our depth and segmentation estimation modules perform favorably against the state-of-the-art methods both quantitatively and qualitatively.

AB - Shallow Depth-of-Field (DoF) is a desirable effect in photography which renders artistic photos. Usually, it requires single-lens reflex cameras and certain photography skills to generate such effects. Recently, dual-lens on cellphones is used to estimate scene depth and simulate DoF effects for portrait shots. However, this technique cannot be applied to photos already taken and does not work well for whole-body scenes where the subject is at a distance from the cameras. In this work, we introduce an automatic system that achieves portrait DoF rendering for monocular cameras. Specifically, we first exploit Convolutional Neural Networks to estimate the relative depth and portrait segmentation maps from a single input image. Since these initial estimates from a single input are usually coarse and lack fine details, we further learn pixel affinities to refine the coarse estimation maps. With the refined estimation, we conduct depth and segmentation-aware blur rendering to the input image with a Conditional Random Field and image matting. In addition, we train a spatially-variant Recursive Neural Network to learn and accelerate this rendering process. We show that the proposed algorithm can effectively generate portraitures with realistic DoF effects using one single input. Experimental results also demonstrate that our depth and segmentation estimation modules perform favorably against the state-of-the-art methods both quantitatively and qualitatively.

UR - http://www.scopus.com/inward/record.url?scp=85055108875&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85055108875&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-01240-3_3
DO - 10.1007/978-3-030-01240-3_3
M3 - Conference contribution
AN - SCOPUS:85055108875
SN - 9783030012397
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 36
EP - 51
BT - Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
A2 - Hebert, Martial
A2 - Ferrari, Vittorio
A2 - Sminchisescu, Cristian
A2 - Weiss, Yair
PB - Springer Verlag
ER -

Xu X, Sun D, Liu S, Ren W, Zhang YJ, Yang MH et al. Rendering portraitures from monocular camera and beyond. In Hebert M, Ferrari V, Sminchisescu C, Weiss Y, editors, Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag. 2018. p. 36-51. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-01240-3_3