Diverse Image-to-Image Translation via Disentangled Representations

Hsin Ying Lee, Hung Yu Tseng, Jia Bin Huang, Maneesh Singh, Ming Hsuan Yang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

53 Citations (Scopus)

Abstract

Image-to-image translation aims to learn the mapping between two visual domains. There are two main challenges for many applications: (1) the lack of aligned training pairs and (2) multiple possible outputs from a single input image. In this work, we present an approach based on disentangled representation for producing diverse outputs without paired training images. To achieve diversity, we propose to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space. Using the disentangled features as inputs greatly reduces mode collapse. To handle unpaired training data, we introduce a novel cross-cycle consistency loss. Qualitative results show that our model can generate diverse and realistic images on a wide range of tasks. We validate the effectiveness of our approach through extensive evaluation.

Original languageEnglish
Title of host publicationComputer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
EditorsMartial Hebert, Vittorio Ferrari, Cristian Sminchisescu, Yair Weiss
PublisherSpringer Verlag
Pages36-52
Number of pages17
ISBN (Print)9783030012458
DOIs
Publication statusPublished - 2018
Event15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: 2018 Sep 82018 Sep 14

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume11205 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other15th European Conference on Computer Vision, ECCV 2018
CountryGermany
CityMunich
Period18/9/818/9/14

Bibliographical note

Funding Information:
Acknowledgements. This work is supported in part by the NSF CAREER Grant #1149783, the NSF Grant #1755785, and gifts from Verisk, Adobe and Nvidia.

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint Dive into the research topics of 'Diverse Image-to-Image Translation via Disentangled Representations'. Together they form a unique fingerprint.

Cite this