Image generation from scene descriptions is a cornerstone technique for controlled generation, benefiting applications such as content creation and image editing. In this work, we aim to synthesize images from a scene description, using retrieved patches as reference. We propose a differentiable retrieval module that (1) makes the entire pipeline end-to-end trainable, enabling the learning of better feature embeddings for retrieval, and (2) encourages the selection of mutually compatible patches through additional objective functions. Extensive quantitative and qualitative experiments demonstrate that the proposed method generates realistic and diverse images, and that the retrieved patches are reasonable and mutually compatible.
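The abstract does not say how the retrieval step is made differentiable. As a rough illustration only, the sketch below assumes one common approach: replacing a hard arg-max over candidate patches with a Gumbel-softmax relaxation, so that selection weights are a smooth function of the embedding scores. The function names (`gumbel_softmax`, `soft_retrieve`), the dot-product scoring, and the temperature value are hypothetical and not taken from the paper. The code shows the forward pass in NumPy; in an automatic-differentiation framework the same computation would let gradients reach the embeddings.

```python
import numpy as np


def gumbel_softmax(logits, temperature, rng):
    """Draw a relaxed one-hot sample over the last axis (Gumbel-softmax)."""
    # Sample Gumbel(0, 1) noise; the lower bound keeps log() finite.
    u = rng.uniform(low=1e-9, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    y = (logits + gumbel) / temperature
    # Subtract the max before exponentiating for numerical stability.
    y = y - y.max(axis=-1, keepdims=True)
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)


def soft_retrieve(query, candidates, temperature=0.5, seed=0):
    """Softly select one candidate patch embedding for a query embedding.

    query:      array of shape (d,)   (hypothetical object embedding)
    candidates: array of shape (n, d) (hypothetical patch embeddings)
    Returns the selection weights (n,) and the weighted patch feature (d,).
    """
    rng = np.random.default_rng(seed)
    scores = candidates @ query          # assumed dot-product similarity
    weights = gumbel_softmax(scores, temperature, rng)
    return weights, weights @ candidates


if __name__ == "__main__":
    query = np.array([1.0, 0.0])
    candidates = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
    weights, feature = soft_retrieve(query, candidates)
    print("selection weights:", weights)
    print("blended feature:", feature)
```

Lowering the temperature pushes the weights toward a one-hot choice, which approximates hard retrieval while keeping the selection a smooth function of the scores.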
|Title of host publication||Computer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings|
|Editors||Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm|
|Publisher||Springer Science and Business Media Deutschland GmbH|
|Number of pages||16|
|Publication status||Published - 2020|
|Event||16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom|
Duration: 2020 Aug 23 → 2020 Aug 28
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Conference||16th European Conference on Computer Vision, ECCV 2020|
|Period||20/8/23 → 20/8/28|
Bibliographical note
Funding Information:
This work is supported in part by the NSF CAREER Grant #1149783.
© 2020, Springer Nature Switzerland AG.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science (all)