Finding views with good composition in an input image is a common but challenging problem. An image usually contains at least dozens of candidate regions, and evaluating these candidates is subjective. Most existing methods evaluate each candidate using only the feature of that candidate. However, because the problem is comparative in nature, the mutual relations between candidates from the same image play an essential role in composing a good shot. Motivated by this, we propose a graph-based module with a gated feature update to model the relations between different candidates. The candidate region features are propagated on a graph that models the mutual relations between regions, mining useful information so that the relation features and region features are adaptively fused. We design a multi-task loss to train the model; in particular, a regularization term incorporates prior knowledge about the relations into the graph. We also develop a data augmentation method that mixes nodes from different graphs to improve the model's generalization ability. Experimental results show that the proposed model performs favorably against state-of-the-art methods, and comprehensive ablation studies demonstrate the contribution of each module and of the graph-based inference in the proposed method.
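The two mechanisms named in the abstract, a gated feature update on a graph of candidate regions and an augmentation that mixes nodes from different graphs, can be illustrated with a minimal sketch. This is not the authors' implementation. The fully connected graph, the softmax over scaled dot-product similarity used as edge weights, the scalar sigmoid gate, and the names `gated_graph_update`, `mix_nodes`, `gate_weights`, and `num_swap` are all assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gated_graph_update(features, gate_weights, gate_bias=0.0):
    """One propagation step over a fully connected candidate graph.

    features: list of N region feature vectors, each of length D.
    gate_weights: hypothetical gate parameters of length 2 * D.
    """
    n = len(features)
    dim = len(features[0])
    scale = math.sqrt(dim)
    updated = []
    for i in range(n):
        # Assumed edge weights: softmax over scaled dot-product similarity.
        attn = softmax([dot(features[i], features[j]) / scale for j in range(n)])
        # Relation feature: attention-weighted sum of all node features.
        relation = [
            sum(attn[j] * features[j][d] for j in range(n)) for d in range(dim)
        ]
        # Scalar gate computed from the concatenated region and relation features.
        gate = sigmoid(dot(gate_weights, features[i] + relation) + gate_bias)
        # Adaptive fusion: the gate balances relation and region features.
        updated.append(
            [gate * r + (1.0 - gate) * f for r, f in zip(relation, features[i])]
        )
    return updated


def mix_nodes(nodes_a, nodes_b, num_swap, seed=0):
    """Replace num_swap nodes of graph A with nodes sampled from graph B.

    A hypothetical reading of "mixing nodes from different graphs";
    num_swap must not exceed the size of either node list.
    """
    rng = random.Random(seed)
    mixed = list(nodes_a)
    positions = rng.sample(range(len(mixed)), num_swap)
    donors = rng.sample(list(nodes_b), num_swap)
    for pos, donor in zip(positions, donors):
        mixed[pos] = donor
    return mixed


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(gated_graph_update(feats, [0.1, -0.2, 0.3, 0.4]))
    print(mix_nodes(["a1", "a2", "a3"], ["b1", "b2", "b3"], num_swap=2))
```

In this sketch a gate near 0 leaves a candidate's own feature mostly unchanged, while a gate near 1 replaces it with the aggregated relation feature. In a trained model the gate and edge weights would be learned rather than fixed by hand as they are here.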
|Number of pages|10|
|Journal|Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition|
|Publication status|Published - 2020|
|Event|2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020 - Virtual, Online, United States (Duration: 2020 Jun 14 → 2020 Jun 19)|
Bibliographical note
Funding Information:
This work is funded by the National Natural Science Foundation of China (Grant 61876181, Grant 61673375, and Grant 61721004), the Projects of Chinese Academy of Sciences (Grant QYZDB-SSW-JSC006), and the NSF Career Grant (1149783). Debang is also supported by the China Scholarship Council (CSC).
© 2020 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition