Abstract
Diagrams often depict complex phenomena and serve as a good test bed for visual and textual reasoning. However, understanding diagrams using natural image understanding approaches requires large training datasets of diagrams, which are very hard to obtain. Instead, diagram understanding can be addressed as a matching problem between labeled diagrams, images, or both. This problem is very challenging since the absence of significant color and texture renders local cues ambiguous and requires global reasoning. We consider the problem of one-shot part labeling: labeling multiple parts of an object in a target image given only a single source image of that category. For this set-to-set matching problem, we introduce the Structured Set Matching Network (SSMN), a structured prediction model that incorporates convolutional neural networks. The SSMN is trained using global normalization to maximize local match scores between corresponding elements and a global consistency score among all matched elements, while also enforcing a matching constraint between the two sets. The SSMN significantly outperforms several strong baselines on three label transfer scenarios: diagram-to-diagram, evaluated on a new diagram dataset of over 200 categories; image-to-image, evaluated on a dataset built on top of the Pascal Part Dataset; and image-to-diagram, evaluated on transferring labels across these datasets.
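To make the set-to-set formulation concrete, the following is a minimal illustrative sketch, not the SSMN itself. It assumes the local match scores are already given as a small matrix (in the paper they would come from convolutional networks), it uses a hypothetical `consistency` callback to stand in for the global consistency score, and it enforces the one-to-one matching constraint by brute-force enumeration of permutations, which is only feasible for very small sets. The function name `best_matching` and all numbers are assumptions made for illustration.

```python
import itertools
import math


def best_matching(local, consistency):
    """Score every one-to-one matching and return the best one.

    local[i][j]  -- assumed score for matching source part i to target part j
    consistency  -- hypothetical callable mapping a full assignment (a tuple
                    where assignment[i] is the target matched to source i)
                    to a global consistency score
    Returns (assignment, probability), where the probability is obtained by
    normalizing over all valid matchings (a globally normalized model).
    """
    n = len(local)
    scored = []
    # Enumerating permutations enforces the one-to-one matching constraint.
    for perm in itertools.permutations(range(n)):
        local_total = sum(local[i][perm[i]] for i in range(n))
        scored.append((perm, local_total + consistency(perm)))

    # Log-sum-exp over all matchings, computed stably.
    top = max(score for _, score in scored)
    log_z = top + math.log(sum(math.exp(score - top) for _, score in scored))

    best_perm, best_score = max(scored, key=lambda item: item[1])
    return best_perm, math.exp(best_score - log_z)


# Toy example with two parts and no global consistency term.
toy_local = [[2.0, 0.0],
             [0.0, 1.0]]
matching, prob = best_matching(toy_local, lambda assignment: 0.0)
```

Normalizing over whole matchings, rather than over each part independently, is what lets a global term change the decision when local cues are ambiguous.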
Original language | English |
---|---|
Title of host publication | Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 |
Publisher | IEEE Computer Society |
Pages | 3627-3636 |
Number of pages | 10 |
ISBN (Electronic) | 9781538664209 |
Publication status | Published - 2018 Dec 14 |
Event | 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 - Salt Lake City, United States. Duration: 2018 Jun 18 → 2018 Jun 22 |
Publication series
Name | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
---|---|
ISSN (Print) | 1063-6919 |
Conference
Conference | 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 |
---|---|
Country/Territory | United States |
City | Salt Lake City |
Period | 2018 Jun 18 → 2018 Jun 22 |
Bibliographical note
Funding Information: This work is in part supported by ONR N00014-13-1-0720, NSF IIS-1338054, NSF-1652052, NRI-1637479, Allen Distinguished Investigator Award, and the Allen Institute for Artificial Intelligence. JC would like to thank Christopher B. Choy (for the help in comparing with the UCN), Kai Han, Rafael S. de Rezende and Minsu Cho (for the discussion about SCNet) and Seunghoon Hong (for an initial discussion).
Publisher Copyright:
© 2018 IEEE.
All Science Journal Classification (ASJC) codes
- Software
- Computer Vision and Pattern Recognition