SCNet: Learning Semantic Correspondence

Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Citations (Scopus)

Abstract

This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category. Previous approaches focus on either combining a spatial regularizer with hand-crafted features, or learning a correspondence model for appearance only. We propose instead a convolutional neural network architecture, called SCNet, for learning a geometrically plausible model for semantic correspondence. SCNet uses region proposals as matching primitives, and explicitly incorporates geometric consistency in its loss function. It is trained on image pairs obtained from the PASCAL VOC 2007 keypoint dataset, and a comparative evaluation on several standard benchmarks demonstrates that the proposed approach substantially outperforms both recent deep learning architectures and previous methods based on hand-crafted features.
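The abstract's central idea, scoring candidate region matches by appearance similarity while enforcing geometric consistency, can be illustrated with a toy sketch. The function names and the particular consistency measure below are illustrative assumptions, not the paper's actual SCNet formulation or loss:

```python
import math

def appearance_similarity(f1, f2):
    """Cosine similarity between two region descriptors."""
    dot = sum(a * b for a, b in zip(f1, f2))
    n1 = math.sqrt(sum(a * a for a in f1))
    n2 = math.sqrt(sum(b * b for b in f2))
    return dot / (n1 * n2 + 1e-8)

def geometric_consistency(offsets):
    """Returns 1.0 when all candidate matches share the same 2-D
    translation offset; decreases as the offsets disagree."""
    n = len(offsets)
    mx = sum(o[0] for o in offsets) / n
    my = sum(o[1] for o in offsets) / n
    spread = sum(abs(o[0] - mx) + abs(o[1] - my) for o in offsets) / n
    return 1.0 / (1.0 + spread)

def match_scores(feats_a, feats_b, centers_a, centers_b, candidates):
    """Score each candidate region match (i, j) by appearance similarity
    modulated by the geometric consistency of the whole candidate set."""
    offsets = [(centers_b[j][0] - centers_a[i][0],
                centers_b[j][1] - centers_a[i][1]) for i, j in candidates]
    g = geometric_consistency(offsets)
    return [appearance_similarity(feats_a[i], feats_b[j]) * g
            for i, j in candidates]
```

In this sketch, two matches whose regions look alike but imply wildly different translations are both penalized, which is the intuition behind putting geometric consistency directly into the matching objective.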

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1849-1858
Number of pages: 10
ISBN (Electronic): 9781538610329
DOIs: https://doi.org/10.1109/ICCV.2017.203
Publication status: Published - 2017 Dec 22
Event: 16th IEEE International Conference on Computer Vision, ICCV 2017 - Venice, Italy
Duration: 2017 Oct 22 - 2017 Oct 29

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
Volume: 2017-October
ISSN (Print): 1550-5499

Other

Other: 16th IEEE International Conference on Computer Vision, ICCV 2017
Country: Italy
City: Venice
Period: 17/10/22 - 17/10/29

Fingerprint

Semantics
Network architecture
Neural networks
Deep learning

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Han, K., Rezende, R. S., Ham, B., Wong, K. Y. K., Cho, M., Schmid, C., & Ponce, J. (2017). SCNet: Learning Semantic Correspondence. In Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017 (pp. 1849-1858). [8237465] (Proceedings of the IEEE International Conference on Computer Vision; Vol. 2017-October). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICCV.2017.203
Han, Kai ; Rezende, Rafael S. ; Ham, Bumsub ; Wong, Kwan Yee K. ; Cho, Minsu ; Schmid, Cordelia ; Ponce, Jean. / SCNet : Learning Semantic Correspondence. Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 1849-1858 (Proceedings of the IEEE International Conference on Computer Vision).
@inproceedings{514323eec07b479c9f6592ee0996793f,
title = "SCNet: Learning Semantic Correspondence",
abstract = "This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category. Previous approaches focus on either combining a spatial regularizer with hand-crafted features, or learning a correspondence model for appearance only. We propose instead a convolutional neural network architecture, called SCNet, for learning a geometrically plausible model for semantic correspondence. SCNet uses region proposals as matching primitives, and explicitly incorporates geometric consistency in its loss function. It is trained on image pairs obtained from the PASCAL VOC 2007 keypoint dataset, and a comparative evaluation on several standard benchmarks demonstrates that the proposed approach substantially outperforms both recent deep learning architectures and previous methods based on hand-crafted features.",
author = "Kai Han and Rezende, {Rafael S.} and Bumsub Ham and Wong, {Kwan Yee K.} and Minsu Cho and Cordelia Schmid and Jean Ponce",
year = "2017",
month = "12",
day = "22",
doi = "10.1109/ICCV.2017.203",
language = "English",
series = "Proceedings of the IEEE International Conference on Computer Vision",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "1849--1858",
booktitle = "Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017",
address = "United States",
}

Han, K, Rezende, RS, Ham, B, Wong, KYK, Cho, M, Schmid, C & Ponce, J 2017, SCNet: Learning Semantic Correspondence. in Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017., 8237465, Proceedings of the IEEE International Conference on Computer Vision, vol. 2017-October, Institute of Electrical and Electronics Engineers Inc., pp. 1849-1858, 16th IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 17/10/22. https://doi.org/10.1109/ICCV.2017.203

SCNet : Learning Semantic Correspondence. / Han, Kai; Rezende, Rafael S.; Ham, Bumsub; Wong, Kwan Yee K.; Cho, Minsu; Schmid, Cordelia; Ponce, Jean.

Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017. Institute of Electrical and Electronics Engineers Inc., 2017. p. 1849-1858 8237465 (Proceedings of the IEEE International Conference on Computer Vision; Vol. 2017-October).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - SCNet

T2 - Learning Semantic Correspondence

AU - Han, Kai

AU - Rezende, Rafael S.

AU - Ham, Bumsub

AU - Wong, Kwan Yee K.

AU - Cho, Minsu

AU - Schmid, Cordelia

AU - Ponce, Jean

PY - 2017/12/22

Y1 - 2017/12/22

N2 - This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category. Previous approaches focus on either combining a spatial regularizer with hand-crafted features, or learning a correspondence model for appearance only. We propose instead a convolutional neural network architecture, called SCNet, for learning a geometrically plausible model for semantic correspondence. SCNet uses region proposals as matching primitives, and explicitly incorporates geometric consistency in its loss function. It is trained on image pairs obtained from the PASCAL VOC 2007 keypoint dataset, and a comparative evaluation on several standard benchmarks demonstrates that the proposed approach substantially outperforms both recent deep learning architectures and previous methods based on hand-crafted features.

AB - This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category. Previous approaches focus on either combining a spatial regularizer with hand-crafted features, or learning a correspondence model for appearance only. We propose instead a convolutional neural network architecture, called SCNet, for learning a geometrically plausible model for semantic correspondence. SCNet uses region proposals as matching primitives, and explicitly incorporates geometric consistency in its loss function. It is trained on image pairs obtained from the PASCAL VOC 2007 keypoint dataset, and a comparative evaluation on several standard benchmarks demonstrates that the proposed approach substantially outperforms both recent deep learning architectures and previous methods based on hand-crafted features.

UR - http://www.scopus.com/inward/record.url?scp=85041907470&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85041907470&partnerID=8YFLogxK

U2 - 10.1109/ICCV.2017.203

DO - 10.1109/ICCV.2017.203

M3 - Conference contribution

AN - SCOPUS:85041907470

T3 - Proceedings of the IEEE International Conference on Computer Vision

SP - 1849

EP - 1858

BT - Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Han K, Rezende RS, Ham B, Wong KYK, Cho M, Schmid C et al. SCNet: Learning Semantic Correspondence. In Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017. Institute of Electrical and Electronics Engineers Inc. 2017. p. 1849-1858. 8237465. (Proceedings of the IEEE International Conference on Computer Vision). https://doi.org/10.1109/ICCV.2017.203