Recurrent transformer networks for semantic correspondence

Seungryong Kim, Stephen Lin, Sangryul Jeon, Dongbo Min, Kwanghoon Sohn

Research output: Contribution to journal › Conference article

5 Citations (Scopus)

Abstract

We present recurrent transformer networks (RTNs) for obtaining dense correspondences between semantically similar images. Our networks accomplish this through an iterative process of estimating spatial transformations between the input images and using these transformations to generate aligned convolutional activations. By directly estimating the transformations between an image pair, rather than employing spatial transformer networks to independently normalize each individual image, we show that greater accuracy can be achieved. This process is conducted in a recursive manner to refine both the transformation estimates and the feature representations. In addition, a technique is presented for weakly-supervised training of RTNs that is based on a proposed classification loss. With RTNs, state-of-the-art performance is attained on several benchmarks for semantic correspondence.
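To make the iterative scheme concrete, below is a minimal PyTorch sketch of the recurrent refinement loop described in the abstract. It is a hypothetical illustration, not the authors' released code: the module names (feat_net, transform_net), the fixed iteration count, and the use of a single global affine transformation are all simplifying assumptions (the paper estimates locally-varying transformations across the image and also refines the feature representations).

# Minimal sketch of the recurrent alignment loop described in the abstract.
# Hypothetical illustration only: feat_net, transform_net, the iteration
# count, and the single global affine transformation are assumptions,
# not the authors' architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecurrentAligner(nn.Module):
    def __init__(self, feat_net, transform_net, iters=3):
        super().__init__()
        self.feat_net = feat_net            # shared convolutional feature extractor
        self.transform_net = transform_net  # maps a feature pair to 6 affine parameters
        self.iters = iters

    def forward(self, src_img, tgt_img):
        src_feat = self.feat_net(src_img)
        tgt_feat = self.feat_net(tgt_img)
        n = src_img.size(0)
        # Initialize with the identity transformation.
        theta = torch.eye(2, 3, device=src_img.device).expand(n, 2, 3).contiguous()
        for _ in range(self.iters):
            # Warp the source features under the current transformation estimate.
            grid = F.affine_grid(theta, src_feat.size(), align_corners=False)
            warped = F.grid_sample(src_feat, grid, align_corners=False)
            # Estimate a residual update directly from the image *pair*,
            # rather than normalizing each image independently.
            delta = self.transform_net(torch.cat([warped, tgt_feat], dim=1))
            theta = theta + delta.view(n, 2, 3)
        return theta

In the paper's weakly-supervised setting, a classification loss over the match scores between warped source features and target features would then drive training, without requiring ground-truth correspondences.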

Original language: English
Pages (from-to): 6126-6136
Number of pages: 11
Journal: Advances in Neural Information Processing Systems
Volume: 2018-December
Publication status: Published - 2018 Jan 1
Event: 32nd Conference on Neural Information Processing Systems, NeurIPS 2018 - Montreal, Canada
Duration: 2018 Dec 2 - 2018 Dec 8

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

Kim, Seungryong; Lin, Stephen; Jeon, Sangryul; Min, Dongbo; Sohn, Kwanghoon. / Recurrent transformer networks for semantic correspondence. In: Advances in Neural Information Processing Systems. 2018; Vol. 2018-December. pp. 6126-6136.
@article{c1988b4a49364feda5b3146d38423d07,
  title = "Recurrent transformer networks for semantic correspondence",
  abstract = "We present recurrent transformer networks (RTNs) for obtaining dense correspondences between semantically similar images. Our networks accomplish this through an iterative process of estimating spatial transformations between the input images and using these transformations to generate aligned convolutional activations. By directly estimating the transformations between an image pair, rather than employing spatial transformer networks to independently normalize each individual image, we show that greater accuracy can be achieved. This process is conducted in a recursive manner to refine both the transformation estimates and the feature representations. In addition, a technique is presented for weakly-supervised training of RTNs that is based on a proposed classification loss. With RTNs, state-of-the-art performance is attained on several benchmarks for semantic correspondence.",
  author = "Seungryong Kim and Stephen Lin and Sangryul Jeon and Dongbo Min and Kwanghoon Sohn",
  year = "2018",
  month = jan,
  day = "1",
  language = "English",
  volume = "2018-December",
  pages = "6126--6136",
  journal = "Advances in Neural Information Processing Systems",
  issn = "1049-5258",
}

Kim, S, Lin, S, Jeon, S, Min, D & Sohn, K 2018, 'Recurrent transformer networks for semantic correspondence', Advances in Neural Information Processing Systems, vol. 2018-December, pp. 6126-6136.

Recurrent transformer networks for semantic correspondence. / Kim, Seungryong; Lin, Stephen; Jeon, Sangryul; Min, Dongbo; Sohn, Kwanghoon.

In: Advances in Neural Information Processing Systems, Vol. 2018-December, 01.01.2018, pp. 6126-6136.

Research output: Contribution to journal › Conference article

TY  - JOUR
T1  - Recurrent transformer networks for semantic correspondence
AU  - Kim, Seungryong
AU  - Lin, Stephen
AU  - Jeon, Sangryul
AU  - Min, Dongbo
AU  - Sohn, Kwanghoon
PY  - 2018/1/1
Y1  - 2018/1/1
N2  - We present recurrent transformer networks (RTNs) for obtaining dense correspondences between semantically similar images. Our networks accomplish this through an iterative process of estimating spatial transformations between the input images and using these transformations to generate aligned convolutional activations. By directly estimating the transformations between an image pair, rather than employing spatial transformer networks to independently normalize each individual image, we show that greater accuracy can be achieved. This process is conducted in a recursive manner to refine both the transformation estimates and the feature representations. In addition, a technique is presented for weakly-supervised training of RTNs that is based on a proposed classification loss. With RTNs, state-of-the-art performance is attained on several benchmarks for semantic correspondence.
AB  - We present recurrent transformer networks (RTNs) for obtaining dense correspondences between semantically similar images. Our networks accomplish this through an iterative process of estimating spatial transformations between the input images and using these transformations to generate aligned convolutional activations. By directly estimating the transformations between an image pair, rather than employing spatial transformer networks to independently normalize each individual image, we show that greater accuracy can be achieved. This process is conducted in a recursive manner to refine both the transformation estimates and the feature representations. In addition, a technique is presented for weakly-supervised training of RTNs that is based on a proposed classification loss. With RTNs, state-of-the-art performance is attained on several benchmarks for semantic correspondence.
UR  - http://www.scopus.com/inward/record.url?scp=85064841762&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85064841762&partnerID=8YFLogxK
M3  - Conference article
AN  - SCOPUS:85064841762
VL  - 2018-December
SP  - 6126
EP  - 6136
JO  - Advances in Neural Information Processing Systems
JF  - Advances in Neural Information Processing Systems
SN  - 1049-5258
ER  -