Spatial-Channel Transformer for Scene Recognition

Seunghyun Baik, Hongje Seong, Youngjo Lee, Euntai Kim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Despite the great success of attention mechanisms in object recognition, scene recognition remains a challenging problem, because the discriminative regions of a scene image are often not evident. For example, a tree can be a cue for recognizing a scene, but it cannot be the only cue: several scene categories (e.g., mountain, marsh, and river) can contain a tree. Thus, the image as a whole, rather than specific regions, sometimes needs to be considered for scene recognition. To solve this problem, we propose the Spatial-Channel Transformer (SC-Transformer). The SC-Transformer is a simple yet effective module that uses a new attention mechanism, which weighs the relative importance of the spatial and channel domains for a given scene image. If the given scene image should be considered only within some specific regions, the SC-Transformer turns off the channel attention, and vice versa. Furthermore, the attention mechanism used in our proposed method improves on previous approaches. Previous spatial and channel attention mechanisms were designed in a sequential or parallel manner. These mechanisms eventually combine spatial and channel attention, so the two may often interfere with each other. In contrast to previous works, we present a new mechanism that considers spatial and channel attention simultaneously. We validate our approach on a large-scale scene recognition dataset and outperform the previous state-of-the-art spatial-channel attention mechanism. Experimental results demonstrate the efficacy of our attention mechanism for scene recognition.
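
The abstract does not give the SC-Transformer's formulation, so the following is only a generic illustration of the idea of switching between spatial and channel attention, not the authors' method. It assumes a single feature map of shape (C, H, W) and a scalar `gate` in [0, 1]; the function names, the pooling choices, and the gate itself are hypothetical and chosen for brevity.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def spatial_attention(x):
    """x: (C, H, W) -> (1, H, W) weights that sum to 1 over all positions."""
    _, h, w = x.shape
    scores = x.mean(axis=0).reshape(-1)  # one score per spatial location
    return softmax(scores).reshape(1, h, w)


def channel_attention(x):
    """x: (C, H, W) -> (C, 1, 1) weights in (0, 1), one per channel."""
    pooled = x.mean(axis=(1, 2))  # global average pooling per channel
    return sigmoid(pooled).reshape(-1, 1, 1)


def gated_attention(x, gate):
    """Blend the two attention maps with a hypothetical scalar gate.

    gate = 1.0 keeps only spatial attention (channel attention is off);
    gate = 0.0 keeps only channel attention (spatial attention is off).
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    _, h, w = x.shape
    # Rescale so that uniform spatial attention corresponds to a weight of 1.
    spatial = spatial_attention(x) * (h * w)
    channel = channel_attention(x)
    # Broadcasting (1, H, W) against (C, 1, 1) yields (C, H, W) weights.
    weights = gate * spatial + (1.0 - gate) * channel
    return x * weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 3, 3))
    print(gated_attention(features, gate=0.5).shape)
```

In this sketch the gate is supplied by the caller; a learned module would have to predict it from the image, and how the SC-Transformer actually does so is described in the paper rather than in this abstract.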

Original language: English
Title of host publication: 2022 International Joint Conference on Neural Networks, IJCNN 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728186719
DOIs
Publication status: Published - 2022
Event: 2022 International Joint Conference on Neural Networks, IJCNN 2022 - Padua, Italy
Duration: 2022 Jul 18 to 2022 Jul 23

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2022-July

Conference

Conference: 2022 International Joint Conference on Neural Networks, IJCNN 2022
Country/Territory: Italy
City: Padua
Period: 22/7/18 to 22/7/23

Bibliographical note

Funding Information:
This research was supported in part by the KIST Institutional Program (Project No. 2E31051-21-204). This research was also supported in part by the Yonsei Signature Research Cluster Program of 2022 (2022-22-0002).

Publisher Copyright:
© 2022 IEEE.

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
