Abstract
We propose a novel discrete Fourier transform (DFT)-based pooling layer for convolutional neural networks. DFT magnitude pooling replaces the traditional max/average pooling layer between the convolution and fully-connected layers. Based on the shift theorem of the Fourier transform, it retains translation invariance while preserving shape information (that is, it remains aware of shape differences). Because it handles image misalignment while keeping important structural information in the pooling stage, DFT magnitude pooling improves classification accuracy significantly. In addition, we propose the DFT+ method for ensemble networks using the middle convolution layer outputs. The proposed methods are extensively evaluated on various classification tasks using the ImageNet, CUB 2010-2011, MIT Indoors, Caltech 101, FMD and DTD datasets. AlexNet, VGG-VD 16, Inception-v3, and ResNet are used as the base networks, upon which the DFT and DFT+ methods are implemented. Experimental results show that the proposed methods improve the classification performance in all networks and datasets.
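The shift-theorem argument in the abstract can be checked numerically. The following is a minimal sketch, not the authors' implementation: it uses a naive 2-D DFT from the Python standard library on a hypothetical 4x4 feature map, and it assumes circular (wrap-around) shifts, for which the magnitude invariance is exact. The names `dft2`, `dft_magnitude`, `circular_shift`, `feature_map`, and `other_shape` are illustrative only.

```python
import cmath

def dft2(x):
    """Naive 2-D DFT of a real-valued H x W grid given as a list of lists."""
    h, w = len(x), len(x[0])
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            s = 0j
            for m in range(h):
                for n in range(w):
                    angle = -2 * cmath.pi * (u * m / h + v * n / w)
                    s += x[m][n] * cmath.exp(1j * angle)
            out[u][v] = s
    return out

def dft_magnitude(x):
    """Magnitude of every DFT coefficient; the phase (which encodes position) is dropped."""
    return [[abs(c) for c in row] for row in dft2(x)]

def circular_shift(x, dy, dx):
    """Shift the grid down by dy rows and right by dx columns, wrapping around."""
    h, w = len(x), len(x[0])
    return [[x[(m - dy) % h][(n - dx) % w] for n in range(w)] for m in range(h)]

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two grids."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

# Hypothetical feature map and a translated copy of it.
feature_map = [
    [0.0, 1.0, 0.0, 0.0],
    [2.0, 3.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
shifted = circular_shift(feature_map, 1, 2)

# A different shape that global max pooling cannot tell apart (same maximum, 3.0).
other_shape = [
    [3.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

mag = dft_magnitude(feature_map)
shift_diff = max_abs_diff(mag, dft_magnitude(shifted))
shape_diff = max_abs_diff(mag, dft_magnitude(other_shape))

print("translation changes the magnitude by:", shift_diff)  # numerically ~0
print("a different shape changes it by:", shape_diff)       # clearly non-zero
```

In this toy setting the magnitude is unchanged by the translation, whereas the two inputs with the same maximum value produce different magnitudes. That is the combination of translation invariance and shape awareness the abstract describes.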
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings |
Editors | Vittorio Ferrari, Cristian Sminchisescu, Yair Weiss, Martial Hebert |
Publisher | Springer Verlag |
Pages | 89-104 |
Number of pages | 16 |
ISBN (Print) | 9783030012632 |
Publication status | Published - 2018 |
Event | 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany Duration: 2018 Sep 8 → 2018 Sep 14 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11218 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Other
Other | 15th European Conference on Computer Vision, ECCV 2018 |
---|---|
Country/Territory | Germany |
City | Munich |
Period | 2018 Sep 8 → 2018 Sep 14 |
Bibliographical note
Funding Information: This work was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2017R1A6A3A11031193); the Next-Generation Information Computing Development Program through the NRF, funded by the Ministry of Science, ICT (NRF-2017M3C4A7069366); and the NSF CAREER Grant #1149783.
Publisher Copyright:
© 2018, Springer Nature Switzerland AG.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science (all)