Weakly supervised object localization (WSOL) methods utilize the internal feature responses of a classifier trained only on image-level labels. Classifiers tend to focus on the most discriminative part of the target object, instead of considering its full extent. Adversarial erasing (AE) techniques have been proposed to ameliorate this problem. These techniques erase the most discriminative part during training, thereby encouraging the classifiers to learn the less discriminative parts of the object. Despite the success of AE-based methods, we have observed that the hyperparameters fail to generalize across model architectures and datasets. Therefore, new sets of hyperparameters must be determined for each architecture and dataset. The selection of hyperparameters frequently requires strong supervision (e.g., pixel-level annotations or human inspection). Because WSOL is premised on the assumption that such strong supervision is absent, the applicability of AE-based methods is limited. In this paper, we propose the region-based dropout with attention prior (RDAP) algorithm, which features hyperparameter transferability. We combined AE with regional dropout algorithms that provide greater stability against the selection of hyperparameters. We empirically confirmed that the RDAP method achieved state-of-the-art localization accuracy on four architectures, namely VGG-GAP, InceptionV3, ResNet-50 SE, and PreResNet-18, and two datasets, namely CUB-200-2011 and ImageNet-1k, with a single set of hyperparameters.
|Publication status||Published - 2021 Aug|
Bibliographical notePublisher Copyright:
All Science Journal Classification (ASJC) codes
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence