Recently developed regularization techniques improve a network's generalization by considering only the global context. As a result, the network tends to focus on a few of the most discriminative subregions of an image to maximize prediction accuracy, which makes it sensitive to unseen or noisy data. To address this disadvantage, we introduce the concept of local context mapping by predicting patch-level labels and combine it with a method of local data augmentation by grid-based mixing, called GridMix. Through our analysis of intermediate representations, we show that GridMix can effectively regularize the network model. Finally, our evaluation results indicate that GridMix outperforms state-of-the-art techniques in classification and adversarial robustness, and that it achieves comparable performance in weakly supervised object localization.
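The grid-based mixing and patch-level labels described above can be illustrated with a minimal NumPy sketch. The abstract does not specify how cells are selected or how labels are weighted, so everything here is an assumption made for illustration: the function name `grid_mix`, the parameters `grid` and `p`, the independent per-cell random choice, and the use of the area fraction `lam` as the global label weight are hypothetical and may differ from the authors' method.

```python
import numpy as np

def grid_mix(img_a, img_b, label_a, label_b, grid=4, p=0.5, rng=None):
    """Mix two (H, W, C) images cell by cell on a grid x grid layout.

    Hypothetical sketch: each cell is taken from img_b with probability p
    (an assumed selection rule, not taken from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img_a.shape[:2]
    if img_a.shape != img_b.shape:
        raise ValueError("images must have the same shape")
    if h % grid or w % grid:
        raise ValueError("image height and width must be divisible by grid")
    cell_h, cell_w = h // grid, w // grid

    # True where a grid cell is copied from img_b, False where it stays img_a.
    cell_from_b = rng.random((grid, grid)) < p

    # Expand the cell-level mask to pixel resolution.
    pixel_mask = cell_from_b.repeat(cell_h, axis=0).repeat(cell_w, axis=1)
    mixed = np.where(pixel_mask[..., None], img_b, img_a)

    # Patch-level targets: one class label per grid cell (local context).
    patch_labels = np.where(cell_from_b, label_b, label_a)

    # Global target weight: fraction of the image area that comes from img_a.
    lam = 1.0 - float(cell_from_b.mean())
    return mixed, patch_labels, lam

if __name__ == "__main__":
    a = np.zeros((8, 8, 3), dtype=np.float32)
    b = np.ones((8, 8, 3), dtype=np.float32)
    mixed, patch_labels, lam = grid_mix(a, b, label_a=0, label_b=1, grid=4)
    print(mixed.shape, patch_labels.shape, lam)
```

In a training loop under these assumptions, `lam` would weight the two image-level labels in the global loss, while `patch_labels` would supervise a separate patch-level prediction head.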
Bibliographical note
Funding Information:
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the MSIP (NRF-2019R1A2C2006123, 2020R1A4A1016619), and by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (2020-0-01361, Artificial Intelligence Graduate School Program (YONSEI UNIVERSITY)). This work was also supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2016-0-00288) supervised by the IITP, and by an IITP grant funded by the Korea government (MSIP) (No. 2018-0-00198, Object information extraction and real-to-virtual mapping based AR technology).
All Science Journal Classification (ASJC) codes
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence