Abstract
The key to fine-grained image recognition is learning discriminative features that capture subtle visual cues. The triplet objective function is well suited to this task because it captures the semantic similarity between images. However, triplet loss requires many tuples containing hard negative samples, and obtaining them is costly. To alleviate this problem, we propose a new framework that generates features of hard negative samples. The proposed framework consists of three stages: learning part-wise features, enriching refined hard negative samples, and fine-grained image recognition. Our method achieves state-of-the-art performance on the CUB-200-2011, Stanford Cars, FGVC-Aircraft, and DeepFashion datasets. Extensive experiments further demonstrate that each stage contributes to the final recognition performance.
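To make the role of hard negatives concrete, the following is a minimal sketch of the standard triplet margin loss with a simple in-batch hardest-negative search. It is not the framework described in the abstract, which generates hard negative features rather than mining them. The function names, the margin of 0.2, and the toy 2-D embeddings are all assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def hardest_negative(anchor, anchor_label, features, labels):
    """Return the different-class feature that lies closest to the anchor."""
    candidates = [f for f, l in zip(features, labels) if l != anchor_label]
    return min(candidates, key=lambda f: euclidean(anchor, f))


# Toy 2-D embeddings (made-up values) for two classes, 0 and 1.
features = [[0.0, 0.0], [0.0, 1.0], [3.0, 4.0], [0.6, 0.8]]
labels = [0, 0, 1, 1]

anchor, positive = features[0], features[1]
negative = hardest_negative(anchor, labels[0], features, labels)

# The hard negative sits about as close to the anchor as the positive does,
# so it yields a non-zero loss; the distant negative yields zero loss.
loss = triplet_loss(anchor, positive, negative)
easy_loss = triplet_loss(anchor, positive, features[2])
```

In this toy batch the distant negative contributes no training signal, while the nearby one does. That is why triplet training depends on a steady supply of hard negatives, which is the cost the abstract refers to.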
Original language | English |
---|---|
Pages (from-to) | 374-382 |
Number of pages | 9 |
Journal | Neurocomputing |
Volume | 439 |
DOIs | |
Publication status | Published - 2021 Jun 7 |
Bibliographical note
Funding Information: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1A2C2003760) and Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01361, Artificial Intelligence Graduate School Program (YONSEI UNIVERSITY)).
Publisher Copyright:
© 2020 Elsevier B.V.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Cognitive Neuroscience
- Artificial Intelligence