Abstract
We present a novel framework for contrastive learning of pixel-level representations using only unlabeled video. Without the need for ground-truth annotation, our method collects well-defined positive correspondences by measuring their confidence and well-defined negative ones by appropriately adjusting their hardness during training. This allows us to suppress the adverse impact of ambiguous matches and to prevent the trivial solutions that result from negative samples that are too hard or too easy. To accomplish this, we incorporate three different criteria, ranging from pixel-level to video-level matching confidence, into a bottom-up pipeline, and plan a curriculum that is aware of the current representation power to adapt the hardness of negative samples during training. With the proposed method, state-of-the-art performance is attained over the latest approaches on several video label propagation tasks.
Original language | English |
---|---|
Title of host publication | Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 |
Publisher | IEEE Computer Society |
Pages | 1034-1044 |
Number of pages | 11 |
ISBN (Electronic) | 9781665445092 |
Publication status | Published - 2021 |
Event | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 - Virtual, Online, United States. Duration: 2021 Jun 19 → 2021 Jun 25 |
Publication series
Name | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
---|---|
ISSN (Print) | 1063-6919 |
Conference
Conference | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 |
---|---|
Country/Territory | United States |
City | Virtual, Online |
Period | 2021 Jun 19 → 2021 Jun 25 |
Bibliographical note
Funding Information: This work was supported by an IITP grant funded by the Korea government (MSIT) (No. 2020-0-00056, To create AI systems that act appropriately and effectively in novel situations that occur in open worlds) and the Yonsei University Research Fund of 2021 (2021-22-0001).
Publisher Copyright:
© 2021 IEEE
All Science Journal Classification (ASJC) codes
- Software
- Computer Vision and Pattern Recognition