Abstract
Semi-supervised video object segmentation (VOS) aims to densely track designated target objects throughout a video. One of the main challenges in this task is the presence of background distractors that appear similar to the target objects. We propose three novel strategies to suppress such distractors: 1) a spatio-temporally diversified template construction scheme that captures generalized properties of the target objects; 2) a learnable distance-scoring function that excludes spatially distant distractors by exploiting the temporal consistency between two consecutive frames; 3) a swap-and-attach augmentation that forces each object to have unique features by providing training samples containing entangled objects. On all public benchmark datasets, our model performs comparably to contemporary state-of-the-art approaches while running in real time. Qualitative results also demonstrate the superiority of our approach over existing methods. We believe our approach will be widely used in future VOS research. Code and models are available at https://github.com/suhwan-cho/TBD.
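The swap-and-attach augmentation can be illustrated with a minimal sketch: an object is cut out of one frame via its mask and attached onto another frame, so the resulting training samples contain entangled, mutually occluding objects. The function name and the toy frames below are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

def swap_and_attach(frame_a, mask_a, frame_b, mask_b):
    """Hypothetical sketch of swap-and-attach augmentation: attach each
    frame's object onto the other frame, so the augmented samples contain
    entangled objects that partially occlude each other."""
    aug_a = frame_a.copy()
    aug_b = frame_b.copy()
    # Paste B's object pixels onto frame A, and vice versa.
    aug_a[mask_b] = frame_b[mask_b]
    aug_b[mask_a] = frame_a[mask_a]
    # The attached object occludes the original target where they overlap,
    # so the target masks of the receiving frames are updated accordingly.
    new_mask_a = mask_a & ~mask_b
    new_mask_b = mask_b & ~mask_a
    return aug_a, new_mask_a, aug_b, new_mask_b

# Toy example: 4x4 grayscale frames, each with a single-pixel "object".
fa = np.zeros((4, 4), np.uint8); fa[1, 1] = 200
fb = np.zeros((4, 4), np.uint8); fb[2, 2] = 100
ma, mb = fa > 0, fb > 0
aug_a, nma, aug_b, nmb = swap_and_attach(fa, ma, fb, mb)
# aug_a now contains both objects; nma still marks only frame A's target.
```

In practice the attached object would be translated, scaled, or otherwise transformed before pasting; the fixed positions here keep the sketch short.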
| Original language | English |
| --- | --- |
| Title of host publication | Computer Vision – ECCV 2022 - 17th European Conference, Proceedings |
| Editors | Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 446-462 |
| Number of pages | 17 |
| ISBN (Print) | 9783031200465 |
| DOIs | |
| Publication status | Published - 2022 |
| Event | 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel. Duration: 2022 Oct 23 → 2022 Oct 27 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
| --- | --- |
| Volume | 13682 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 17th European Conference on Computer Vision, ECCV 2022 |
| --- | --- |
| Country/Territory | Israel |
| City | Tel Aviv |
| Period | 2022 Oct 23 → 2022 Oct 27 |
Bibliographical note
Funding Information: This research was supported by the R&D program for Advanced Integrated-intelligence for Identification (AIID) through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (NRF-2018M3E3A1057289), and by the KIST Institutional Program (Project No. 2E31051-21-203).
Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science (all)