Abstract
We present Hierarchical Memory Matching Network (HMMN) for semi-supervised video object segmentation. Based on a recent memory-based method [33], we propose two advanced memory read modules that enable us to perform memory reading in multiple scales while exploiting temporal smoothness. We first propose a kernel guided memory matching module that replaces the non-local dense memory read, commonly adopted in previous memory-based methods. The module imposes the temporal smoothness constraint in the memory read, leading to accurate memory retrieval. More importantly, we introduce a hierarchical memory matching scheme and propose a top-k guided memory matching module in which memory read on a fine-scale is guided by that on a coarse-scale. With the module, we perform memory read in multiple scales efficiently and leverage both high-level semantic and low-level fine-grained memory features to predict detailed object masks. Our network achieves state-of-the-art performance on the validation sets of DAVIS 2016/2017 (90.8% and 84.7%) and YouTube-VOS 2018/2019 (82.6% and 82.5%), and test-dev set of DAVIS 2017 (78.6%). The source code and model are available online: https://github.com/Hongje/HMMN.
Original language | English |
---|---|
Title of host publication | Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 12869-12878 |
Number of pages | 10 |
ISBN (Electronic) | 9781665428125 |
DOIs | |
Publication status | Published - 2021 |
Event | 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021 - Virtual, Online, Canada Duration: 2021 Oct 11 → 2021 Oct 17 |
Publication series
Name | Proceedings of the IEEE International Conference on Computer Vision |
---|---|
ISSN (Print) | 1550-5499 |
Conference
Conference | 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021 |
---|---|
Country/Territory | Canada |
City | Virtual, Online |
Period | 21/10/11 → 21/10/17 |
Bibliographical note
Funding Information:Acknowledgement. This work was supported by the Industry Core Technology Development Project, 20005062, Development of Artificial Intelligence Robot Autonomous Navigation Technology for Agile Movement in Crowded Space, funded by the Ministry of Trade, industry & Energy (MOTIE, Republic of Korea).
Publisher Copyright:
© 2021 IEEE
All Science Journal Classification (ASJC) codes
- Software
- Computer Vision and Pattern Recognition