Abstract
While single-image object detectors can be naively applied to videos frame by frame, their predictions are often temporally inconsistent. Moreover, the computation can be redundant, since neighboring frames are inherently similar to each other. In this work we propose to improve video object detection via temporal aggregation. Specifically, a detection model is applied to sparse keyframes to handle new objects, occlusions, and rapid motions. We then use real-time trackers to exploit temporal cues and track the detected objects in the remaining frames, which enhances efficiency and temporal coherence. Object status at the bounding-box level is propagated across frames and updated by our aggregation modules. For keyframe scheduling, we propose adaptive policies using reinforcement learning and simple heuristics. The proposed framework achieves state-of-the-art performance on the ImageNet VID 2015 dataset while running in real time on CPU. Extensive experiments show the effectiveness of our training strategies and justify the model designs.
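The detect-on-keyframes, track-in-between loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `detect_and_track`, the `detect` and `track` callables, and the fixed `keyframe_interval` are all hypothetical names introduced here. A fixed interval stands in for the keyframe scheduler, which the abstract describes as adaptive (reinforcement learning and heuristics), and the aggregation modules are not modeled at all.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def detect_and_track(
    frames: Sequence[object],
    detect: Callable[[object], List[Box]],
    track: Callable[[object, List[Box]], List[Box]],
    keyframe_interval: int = 10,
) -> Tuple[List[List[Box]], List[int]]:
    """Run `detect` on every `keyframe_interval`-th frame and `track` elsewhere.

    Assumption: a fixed interval replaces the adaptive scheduling policy.
    Returns the per-frame boxes and the indices of the keyframes.
    """
    if keyframe_interval < 1:
        raise ValueError("keyframe_interval must be at least 1")

    results: List[List[Box]] = []
    keyframes: List[int] = []
    boxes: List[Box] = []
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            # Keyframe: run the (expensive) detector from scratch.
            boxes = detect(frame)
            keyframes.append(i)
        else:
            # Non-keyframe: propagate the previous boxes with the tracker.
            boxes = track(frame, boxes)
        results.append(list(boxes))
    return results, keyframes


# Toy stand-ins (hypothetical): a frame is just its index, and the single
# object moves one unit to the right per frame.
def toy_detect(frame: int) -> List[Box]:
    return [(float(frame), 0.0, float(frame) + 1.0, 1.0)]


def toy_track(frame: int, prev: List[Box]) -> List[Box]:
    return [(x1 + 1.0, y1, x2 + 1.0, y2) for (x1, y1, x2, y2) in prev]


results, keyframes = detect_and_track(
    range(5), toy_detect, toy_track, keyframe_interval=3
)
```

In this toy setting the detector runs only on frames 0 and 3, and the tracker fills in the remaining frames, which is the source of the efficiency gain the abstract refers to.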
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings |
Editors | Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 160-177 |
Number of pages | 18 |
ISBN (Print) | 9783030585679 |
Publication status | Published - 2020 |
Event | 16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom; Duration: 2020 Aug 23 → 2020 Aug 28 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12359 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 16th European Conference on Computer Vision, ECCV 2020 |
---|---|
Country/Territory | United Kingdom |
City | Glasgow |
Period | 2020 Aug 23 → 2020 Aug 28 |
Bibliographical note
Funding Information: This work is supported in part by the NSF CAREER Grant #1149783.
Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science (all)