In real-world video editing applications, humans are arguably the most important objects. When editing videos of humans, efficiently tracking fine-grained masks and body joints is a fundamental requirement. In this paper, we propose a simple and efficient system for jointly tracking pose and segmenting high-quality masks for all humans in a video. We design a pipeline that tracks pose globally and segments fine-grained masks locally. Specifically, CenterTrack is first employed to track human poses by viewing the whole scene, and the proposed local segmentation network then uses the pose information as a powerful query to carry out high-quality segmentation. Furthermore, we adopt a lightweight MLP-Mixer layer within the segmentation network that efficiently propagates the query pose throughout the region of interest with minimal overhead. For evaluation, we collect a new benchmark called KineMask, which includes various appearances and actions. The experimental results demonstrate that our method achieves superior fine-grained segmentation performance. Moreover, it runs at 33 fps, achieving a strong balance of speed and accuracy compared to the prevailing online Video Instance Segmentation methods.
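To make the role of the MLP-Mixer layer more concrete, the following is a minimal NumPy sketch of a generic Mixer layer: token mixing followed by channel mixing, each with a residual connection. It is not the authors' implementation. The function names, the layer sizes, the random weights, and the use of token 0 as a stand-in for the pose query are all assumptions made for illustration. The sketch only shows why token mixing lets information from one token reach every other token in the region of interest after a single layer.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize each token (row) across its channels; no learned scale or shift."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def gelu(x):
    """Tanh approximation of the GELU activation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))


def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron applied along the last axis of x."""
    return gelu(x @ w1 + b1) @ w2 + b2


def init_params(num_tokens, channels, token_hidden, channel_hidden, seed=0):
    """Random weights for illustration only (hypothetical sizes and scale)."""
    rng = np.random.default_rng(seed)

    def dense(n_in, n_out):
        return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

    return {
        "token": dense(num_tokens, token_hidden) + dense(token_hidden, num_tokens),
        "channel": dense(channels, channel_hidden) + dense(channel_hidden, channels),
    }


def mixer_layer(x, params):
    """x has shape (num_tokens, channels); the output has the same shape."""
    # Token mixing: transpose so the MLP acts along the token axis.
    x = x + mlp(layer_norm(x).T, *params["token"]).T
    # Channel mixing: the MLP acts on each token's channels independently.
    x = x + mlp(layer_norm(x), *params["channel"])
    return x


# Hypothetical setup: 16 tokens covering a region of interest, 8 channels each.
rng = np.random.default_rng(1)
tokens = rng.normal(size=(16, 8))
params = init_params(num_tokens=16, channels=8, token_hidden=32, channel_hidden=32)

out = mixer_layer(tokens, params)

# Change only token 0 (assumed here to carry the pose query) and rerun.
perturbed = tokens.copy()
perturbed[0] = rng.normal(size=8)
out_perturbed = mixer_layer(perturbed, params)

# Token mixing spreads the change in token 0 to all other tokens.
max_change_elsewhere = np.abs(out[1:] - out_perturbed[1:]).max()
print(out.shape, max_change_elsewhere > 0.0)
```

In this sketch, a single token-mixing step costs two small matrix products over the token axis, which is one plausible reason such a layer adds little overhead compared with attention-based propagation.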
|Title of host publication||Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022|
|Publisher||IEEE Computer Society|
|Number of pages||10|
|Publication status||Published - 2022|
|Event||2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022 - New Orleans, United States (Duration: 2022 Jun 19 → 2022 Jun 20)|
|Name||IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops|
|Conference||2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022|
|Period||2022 Jun 19 → 2022 Jun 20|
Bibliographical note
Publisher Copyright: © 2022 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering