Existing deep trackers mainly use convolutional neural networks pre-trained for generic object recognition as feature representations. Despite demonstrated success on numerous vision tasks, pre-trained deep features contribute less to visual tracking than they do to object recognition. The key issue is that in visual tracking the targets of interest can belong to arbitrary object classes with arbitrary forms. As such, pre-trained deep features are less effective at modeling these arbitrary targets and distinguishing them from the background. In this paper, we propose a novel scheme to learn target-aware features, which recognize targets undergoing significant appearance variations better than pre-trained deep features do. To this end, we develop a regression loss and a ranking loss to guide the generation of target-active and scale-sensitive features. We identify the importance of each convolutional filter according to the back-propagated gradients and select the target-aware features based on activations to represent the targets. The target-aware features are integrated with a Siamese matching network for visual tracking. Extensive experimental results show that the proposed algorithm performs favorably against state-of-the-art methods in terms of accuracy and speed.
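The gradient-based filter-selection step described above can be sketched as follows. This is a minimal NumPy illustration of the general idea (score each convolutional filter by the back-propagated gradients of a loss with respect to its feature map, then keep the highest-scoring channels), not the authors' implementation; the function names, the use of global average pooling over gradient maps, and the top-k selection rule are assumptions for illustration.

```python
import numpy as np

def filter_importance(grads):
    # grads: gradients of a loss w.r.t. the feature maps, shape (C, H, W).
    # Score each filter by global average pooling over its gradient map
    # (an illustrative choice; the paper derives scores from its
    # regression and ranking losses).
    return grads.mean(axis=(1, 2))

def select_target_aware(features, grads, k):
    # Keep the k channels whose importance magnitude is largest;
    # the selected channels form the target-aware representation.
    imp = np.abs(filter_importance(grads))
    idx = np.argsort(imp)[::-1][:k]
    return features[idx], idx

# Toy example: 8 filters, only channels 2, 5, and 7 receive gradient.
features = np.random.rand(8, 4, 4)
grads = np.zeros((8, 4, 4))
grads[2], grads[5], grads[7] = 1.0, -3.0, 0.5
selected, idx = select_target_aware(features, grads, k=2)
# Channels 5 and 2 carry the largest gradient magnitudes.
```

The selected channels would then be cross-correlated against the search region by the Siamese matching network; that matching stage is omitted here.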
Title of host publication: Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Publisher: IEEE Computer Society
Number of pages: 10
Publication status: Published - Jun 2019
Event: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States
Duration: 16 Jun 2019 → 20 Jun 2019
Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Conference: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Period: 16 Jun 2019 → 20 Jun 2019
Bibliographical note
Funding Information:
This work is supported in part by the NSFC (No. 61672183), the NSF of Guangdong Province (No. 2015A030313544), the Shenzhen Research Council (No. JCYJ20170413104556946, JCYJ20170815113552036, JCYJ20160226201453085), the Shenzhen Medical Biometrics Perception and Analysis Engineering Laboratory, the National Key Research and Development Program of China (2016YFB1001003), STCSM (18DZ1112300), the NSF CAREER Grant No. 1149783, and gifts from Adobe, Verisk, and NEC. Xin Li is supported by a scholarship from the China Scholarship Council (CSC).
© 2019 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition