In this paper, a novel spatial-temporal locality is proposed and unified via a discriminative dictionary learning framework for visual tracking. The locality is obtained by exploring the strong local correlations between the temporally obtained targets and their spatially distributed nearby background neighbors. It is formulated as a subspace model and exploited under a unified structure of discriminative dictionary learning with a subspace structure. Using the learned dictionary, the target and its background can be described and distinguished effectively through their sparse codes. As a result, the target is localized by integrating both the descriptive and the discriminative qualities. Extensive experiments on various challenging video sequences demonstrate the superior performance of the proposed algorithm over other state-of-the-art approaches.
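The idea of distinguishing target from background through sparse codes can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a fixed dictionary whose atoms are already labeled as target or background, whereas the paper learns the dictionary under a subspace structure. The solver (ISTA for an l1-regularized least-squares problem), the residual-based decision rule, and all function and parameter names (`sparse_code`, `class_residual`, `looks_like_target`, `lam`) are assumptions made for illustration only.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm: shrink each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(D, x, lam=0.1, n_iter=200):
    # ISTA for min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 (assumed objective).
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

def class_residual(D, a, mask, x):
    # Reconstruction error of x using only the atoms selected by mask.
    return np.linalg.norm(x - D[:, mask] @ a[mask])

def looks_like_target(D, is_target_atom, x, lam=0.1):
    # Hypothetical decision rule: a candidate is target-like if the target
    # atoms reconstruct it better than the background atoms do.
    a = sparse_code(D, x, lam)
    r_target = class_residual(D, a, is_target_atom, x)
    r_background = class_residual(D, a, ~is_target_atom, x)
    return r_target < r_background

# Toy dictionary: atoms 0-1 are labeled target, atoms 2-3 background.
D = np.eye(4)
is_target_atom = np.array([True, True, False, False])
candidate = np.array([1.0, 0.5, 0.0, 0.0])
print(looks_like_target(D, is_target_atom, candidate))
```

In this toy setting a candidate that lies in the span of the target atoms is reconstructed with a small residual by those atoms alone, which is the descriptive side; comparing that residual with the background residual is the discriminative side.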
Bibliographical note
Funding Information:
Manuscript received October 7, 2016; revised February 19, 2017, June 19, 2017, and September 28, 2017; accepted November 23, 2017. Date of publication December 4, 2017; date of current version December 27, 2017. This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 61132007, in part by the Kansas NASA EPSCoR Program under Grant KNEP-PDG-10-2017-KU, in part by the General Research Fund of the University of Kansas under Grant 2228901, in part by the Joint Fund of Civil Aviation Research by NSFC and Civil Aviation Administration under Grant U1533132, in part by the NSF CAREER under Grant 1149783, and in part by Gifts from Adobe and Nvidia. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Junsong Yuan. (Corresponding Author: Yao Sui.) Y. Sui is with the Harvard Medical School, Harvard University, Boston, MA 02115 USA (e-mail: email@example.com).
© 2017 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Graphics and Computer-Aided Design