In this paper, we address the problem of tracking an object in a video given its location in the first frame and no other information. Recently, a class of tracking techniques called tracking by detection has been shown to give promising results at real-time speeds. These methods train a discriminative classifier in an online manner to separate the object from the background. This classifier bootstraps itself by using the current tracker state to extract positive and negative examples from the current frame. Slight inaccuracies in the tracker can therefore lead to incorrectly labeled training examples, which degrade the classifier and can cause drift. In this paper, we show that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems and can therefore lead to a more robust tracker with fewer parameter tweaks. We propose a novel online MIL algorithm for object tracking that achieves superior results with real-time performance. We present thorough experimental results (both qualitative and quantitative) on a number of challenging video clips.
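The bootstrapping step and the MIL alternative described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how bags are built or scored, so the search radius, the noisy-OR bag model (a common choice in the MIL literature), and the names `crop_centers` and `noisy_or` are all assumptions made for illustration. The idea shown is that all patches near the current tracker location go into one positive bag, so a slightly misplaced tracker no longer forces a single wrong patch to be labeled positive.

```python
import math


def crop_centers(center, radius):
    """Hypothetical helper: integer (x, y) patch centers within `radius` of `center`.

    In a MIL-style tracker these locations would form one positive bag,
    instead of treating only `center` as the single positive example.
    """
    cx, cy = center
    r = int(math.floor(radius))
    return [
        (cx + dx, cy + dy)
        for dx in range(-r, r + 1)
        for dy in range(-r, r + 1)
        if dx * dx + dy * dy <= radius * radius
    ]


def noisy_or(instance_probs):
    """Assumed bag model: the bag is positive if at least one instance is positive.

    p(bag) = 1 - prod_j (1 - p_j), where p_j is the classifier's
    probability that instance j is the object.
    """
    prob_all_negative = 1.0
    for p in instance_probs:
        prob_all_negative *= 1.0 - p
    return 1.0 - prob_all_negative


if __name__ == "__main__":
    bag = crop_centers((10, 10), radius=1)
    # Placeholder instance probabilities; a real tracker would get these
    # from its online classifier evaluated on each patch.
    probs = [0.5 if loc == (10, 10) else 0.1 for loc in bag]
    print(len(bag), noisy_or(probs))
```

Under this model a bag scores high as soon as any one of its patches looks like the object, which is what lets the learner tolerate small localization errors in the tracker state.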
| Number of pages | 14 |
| Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
| Publication status | Published - 2011 |
Bibliographical note
Funding Information:
The authors would like to thank Kristin Branson, Piotr Dollár, David Ross, and the anonymous reviewers for valuable input. This research has been supported by US National Science Foundation (NSF) CAREER Grant #0448615, NSF IGERT Grant DGE-0333451, and US Office of Naval Research Grant #N00014-08-1-0638. Ming-Hsuan Yang is supported in part by a University of California Merced faculty start-up fund and a Google faculty award. Part of this work was performed while Boris Babenko and Ming-Hsuan Yang were at the Honda Research Institute, USA.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition
- Computational Theory and Mathematics
- Artificial Intelligence
- Applied Mathematics