Learning Spatial-Aware Regressions for Visual Tracking

Chong Sun, Dong Wang, Huchuan Lu, Ming Hsuan Yang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

47 Citations (Scopus)

Abstract

In this paper, we analyze the spatial information of deep features, and propose two complementary regressions for robust visual tracking. First, we propose a kernelized ridge regression model wherein the kernel value is defined as the weighted sum of similarity scores of all pairs of patches between two samples. We show that this model can be formulated as a neural network and thus can be efficiently solved. Second, we propose a fully convolutional neural network with spatially regularized kernels, through which the filter kernel corresponding to each output channel is forced to focus on a specific region of the target. Distance transform pooling is further exploited to determine the effectiveness of each output channel of the convolution layer. The outputs from the kernelized ridge regression model and the fully convolutional neural network are combined to obtain the ultimate response. Experimental results on two benchmark datasets validate the effectiveness of the proposed method.

Original languageEnglish
Title of host publicationProceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
PublisherIEEE Computer Society
Pages8962-8970
Number of pages9
ISBN (Electronic)9781538664209
DOIs
Publication statusPublished - 2018 Dec 14
Event31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 - Salt Lake City, United States
Duration: 2018 Jun 182018 Jun 22

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print)1063-6919

Conference

Conference31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
CountryUnited States
CitySalt Lake City
Period18/6/1818/6/22

Fingerprint

Neural networks
Convolution

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Sun, C., Wang, D., Lu, H., & Yang, M. H. (2018). Learning Spatial-Aware Regressions for Visual Tracking. In Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 (pp. 8962-8970). [8579032] (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). IEEE Computer Society. https://doi.org/10.1109/CVPR.2018.00934
Sun, Chong ; Wang, Dong ; Lu, Huchuan ; Yang, Ming Hsuan. / Learning Spatial-Aware Regressions for Visual Tracking. Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018. IEEE Computer Society, 2018. pp. 8962-8970 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).
@inproceedings{cf02d461515d41f8b094b1d57014544f,
title = "Learning Spatial-Aware Regressions for Visual Tracking",
abstract = "In this paper, we analyze the spatial information of deep features, and propose two complementary regressions for robust visual tracking. First, we propose a kernelized ridge regression model wherein the kernel value is defined as the weighted sum of similarity scores of all pairs of patches between two samples. We show that this model can be formulated as a neural network and thus can be efficiently solved. Second, we propose a fully convolutional neural network with spatially regularized kernels, through which the filter kernel corresponding to each output channel is forced to focus on a specific region of the target. Distance transform pooling is further exploited to determine the effectiveness of each output channel of the convolution layer. The outputs from the kernelized ridge regression model and the fully convolutional neural network are combined to obtain the ultimate response. Experimental results on two benchmark datasets validate the effectiveness of the proposed method.",
author = "Chong Sun and Dong Wang and Huchuan Lu and Yang, {Ming Hsuan}",
year = "2018",
month = "12",
day = "14",
doi = "10.1109/CVPR.2018.00934",
language = "English",
series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
publisher = "IEEE Computer Society",
pages = "8962--8970",
booktitle = "Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018",
address = "United States",

}

Sun, C, Wang, D, Lu, H & Yang, MH 2018, Learning Spatial-Aware Regressions for Visual Tracking. in Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018., 8579032, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE Computer Society, pp. 8962-8970, 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, United States, 18/6/18. https://doi.org/10.1109/CVPR.2018.00934

Learning Spatial-Aware Regressions for Visual Tracking. / Sun, Chong; Wang, Dong; Lu, Huchuan; Yang, Ming Hsuan.

Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018. IEEE Computer Society, 2018. p. 8962-8970 8579032 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - Learning Spatial-Aware Regressions for Visual Tracking

AU - Sun, Chong

AU - Wang, Dong

AU - Lu, Huchuan

AU - Yang, Ming Hsuan

PY - 2018/12/14

Y1 - 2018/12/14

N2 - In this paper, we analyze the spatial information of deep features, and propose two complementary regressions for robust visual tracking. First, we propose a kernelized ridge regression model wherein the kernel value is defined as the weighted sum of similarity scores of all pairs of patches between two samples. We show that this model can be formulated as a neural network and thus can be efficiently solved. Second, we propose a fully convolutional neural network with spatially regularized kernels, through which the filter kernel corresponding to each output channel is forced to focus on a specific region of the target. Distance transform pooling is further exploited to determine the effectiveness of each output channel of the convolution layer. The outputs from the kernelized ridge regression model and the fully convolutional neural network are combined to obtain the ultimate response. Experimental results on two benchmark datasets validate the effectiveness of the proposed method.

AB - In this paper, we analyze the spatial information of deep features, and propose two complementary regressions for robust visual tracking. First, we propose a kernelized ridge regression model wherein the kernel value is defined as the weighted sum of similarity scores of all pairs of patches between two samples. We show that this model can be formulated as a neural network and thus can be efficiently solved. Second, we propose a fully convolutional neural network with spatially regularized kernels, through which the filter kernel corresponding to each output channel is forced to focus on a specific region of the target. Distance transform pooling is further exploited to determine the effectiveness of each output channel of the convolution layer. The outputs from the kernelized ridge regression model and the fully convolutional neural network are combined to obtain the ultimate response. Experimental results on two benchmark datasets validate the effectiveness of the proposed method.

UR - http://www.scopus.com/inward/record.url?scp=85061751032&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061751032&partnerID=8YFLogxK

U2 - 10.1109/CVPR.2018.00934

DO - 10.1109/CVPR.2018.00934

M3 - Conference contribution

AN - SCOPUS:85061751032

T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

SP - 8962

EP - 8970

BT - Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018

PB - IEEE Computer Society

ER -

Sun C, Wang D, Lu H, Yang MH. Learning Spatial-Aware Regressions for Visual Tracking. In Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018. IEEE Computer Society. 2018. p. 8962-8970. 8579032. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). https://doi.org/10.1109/CVPR.2018.00934