New feature-level video classification via temporal attention model

Hongje Seong, Suhan Woo, Junhyuk Hyun, Hyunbae Chang, Suhyeon Lee, Euntai Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

CoVieW 2018 is a new challenge that aims at simultaneous scene and action recognition for untrimmed video [1]. In the challenge, frame-level video features extracted by a pre-trained deep convolutional neural network (CNN) are provided for video-level classification. In this paper, a new approach to video-level classification is proposed. The proposed method focuses on analysis in the temporal domain, and a temporal attention model is developed. To compensate for the differing lengths of videos, a temporal padding method is also developed to unify video lengths. Further, data augmentation is performed to improve validation accuracy. Finally, on the train/validation split of the CoVieW 2018 dataset, the proposed method achieves 95.53% scene accuracy and 87.17% action accuracy using the temporal attention model, nonzero padding, and data augmentation. The top-1 Hamming score is the standard metric of the CoVieW 2018 challenge, and the proposed method obtains 91.35%.
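
As a rough illustration of the pipeline described above, the sketch below shows one way feature-level classification with nonzero temporal padding and temporal attention pooling could be wired together in PyTorch. It is not the authors' implementation: the padding value, the feature dimensionality (FEAT_DIM), the unified length (MAX_FRAMES), the class counts, and the single-layer attention scorer are assumptions standing in for details the abstract does not specify.

import torch
import torch.nn as nn
import torch.nn.functional as F

MAX_FRAMES = 300   # assumed unified temporal length after padding
FEAT_DIM = 1024    # assumed dimensionality of the provided frame-level features

def pad_features(feats, pad_value=1e-3):
    # Unify video length by padding the time axis with a small nonzero constant
    # (a stand-in for the paper's "nonzero padding"; the actual value is not
    # given in the abstract). feats: (T, FEAT_DIM) features of one video.
    padded = feats.new_full((MAX_FRAMES, FEAT_DIM), pad_value)
    t = min(feats.size(0), MAX_FRAMES)
    padded[:t] = feats[:t]
    return padded

class TemporalAttentionClassifier(nn.Module):
    # Scores each frame, normalizes the scores over time with a softmax, and
    # pools the frame features with those weights into one video-level
    # descriptor that feeds separate scene and action heads.
    def __init__(self, feat_dim, num_scene, num_action):
        super().__init__()
        self.attn = nn.Linear(feat_dim, 1)           # per-frame attention score
        self.scene_head = nn.Linear(feat_dim, num_scene)
        self.action_head = nn.Linear(feat_dim, num_action)

    def forward(self, x):
        # x: (batch, MAX_FRAMES, FEAT_DIM)
        weights = F.softmax(self.attn(x), dim=1)     # (batch, MAX_FRAMES, 1)
        video_feat = (weights * x).sum(dim=1)        # (batch, FEAT_DIM)
        return self.scene_head(video_feat), self.action_head(video_feat)

# Example: classify one 120-frame video (placeholder class counts).
model = TemporalAttentionClassifier(FEAT_DIM, num_scene=10, num_action=20)
clip = pad_features(torch.randn(120, FEAT_DIM))
scene_logits, action_logits = model(clip.unsqueeze(0))

In this reading, the softmax over time lets informative frames dominate the pooled video-level descriptor, which is the usual intuition behind attention-based temporal pooling.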

Original language: English
Title of host publication: CoVieW 2018 - Proceedings of the 1st Workshop and Challenge on Comprehensive Video Understanding in the Wild, co-located with MM 2018
Publisher: Association for Computing Machinery, Inc
Pages: 31-34
Number of pages: 4
ISBN (Electronic): 9781450359764
DOIs: https://doi.org/10.1145/3265987.3265990
Publication status: Published - 2018 Oct 15
Event: 1st Workshop and Challenge on Comprehensive Video Understanding in the Wild, CoVieW 2018, in conjunction with ACM Multimedia, MM 2018 - Seoul, Korea, Republic of
Duration: 2018 Oct 22 → …



All Science Journal Classification (ASJC) codes

  • Computer Science (all)
  • Health Informatics
  • Media Technology

Cite this

Seong, H., Woo, S., Hyun, J., Chang, H., Lee, S., & Kim, E. (2018). New feature-level video classification via temporal attention model. In CoVieW 2018 - Proceedings of the 1st Workshop and Challenge on Comprehensive Video Understanding in the Wild, co-located with MM 2018 (pp. 31-34). Association for Computing Machinery, Inc. https://doi.org/10.1145/3265987.3265990