Abstract
Most conventional state-of-the-art methods for video analysis achieve outstanding performance by combining two or more different inputs, e.g. an RGB image, a motion image, or an audio signal, in a two-stream manner. Although these approaches perform well, they implicitly treat every feature as equally important for classifying the video. This obscures the nature of each class: every class depends on different levels of information from different features. To incorporate the nature of each class, we present class nature specific fusion, which combines the features with different weights to obtain the optimal result for each class. In this work, we first represent each frame-level video feature as a spectral image to train convolutional neural networks (CNNs) on the RGB and audio features. We then revise the conventional two-stream fusion method into a class nature specific one by combining the features with different weights for different classes. We evaluate our method on the Comprehensive Video Understanding in the Wild dataset to understand how each class responds to each feature in wild videos. Our experimental results not only show an advantage over conventional two-stream fusion, but also illustrate the correlation between the two features, RGB and audio signal, for each class.
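The contrast between conventional two-stream fusion and class-specific fusion can be illustrated with a minimal sketch. The abstract does not state the exact fusion rule, so the weighted-sum form below is an assumption, and the function names, the weight vector `w_per_class`, and the toy scores are all hypothetical and not taken from the paper.

```python
import numpy as np

def fuse_uniform(p_rgb, p_audio, w=0.5):
    """Conventional late fusion: a single weight shared by every class."""
    return w * p_rgb + (1.0 - w) * p_audio

def fuse_class_specific(p_rgb, p_audio, w_per_class):
    """Class-specific late fusion (assumed form): one RGB weight per class.

    p_rgb, p_audio: arrays of shape (num_clips, num_classes) holding the
    per-class scores of the RGB stream and the audio stream.
    w_per_class: array of shape (num_classes,) with values in [0, 1];
    the audio stream receives the complementary weight 1 - w.
    """
    w = np.asarray(w_per_class, dtype=float)
    return w * p_rgb + (1.0 - w) * p_audio  # broadcasts over the clip axis

# Toy scores for 2 clips and 3 classes (made-up numbers for illustration).
p_rgb = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
p_audio = np.array([[0.2, 0.6, 0.2],
                    [0.3, 0.5, 0.2]])

# Hypothetical weights: class 0 relies mostly on RGB, class 1 mostly on
# audio, class 2 on both streams equally.
w_per_class = np.array([0.9, 0.2, 0.5])

fused = fuse_class_specific(p_rgb, p_audio, w_per_class)
pred = fused.argmax(axis=1)  # predicted class index for each clip
```

Setting every entry of `w_per_class` to the same value recovers the uniform case, which makes the conventional scheme a special case of the class-specific one under this assumed form. In practice the per-class weights would have to be chosen from data, for example on a validation split; how the paper selects them is not described in the abstract.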
Original language | English |
---|---|
Title of host publication | CoVieW 2018 - Proceedings of the 1st Workshop and Challenge on Comprehensive Video Understanding in the Wild, co-located with MM 2018 |
Publisher | Association for Computing Machinery, Inc |
Pages | 27-30 |
Number of pages | 4 |
ISBN (Electronic) | 9781450359764 |
DOIs | |
Publication status | Published - 2018 Oct 15 |
Event | 1st Workshop and Challenge on Comprehensive Video Understanding in the Wild, CoVieW 2018, in conjunction with ACM Multimedia, MM 2018 - Seoul, Korea, Republic of; Duration: 2018 Oct 22 → … |
Publication series
Name | CoVieW 2018 - Proceedings of the 1st Workshop and Challenge on Comprehensive Video Understanding in the Wild, co-located with MM 2018 |
---|
Other
Other | 1st Workshop and Challenge on Comprehensive Video Understanding in the Wild, CoVieW 2018, in conjunction with ACM Multimedia, MM 2018 |
---|---|
Country/Territory | Korea, Republic of |
City | Seoul |
Period | 18/10/22 → … |
Bibliographical note
Funding Information: This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT (NRF-2017M3C4A7069370).
Publisher Copyright:
© 2018 Association for Computing Machinery.
All Science Journal Classification (ASJC) codes
- Computer Science(all)
- Health Informatics
- Media Technology