2PESNet: Towards online processing of temporal action localization

Young Hwi Kim, Seonghyeon Nam, Seon Joo Kim

Research output: Contribution to journalArticlepeer-review

Abstract

Existing online video processing methods such as online action detection focus on a frame-level understanding for high responsiveness. However, it has a fundamental limitation in that it lacks instance-level understanding of videos, making it difficult to be applied to higher-level vision tasks. The instance-level action detection, known as Temporal Action Localization (TAL), have limitations when applying to the online settings. In this work, we introduce a new task that aims to detect action instances of videos in an online setting, named Online Temporal Action Localization (OnTAL). To tackle this problem, we propose a 2-Pass End/Start detection Network (2PESNet) that detects action instances by effectively finding the start and end of an action instance. Additionally, we propose a two-stage action end detection method to further improve the performance. Extensive experiments on THUMOS’14 and ActivityNet v1.3 demonstrate that our model is able to take both accuracy and responsiveness when predicting action instances from streaming videos.

Original languageEnglish
Article number108871
JournalPattern Recognition
Volume131
DOIs
Publication statusPublished - 2022 Nov

Bibliographical note

Funding Information:
This work was conducted by Center for Applied Research in Artificial Intelligence(CARAI) grant funded by Defense Acquisition Program Administration(DAPA) and Agency for Defense Development(ADD) (UD190031RD).

Publisher Copyright:
© 2022

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Fingerprint

Dive into the research topics of '2PESNet: Towards online processing of temporal action localization'. Together they form a unique fingerprint.

Cite this