Proactive proximity monitoring with instance segmentation and unmanned aerial vehicle-acquired video-frame prediction

Seongdeok Bang, Yeji Hong, Hyoungkwan Kim

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)


To prevent struck-by accidents at construction sites, construction object movement should be predicted to avoid dangerous situations. This paper proposes a vision-based proactive proximity-monitoring method based on predictions of unmanned aerial vehicle (UAV)-acquired video frames. The method has three modules. The first module recognizes workers, excavators, and dump trucks on a pixel level. The second module predicts construction objects’ future locations and postures. The third module generates proactive safety information, including the future direction and speed of objects moving toward the worker. For the evaluation of the method, 1,940 images extracted from nine UAV-acquired videos recorded at real construction sites were used as the dataset. The method recognized construction objects with a mean average precision of 94.32%, predicted the future frame after 1 s with an F-measure of 80.59%, and recorded mean proximity errors of 0.43, 0.91, and 1.22 m for the frames after 1, 2, and 3 s, respectively.
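The third module's proactive safety information (future direction, speed, and proximity of objects moving toward a worker) can be sketched from predicted positions alone. The following is a minimal illustration, not the paper's implementation: all function and parameter names, and the 5 m danger radius, are illustrative assumptions.

```python
import math

def proximity_alert(worker_xy, obj_now_xy, obj_pred_xy, dt=1.0, danger_radius=5.0):
    """Hypothetical sketch of proactive proximity monitoring.

    Given a worker's position and an equipment object's current and
    predicted positions (metres, image-plane or site coordinates),
    derive the object's speed and heading over the prediction horizon
    dt (seconds) and flag it if it is moving toward the worker and
    its predicted position falls inside danger_radius.
    """
    dx = obj_pred_xy[0] - obj_now_xy[0]
    dy = obj_pred_xy[1] - obj_now_xy[1]
    speed = math.hypot(dx, dy) / dt             # mean speed over the horizon, m/s
    heading = math.degrees(math.atan2(dy, dx))  # direction of travel, degrees
    now_dist = math.hypot(worker_xy[0] - obj_now_xy[0],
                          worker_xy[1] - obj_now_xy[1])
    future_dist = math.hypot(worker_xy[0] - obj_pred_xy[0],
                             worker_xy[1] - obj_pred_xy[1])
    approaching = future_dist < now_dist        # closing in on the worker
    return {
        "speed_mps": speed,
        "heading_deg": heading,
        "future_distance_m": future_dist,
        "alert": approaching and future_dist < danger_radius,
    }
```

For example, an excavator at (10, 0) m predicted to reach (4, 0) m in 1 s while a worker stands at the origin would be reported as approaching at 6 m/s with a predicted separation of 4 m, triggering an alert.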

Original language: English
Pages (from-to): 800-816
Number of pages: 17
Journal: Computer-Aided Civil and Infrastructure Engineering
Issue number: 6
Publication status: Published - Jun 2021

Bibliographical note

Funding Information:
This work was supported by National Research Foundation of Korea (NRF) grants funded by the Ministry of Science and ICT (No. 2018R1A2B2008600) and the Ministry of Education (No. 2018R1A6A1A08025348).

Publisher Copyright:
© 2021 Computer-Aided Civil and Infrastructure Engineering

All Science Journal Classification (ASJC) codes

  • Civil and Structural Engineering
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design
  • Computational Theory and Mathematics


