Despite significant progress in machine learning, pedestrian detection in the real world is still regarded as a challenging problem, limited by occluded appearances, cluttered backgrounds, and poor visibility at night. This has motivated detection approaches that use multi-spectral sensors, such as color and thermal cameras, which can complement each other. In this paper, we propose a novel sensor fusion framework for detecting pedestrians even in challenging real-world environments. We design a convolutional neural network (CNN) architecture consisting of a three-branch detection model whose branches take different modalities as inputs. Unlike existing methods, we consider the detection probabilities from every modality in a unified CNN framework and selectively use them through a channel weighting fusion (CWF) layer to maximize detection performance. An accumulated probability fusion (APF) layer is also introduced to combine probabilities from different modalities at the proposal level. We formulate these sub-networks as a unified network, so that the whole network can be trained end to end. Our extensive evaluation demonstrates that the proposed method outperforms state-of-the-art methods on the challenging KAIST, CVC-14, and DIML multi-spectral pedestrian datasets.
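To make the two fusion ideas concrete, the following is a minimal sketch of how per-branch detection probabilities for a single proposal might be combined. It is an illustration under stated assumptions, not the paper's CWF or APF layers: the abstract does not specify their form, so the softmax-normalized branch weights and the noisy-OR accumulation used here are stand-ins, and the function names and `weight_logits` parameter are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def channel_weighted_fusion(branch_probs, weight_logits):
    """Weighted average of per-branch pedestrian probabilities.

    branch_probs:  one probability per modality branch for a single proposal.
    weight_logits: hypothetical learned scalars, one per branch (assumption;
                   the abstract does not say how the weights are produced).
    """
    if len(branch_probs) != len(weight_logits):
        raise ValueError("need exactly one weight per branch")
    weights = softmax(weight_logits)
    return sum(w * p for w, p in zip(weights, branch_probs))


def accumulated_probability_fusion(branch_probs):
    """Noisy-OR accumulation across branches (assumed stand-in for APF).

    The fused score is the probability that at least one branch fires,
    treating the branches as independent.
    """
    miss = 1.0
    for p in branch_probs:
        miss *= 1.0 - p
    return 1.0 - miss


if __name__ == "__main__":
    # Three branches; the values are made up for illustration only.
    probs = [0.2, 0.8, 0.5]
    print(channel_weighted_fusion(probs, [0.0, 0.0, 0.0]))
    print(accumulated_probability_fusion(probs))
```

With equal weight logits the weighted fusion reduces to a plain average, while a large logit on one branch lets that modality dominate, for example a thermal branch at night. In an end-to-end setting such weights would be learned jointly with the detectors rather than set by hand.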
All Science Journal Classification (ASJC) codes
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence