We present a novel human body co-segmentation method for unregistered multispectral (color and thermal) images that leverages CNNs. The main challenges for this task are the lack of alignment between color and thermal images and the absence of ground-truth human segmentation labels. To overcome these limitations, our key insight is to formulate a segmentation network for each modality that solves two sub-tasks, correspondence and classification, in a joint and iterative manner. We formulate the learning framework between multispectral images so that training labels for one modality are used to learn the network for the other modality. We estimate dense correspondences between multispectral image pairs using intermediate convolutional activations of the CNNs, and we perform human segmentation for each modality through conditional random field (CRF) optimization using unary and pairwise fusion. These two steps are formulated as an iterative framework, which enables the network to converge on an optimal solution. Experimental results show that our proposed method outperforms conventional state-of-the-art methods on the VAP benchmark, which consists of unregistered multispectral color and thermal images.
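The correspondence and label-transfer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain nearest-neighbour matching by cosine similarity over two feature maps (stand-ins for the intermediate convolutional activations), and it omits the CRF fusion and the iterative retraining entirely. The function names `dense_correspondence` and `transfer_labels` are hypothetical.

```python
import numpy as np


def dense_correspondence(feat_src, feat_tgt):
    """For every target pixel, find the most similar source pixel.

    feat_src: (Hs, Ws, C) feature map of the source modality.
    feat_tgt: (Ht, Wt, C) feature map of the target modality.
    Returns an (Ht, Wt, 2) integer array of (row, col) source coordinates.
    Assumption: brute-force cosine-similarity matching, not the paper's method.
    """
    hs, ws, c = feat_src.shape
    ht, wt, _ = feat_tgt.shape
    src = feat_src.reshape(-1, c)
    tgt = feat_tgt.reshape(-1, c)
    # L2-normalise so that the dot product equals cosine similarity.
    src = src / (np.linalg.norm(src, axis=1, keepdims=True) + 1e-8)
    tgt = tgt / (np.linalg.norm(tgt, axis=1, keepdims=True) + 1e-8)
    sim = tgt @ src.T                      # (Ht*Wt, Hs*Ws) similarity matrix
    best = sim.argmax(axis=1)              # best source index per target pixel
    rows, cols = np.unravel_index(best, (hs, ws))
    return np.stack([rows, cols], axis=-1).reshape(ht, wt, 2)


def transfer_labels(labels_src, flow):
    """Warp source-modality labels onto the target grid using the flow field."""
    return labels_src[flow[..., 0], flow[..., 1]]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    color_feat = rng.standard_normal((4, 5, 8))
    # Toy "thermal" features: the color features shifted one column right.
    thermal_feat = np.roll(color_feat, 1, axis=1)
    color_labels = rng.integers(0, 2, size=(4, 5))
    flow = dense_correspondence(color_feat, thermal_feat)
    thermal_labels = transfer_labels(color_labels, flow)
    print(thermal_labels)
```

In the toy usage, the thermal features are a shifted copy of the color features, so the recovered flow undoes the shift and the transferred labels line up with the shifted image. In a real pipeline the transferred labels would be noisy, which is why the paper refines them with CRF optimization and repeats the process.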
|Title of host publication||2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings|
|Publisher||IEEE Computer Society|
|Number of pages||5|
|Publication status||Published - 2018 Feb 20|
|Event||24th IEEE International Conference on Image Processing, ICIP 2017 - Beijing, China|
Duration: 2017 Sep 17 → 2017 Sep 20
|Name||Proceedings - International Conference on Image Processing, ICIP|
|Other||24th IEEE International Conference on Image Processing, ICIP 2017|
|Period||17/9/17 → 17/9/20|
Bibliographical note
Funding Information:
We proposed a human body co-segmentation method for unregistered multispectral color and thermal images that leverages CNNs. We formulated the human co-segmentation problem as two sub-problems, dense correspondence and segmentation, which are solved in a joint and iterative manner. We formulated the learning framework such that training labels for one modality are gradually updated as the iterations progress, in order to learn the network for the other modality. To realize this, dense correspondences are estimated using intermediate convolutional activations of the segmentation network, and with this flow field the human body segmentation labels are fused in the unary and pairwise terms of the CRF optimization. Experimental results demonstrated that our method provides highly accurate segmentation results in comparison to state-of-the-art methods, even without pixel-level human annotations.
5. ACKNOWLEDGEMENTS
This work was supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. 2016-0-00197).
© 2017 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition
- Signal Processing