Facial landmarks are among the most basic elements for obtaining facial information such as expression and emotion. However, detecting dense landmarks in an image is challenging because facial pose varies widely. In this paper, we propose ConcatNet, a deep architecture for dense facial landmark detection. Its core is a CNN-based dense landmark detector that operates on part regions of a face and extends a given set of sparse landmarks into a denser and more accurate set. By introducing interface layers for coordinate normalization and part region localization, we concatenate a sparse landmark detection network to ConcatNet in a global-to-local manner, so that the whole network operates end to end. Experimental results on the LFW and 300W datasets show that ConcatNet not only increases the number of landmarks beyond the sparse set but also remarkably improves the accuracy of the landmark positions. Compared with a 3D model-based detection method, ConcatNet also detects dense landmarks with high accuracy while using a smaller dataset and no additional annotations such as 3D positions.
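The abstract names two interface steps, coordinate normalization and part region localization, without specifying them. The sketch below is one plausible reading, not the authors' implementation: it assumes a part region is a padded bounding box around that part's sparse landmarks, and that coordinates are normalized to the unit square of that box. The names `part_region`, `normalize_points`, `denormalize_points` and the `margin` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, width, height)


def part_region(points: Sequence[Point], margin: float = 0.25) -> Box:
    """Padded bounding box around one facial part's sparse landmarks.

    Assumption: `margin` is a fraction of the box size added on every side.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    w = max(xs) - min(xs)
    h = max(ys) - min(ys)
    if w <= 0 or h <= 0:
        raise ValueError("part landmarks must span a non-degenerate box")
    return (min(xs) - margin * w, min(ys) - margin * h,
            w * (1 + 2 * margin), h * (1 + 2 * margin))


def normalize_points(points: Sequence[Point], box: Box) -> List[Point]:
    """Map image coordinates into the unit square defined by `box`."""
    x0, y0, w, h = box
    return [((x - x0) / w, (y - y0) / h) for x, y in points]


def denormalize_points(points: Sequence[Point], box: Box) -> List[Point]:
    """Inverse of normalize_points: map part-local coordinates back to the image."""
    x0, y0, w, h = box
    return [(x0 + u * w, y0 + v * h) for u, v in points]


# Hypothetical sparse landmarks for one eye, in image pixels.
eye = [(100.0, 120.0), (140.0, 110.0), (180.0, 120.0), (140.0, 130.0)]
region = part_region(eye, margin=0.25)
local = normalize_points(eye, region)      # input space for a part-level detector
restored = denormalize_points(local, region)
```

In a global-to-local pipeline of this kind, a part-level detector would predict dense landmarks in the part-local frame, and `denormalize_points` would place them back in the image.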
|Title of host publication||2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings|
|Publisher||IEEE Computer Society|
|Number of pages||5|
|Publication status||Published - 2018 Aug 29|
|Event||25th IEEE International Conference on Image Processing, ICIP 2018 - Athens, Greece (2018 Oct 7 → 2018 Oct 10)|
|Name||Proceedings - International Conference on Image Processing, ICIP|
|Conference||25th IEEE International Conference on Image Processing, ICIP 2018|
|Period||2018 Oct 7 → 2018 Oct 10|
Bibliographical note
Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2016R1A2B2014525).
© 2018 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition
- Signal Processing