ConcatNet: A Deep Architecture of Concatenation-Assisted Network for Dense Facial Landmark Alignment

Hyewon Song, Jiwoo Kang, Sanghoon Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Facial landmarks are among the most basic elements for obtaining facial information such as expression and emotion. However, detecting dense landmarks in an image is challenging because facial poses vary widely. This paper proposes ConcatNet, a deep architecture for dense facial landmark detection. The architecture includes a CNN-based dense landmark detector that operates on part regions of a face and extends a given set of sparse landmarks to a denser and more accurate set. By introducing interface layers for coordinate normalization and part region localization, we concatenate a network for sparse landmark detection to ConcatNet in a global-to-local manner, so that the whole network operates end-to-end. Experimental results on the LFW and 300W datasets show that ConcatNet not only expands the number of sparse landmarks but also markedly increases the accuracy of the landmark positions. Compared with a 3D model-based detection method, ConcatNet also detects dense landmarks with high accuracy while using a smaller dataset and no additional per-image data such as 3D position annotations.
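The abstract does not say how the interface layers are implemented. As a rough illustration only, the sketch below shows one way coordinate normalization and part region localization could work: a bounding box is derived from a subset of sparse landmarks, and points are mapped into and out of that box's local coordinates. The function names `part_region`, `to_local`, and `to_global`, the `margin` parameter, and the landmark values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, width, height)


def part_region(landmarks: Sequence[Point], indices: Sequence[int],
                margin: float = 0.25) -> Box:
    """Bounding box around the chosen sparse landmarks, enlarged by `margin`.

    Hypothetical stand-in for "part region localization".
    """
    xs = [landmarks[i][0] for i in indices]
    ys = [landmarks[i][1] for i in indices]
    # Guard against a zero-size box when the selected points coincide.
    w = max(max(xs) - min(xs), 1e-6)
    h = max(max(ys) - min(ys), 1e-6)
    return (min(xs) - margin * w, min(ys) - margin * h,
            w * (1 + 2 * margin), h * (1 + 2 * margin))


def to_local(points: Sequence[Point], box: Box) -> List[Point]:
    """Map image coordinates into the box's normalized [0, 1] frame.

    Hypothetical stand-in for "coordinate normalization".
    """
    x0, y0, w, h = box
    return [((x - x0) / w, (y - y0) / h) for x, y in points]


def to_global(points: Sequence[Point], box: Box) -> List[Point]:
    """Inverse of `to_local`: map normalized coordinates back to the image."""
    x0, y0, w, h = box
    return [(x0 + u * w, y0 + v * h) for u, v in points]


# Made-up sparse landmarks; indices 0-2 play the role of one face part.
sparse = [(30.0, 40.0), (50.0, 38.0), (70.0, 40.0), (50.0, 80.0)]
box = part_region(sparse, [0, 1, 2])
local = to_local(sparse[:3], box)
restored = to_global(local, box)
```

In a global-to-local pipeline of this kind, a dense detector would predict points in the local frame of each part, and `to_global` would place them back on the full image.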

Original language: English
Title of host publication: 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings
Publisher: IEEE Computer Society
Pages: 2371-2375
Number of pages: 5
ISBN (Electronic): 9781479970612
DOIs: https://doi.org/10.1109/ICIP.2018.8451375
Publication status: Published - 2018 Aug 29
Event: 25th IEEE International Conference on Image Processing, ICIP 2018 - Athens, Greece
Duration: 2018 Oct 7 to 2018 Oct 10

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880

Conference

Conference: 25th IEEE International Conference on Image Processing, ICIP 2018
Country: Greece
City: Athens
Period: 18/10/7 to 18/10/10

Fingerprint

Detectors

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing

Cite this

Song, H., Kang, J., & Lee, S. (2018). ConcatNet: A Deep Architecture of Concatenation-Assisted Network for Dense Facial Landmark Alignment. In 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings (pp. 2371-2375). [8451375] (Proceedings - International Conference on Image Processing, ICIP). IEEE Computer Society. https://doi.org/10.1109/ICIP.2018.8451375
@inproceedings{4c71c92dce3a45a09265433797bde1e8,
title = "ConcatNet: A Deep Architecture of Concatenation-Assisted Network for Dense Facial Landmark Alignment",
author = "Hyewon Song and Jiwoo Kang and Sanghoon Lee",
year = "2018",
month = "8",
day = "29",
doi = "10.1109/ICIP.2018.8451375",
language = "English",
series = "Proceedings - International Conference on Image Processing, ICIP",
publisher = "IEEE Computer Society",
pages = "2371--2375",
booktitle = "2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings",
address = "United States",

}

TY - GEN

T1 - ConcatNet

T2 - A Deep Architecture of Concatenation-Assisted Network for Dense Facial Landmark Alignment

AU - Song, Hyewon

AU - Kang, Jiwoo

AU - Lee, Sanghoon

PY - 2018/8/29

Y1 - 2018/8/29

UR - http://www.scopus.com/inward/record.url?scp=85062921579&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062921579&partnerID=8YFLogxK

U2 - 10.1109/ICIP.2018.8451375

DO - 10.1109/ICIP.2018.8451375

M3 - Conference contribution

AN - SCOPUS:85062921579

T3 - Proceedings - International Conference on Image Processing, ICIP

SP - 2371

EP - 2375

BT - 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings

PB - IEEE Computer Society

ER -
