Traditional techniques for emotion recognition have focused solely on facial expression analysis, which limits their ability to encode the context that comprehensively represents emotional responses. We present deep networks for context-aware emotion recognition, called CAER-Net, that exploit not only human facial expression but also context information in a joint and boosting manner. The key idea is to hide human faces in a visual scene and seek other contexts based on an attention mechanism. Our networks consist of two sub-networks: two-stream encoding networks that separately extract features from the face and context regions, and adaptive fusion networks that fuse those features adaptively. We also introduce a novel benchmark for context-aware emotion recognition, called CAER, that is more appropriate than existing benchmarks both qualitatively and quantitatively. On several benchmarks, CAER-Net demonstrates the effect of context on emotion recognition. Our dataset is available at http://caer-dataset.github.io.
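The two ideas named in the abstract, hiding the face before looking for context and fusing the two feature streams with adaptive weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `hide_face` and `adaptive_fusion` are hypothetical, the feature vectors are toy values, and a simple linear score per stream stands in for the learned attention networks.

```python
import math

def hide_face(image, box):
    """Return a copy of `image` (a list of rows) with the face box zeroed out.

    `box` is (top, left, bottom, right); bottom and right are exclusive.
    """
    top, left, bottom, right = box
    return [
        [0.0 if top <= r < bottom and left <= c < right else v
         for c, v in enumerate(row)]
        for r, row in enumerate(image)
    ]

def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_fusion(face_feat, ctx_feat, w_face, w_ctx):
    """Weight each stream by a softmax-normalised score, then concatenate.

    Assumption: a dot product with `w_face` / `w_ctx` replaces the learned
    sub-networks that would produce the per-stream attention scores.
    """
    score_face = sum(w * x for w, x in zip(w_face, face_feat))
    score_ctx = sum(w * x for w, x in zip(w_ctx, ctx_feat))
    lam_face, lam_ctx = softmax([score_face, score_ctx])
    fused = [lam_face * x for x in face_feat] + [lam_ctx * x for x in ctx_feat]
    return fused, (lam_face, lam_ctx)

# Toy usage: a 4x4 "image" with a 2x2 face region, and 2-d features per stream.
image = [[1.0] * 4 for _ in range(4)]
masked = hide_face(image, (1, 1, 3, 3))
fused, weights = adaptive_fusion([0.5, 1.0], [2.0, 0.0], [1.0, 1.0], [1.0, 1.0])
```

In this toy setting the context stream receives the larger weight because its score is higher. In a trained model the weights would depend on the input, which is what makes the fusion adaptive.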
|Title of host publication||Proceedings - 2019 International Conference on Computer Vision, ICCV 2019|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||10|
|Publication status||Published - 2019 Oct|
|Event||17th IEEE/CVF International Conference on Computer Vision, ICCV 2019 - Seoul, Korea, Republic of|
Duration: 2019 Oct 27 → 2019 Nov 2
|Name||Proceedings of the IEEE International Conference on Computer Vision|
|Conference||17th IEEE/CVF International Conference on Computer Vision, ICCV 2019|
|Country||Korea, Republic of|
|Period||19/10/27 → 19/11/2|
Bibliographical note
Funding Information:
This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2017M3C4A7069370).
© 2019 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition