Building emotional machines: Recognizing image emotions through deep neural networks

Hye Rin Kim, Yeong Seok Kim, Seon Joo Kim, In Kwon Lee

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

An image is a very effective tool for conveying emotions. Many researchers have investigated emotions in images by using various features extracted from images. In this paper, we focus on two high-level features, the object and the background, and assume that the semantic information in images is a good cue for predicting emotions. An object is one of the most important elements that define an image, and we discover through experiments that there is a high correlation between the objects and emotions in images in most cases. Even with the same object, there may be slight differences in emotion due to different backgrounds, and we use the semantic information of the background to improve the prediction performance. By combining the different levels of features, we build an emotion-based feedforward deep neural network that produces the emotion values of a given image. The output emotion values in our framework are continuous values in two-dimensional space (valence and arousal), which are more effective than using a small number of emotion categories to describe emotions. Experiments confirm the effectiveness of our network in predicting the emotions of images.
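The pipeline the abstract describes — fuse object-level and background-level semantic features, then map them through a feedforward network to a continuous (valence, arousal) pair — can be sketched as below. This is a minimal illustrative sketch, not the paper's actual architecture: the feature dimensions, hidden size, and the simple concatenation fusion are assumptions, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_features(object_feat, background_feat):
    """Concatenate object and background semantic features into one vector
    (one simple way to 'combine the different levels of features')."""
    return np.concatenate([object_feat, background_feat])

class FeedforwardEmotionNet:
    """Tiny two-layer MLP mapping fused features to continuous (valence, arousal).
    Layer sizes are illustrative; the paper does not specify them here."""
    def __init__(self, in_dim, hidden_dim=32):
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(0.0, 0.1, (hidden_dim, 2))
        self.b2 = np.zeros(2)

    def predict(self, x):
        h = np.maximum(0.0, x @ self.W1 + self.b1)  # ReLU hidden layer
        return h @ self.W2 + self.b2                # continuous (valence, arousal)

# Hypothetical pre-extracted semantic features for one image.
object_feat = rng.normal(size=128)     # e.g. pooled object-recognition features
background_feat = rng.normal(size=64)  # e.g. scene/background features
x = fuse_features(object_feat, background_feat)

net = FeedforwardEmotionNet(in_dim=x.shape[0])
valence, arousal = net.predict(x)
```

Because the output is a point in a continuous two-dimensional valence-arousal space rather than a class label, the model is a regressor, which is what the abstract argues makes it more expressive than a small set of emotion categories.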

Original language: English
Article number: 8344491
Pages (from-to): 2980-2992
Number of pages: 13
Journal: IEEE Transactions on Multimedia
Volume: 20
Issue number: 11
DOI: 10.1109/TMM.2018.2827782
Publication status: Published - 2018 Nov

Fingerprint

  • Semantics
  • Conveying
  • Experiments
  • Deep neural networks

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Media Technology
  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

@article{7df82e845aa5487cb095203874b56520,
title = "Building emotional machines: Recognizing image emotions through deep neural networks",
abstract = "An image is a very effective tool for conveying emotions. Many researchers have investigated emotions in images by using various features extracted from images. In this paper, we focus on two high-level features, the object and the background, and assume that the semantic information in images is a good cue for predicting emotions. An object is one of the most important elements that define an image, and we discover through experiments that there is a high correlation between the objects and emotions in images in most cases. Even with the same object, there may be slight differences in emotion due to different backgrounds, and we use the semantic information of the background to improve the prediction performance. By combining the different levels of features, we build an emotion-based feedforward deep neural network that produces the emotion values of a given image. The output emotion values in our framework are continuous values in two-dimensional space (valence and arousal), which are more effective than using a small number of emotion categories to describe emotions. Experiments confirm the effectiveness of our network in predicting the emotions of images.",
author = "Kim, {Hye Rin} and Kim, {Yeong Seok} and Kim, {Seon Joo} and Lee, {In Kwon}",
year = "2018",
month = "11",
doi = "10.1109/TMM.2018.2827782",
language = "English",
volume = "20",
pages = "2980--2992",
journal = "IEEE Transactions on Multimedia",
issn = "1520-9210",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "11",
}

Building emotional machines: Recognizing image emotions through deep neural networks. / Kim, Hye Rin; Kim, Yeong Seok; Kim, Seon Joo; Lee, In Kwon.

In: IEEE Transactions on Multimedia, Vol. 20, No. 11, 8344491, 11.2018, p. 2980-2992.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Building emotional machines

T2 - Recognizing image emotions through deep neural networks

AU - Kim, Hye Rin

AU - Kim, Yeong Seok

AU - Kim, Seon Joo

AU - Lee, In Kwon

PY - 2018/11

Y1 - 2018/11

N2 - An image is a very effective tool for conveying emotions. Many researchers have investigated emotions in images by using various features extracted from images. In this paper, we focus on two high-level features, the object and the background, and assume that the semantic information in images is a good cue for predicting emotions. An object is one of the most important elements that define an image, and we discover through experiments that there is a high correlation between the objects and emotions in images in most cases. Even with the same object, there may be slight differences in emotion due to different backgrounds, and we use the semantic information of the background to improve the prediction performance. By combining the different levels of features, we build an emotion-based feedforward deep neural network that produces the emotion values of a given image. The output emotion values in our framework are continuous values in two-dimensional space (valence and arousal), which are more effective than using a small number of emotion categories to describe emotions. Experiments confirm the effectiveness of our network in predicting the emotions of images.

AB - An image is a very effective tool for conveying emotions. Many researchers have investigated emotions in images by using various features extracted from images. In this paper, we focus on two high-level features, the object and the background, and assume that the semantic information in images is a good cue for predicting emotions. An object is one of the most important elements that define an image, and we discover through experiments that there is a high correlation between the objects and emotions in images in most cases. Even with the same object, there may be slight differences in emotion due to different backgrounds, and we use the semantic information of the background to improve the prediction performance. By combining the different levels of features, we build an emotion-based feedforward deep neural network that produces the emotion values of a given image. The output emotion values in our framework are continuous values in two-dimensional space (valence and arousal), which are more effective than using a small number of emotion categories to describe emotions. Experiments confirm the effectiveness of our network in predicting the emotions of images.

UR - http://www.scopus.com/inward/record.url?scp=85045728060&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045728060&partnerID=8YFLogxK

U2 - 10.1109/TMM.2018.2827782

DO - 10.1109/TMM.2018.2827782

M3 - Article

AN - SCOPUS:85045728060

VL - 20

SP - 2980

EP - 2992

JO - IEEE Transactions on Multimedia

JF - IEEE Transactions on Multimedia

SN - 1520-9210

IS - 11

M1 - 8344491

ER -