Speaker dependent emotion recognition using speech signals

Bong Seok Kang, Chul Hee Han, Sang Tae Lee, Dae Hee Youn, Chungyong Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Citations (Scopus)

Abstract

This paper examines three algorithms for recognizing a speaker's emotion from speech signals. The target emotions are happiness, sadness, anger, fear, boredom, and a neutral state. MLB (Maximum-Likelihood Bayes), NN (Nearest Neighbor), and HMM (Hidden Markov Model) algorithms are used as the pattern-matching techniques. In all cases, pitch and energy are used as the features. The feature vectors for MLB and NN are composed of pitch mean, pitch standard deviation, energy mean, energy standard deviation, etc. For HMM, vectors of delta pitch with delta-delta pitch and delta energy with delta-delta energy are used. A corpus of emotional speech data was recorded, and a subjective evaluation of the data was performed by 23 untrained listeners. The subjective recognition rate was 56%, which was compared with the classifiers' recognition rates. The MLB, NN, and HMM classifiers achieved recognition rates of 68.9%, 69.3%, and 89.1%, respectively, for speaker-dependent, context-independent classification.
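The feature construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how pitch and energy were extracted, which further statistics "etc." covers, how deltas were computed, or which distance the NN classifier used. The function names, the plain frame-to-frame difference, the Euclidean distance, and the toy contours below are all assumptions made for illustration.

```python
from math import dist
from statistics import mean, pstdev


def utterance_features(pitch, energy):
    """Summarize frame-level contours as one utterance-level vector:
    [pitch mean, pitch std, energy mean, energy std]."""
    return [mean(pitch), pstdev(pitch), mean(energy), pstdev(energy)]


def deltas(contour):
    """Frame-to-frame difference (assumed form of the 'delta' feature)."""
    return [b - a for a, b in zip(contour, contour[1:])]


def nearest_neighbor(query, labeled_vectors):
    """Return the label of the closest training vector.
    Euclidean distance is an assumption, not stated in the abstract."""
    return min(labeled_vectors, key=lambda item: dist(query, item[0]))[1]


# Toy, made-up contours (hypothetical values, not data from the paper).
train = [
    (utterance_features([220, 260, 300, 240], [0.8, 0.9, 1.0, 0.7]), "anger"),
    (utterance_features([110, 112, 108, 110], [0.2, 0.3, 0.2, 0.3]), "sadness"),
]
query = utterance_features([215, 250, 290, 245], [0.7, 0.9, 0.9, 0.8])
label = nearest_neighbor(query, train)

# Sequence features of the kind the abstract lists for the HMM.
delta_pitch = deltas([220, 260, 300, 240])
delta_delta_pitch = deltas(delta_pitch)
```

In this sketch the pitch values are numerically much larger than the energy values, so pitch dominates the distance; a practical system would likely normalize each feature first, which is again an assumption rather than something the abstract reports.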

Original language: English
Title of host publication: 6th International Conference on Spoken Language Processing, ICSLP 2000
Publisher: International Speech Communication Association
ISBN (Electronic): 7801501144, 9787801501141
Publication status: Published - 2000 Jan 1
Event: 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China
Duration: 2000 Oct 16 to 2000 Oct 20

All Science Journal Classification (ASJC) codes

  • Linguistics and Language
  • Language and Linguistics

Cite this

Kang, B. S., Han, C. H., Lee, S. T., Youn, D. H., & Lee, C. (2000). Speaker dependent emotion recognition using speech signals. In 6th International Conference on Spoken Language Processing, ICSLP 2000 International Speech Communication Association.
@inproceedings{5f27d5e0df984d3287e43cece45d335c,
title = "Speaker dependent emotion recognition using speech signals",
author = "Kang, {Bong Seok} and Han, {Chul Hee} and Lee, {Sang Tae} and Youn, {Dae Hee} and Chungyong Lee",
year = "2000",
month = "1",
day = "1",
language = "English",
booktitle = "6th International Conference on Spoken Language Processing, ICSLP 2000",
publisher = "International Speech Communication Association",

}

TY - GEN

T1 - Speaker dependent emotion recognition using speech signals

AU - Kang, Bong Seok

AU - Han, Chul Hee

AU - Lee, Sang Tae

AU - Youn, Dae Hee

AU - Lee, Chungyong

PY - 2000/1/1

Y1 - 2000/1/1

UR - http://www.scopus.com/inward/record.url?scp=85009083417&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85009083417&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85009083417

BT - 6th International Conference on Spoken Language Processing, ICSLP 2000

PB - International Speech Communication Association

ER -
