Speaker-invariant Psychological Stress Detection Using Attention-based Network

Hyeon Kyeong Shin, Hyewon Han, Kyungguen Byun, Hong Goo Kang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

When people are stressed by nervous or unfamiliar situations, their speaking style and acoustic characteristics change. These changes are concentrated in certain regions of speech, so a model that automatically computes temporal weights for the components of the speech signal that carry stress-related information can effectively capture the psychological state of the speaker. In this paper, we propose an algorithm for psychological stress detection from speech signals using a deep spectral-temporal encoder and multi-head attention with domain adversarial training. To detect long-term variations and spectral relations in speech under different stress conditions, we build a network by concatenating a convolutional neural network (CNN) and a recurrent neural network (RNN). Multi-head attention is then used to further emphasize stress-concentrated regions. For speaker-invariant stress detection, the network is trained with adversarial multi-task learning by adding a gradient reversal layer. We show the robustness of the proposed algorithm in stress classification tasks on the Multimodal Korean stress database acquired in [1] and the authorized stress database Speech Under Simulated and Actual Stress (SUSAS) [2]. In addition, we demonstrate the effectiveness of multi-head attention and domain adversarial training through visual analysis using the t-SNE method.
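The two mechanisms named in the abstract, temporal attention weights over speech frames and a gradient reversal layer, can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The function names, the dot-product scoring against a per-head query vector, and the toy inputs are all assumptions made for illustration, and the paper's actual multi-head attention and CNN-RNN encoder may differ.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames: List[List[float]],
                   query: List[float]) -> Tuple[List[float], List[float]]:
    """Weight each frame by softmax(frame . query) and sum over time.

    `frames` stands in for encoder outputs (one vector per time step);
    `query` is a hypothetical learned vector for one attention head.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)  # temporal weights, one per frame
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights


def multi_head_pool(frames: List[List[float]],
                    queries: List[List[float]]) -> List[float]:
    """Run one attention pooling per head and concatenate the results."""
    pooled: List[float] = []
    for query in queries:
        head, _ = attention_pool(frames, query)
        pooled.extend(head)
    return pooled


def grl_forward(x: List[float]) -> List[float]:
    """Gradient reversal layer, forward pass: the identity."""
    return list(x)


def grl_backward(grad: List[float], lam: float) -> List[float]:
    """Gradient reversal layer, backward pass: negate and scale by lam.

    The speaker classifier sits after this layer, so the encoder is
    pushed to make speaker identity harder to predict.
    """
    return [-lam * g for g in grad]


# Toy example: three 2-dimensional frames and two heads.
frames = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
queries = [[5.0, 0.0], [0.0, 5.0]]
utterance_vector = multi_head_pool(frames, queries)
_, head0_weights = attention_pool(frames, queries[0])
```

In this toy input, the first head puts most of its weight on the second frame and the second head on the third, so each head emphasizes a different region of the utterance. In an adversarial multi-task setup, the stress classifier would consume `utterance_vector` directly, while the speaker classifier would receive it through the gradient reversal layer.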

Original language: English
Title of host publication: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 308-313
Number of pages: 6
ISBN (Electronic): 9789881476883
Publication status: Published - 2020 Dec 7
Event: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Virtual, Auckland, New Zealand
Duration: 2020 Dec 7 to 2020 Dec 10

Publication series

Name: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings

Conference

Conference: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020
Country/Territory: New Zealand
City: Virtual, Auckland
Period: 20/12/7 to 20/12/10

Bibliographical note

Publisher Copyright:
© 2020 APSIPA.

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Signal Processing
  • Decision Sciences (miscellaneous)
  • Instrumentation
