Deep bi-directional long short-term memory based speech enhancement for wind noise reduction

Jinkyu Lee, Keulbit Kim, Turaj Shabestary, Hong Goo Kang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

In this paper, we propose a new recurrent neural network (RNN)-based single-channel speech enhancement framework for off-line wind noise reduction. To adequately represent the highly non-stationary characteristics of wind noise, we first adopt a deep bi-directional long short-term memory (DBLSTM) structure. However, its enhanced output becomes muffled due to the spectral over-smoothing effect. To overcome this problem, we propose a new DBLSTM-based speech enhancement structure that internally incorporates the speech and noise power estimation processes in the spectral filtering framework. Furthermore, we propose a structure with an additional internal constraint of minimizing the log a priori SNR, which provides an efficient learning mechanism. Experimental results show that the proposed method improves the source-to-distortion ratio (SDR) by 6.9 dB and the perceptual evaluation of speech quality (PESQ) score by 0.24 in comparison to the conventional DBLSTM-based system.
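The abstract does not state which gain function the spectral filtering framework uses, so the following is only a minimal sketch under an assumption: a Wiener-type gain computed per frequency bin from speech and noise power estimates. In the paper those estimates are produced inside the DBLSTM; here they are plain inputs. The function names (`wiener_gain`, `log_a_priori_snr_db`, `enhance_frame`) and the `floor` parameter are hypothetical and chosen for illustration.

```python
import math


def wiener_gain(speech_power, noise_power, floor=1e-12):
    """Assumed Wiener-type gain G = xi / (1 + xi) for one frequency bin.

    xi is the a priori SNR, speech_power / noise_power. The floor only
    guards against division by zero; it is not taken from the paper.
    """
    xi = speech_power / max(noise_power, floor)
    return xi / (1.0 + xi)


def log_a_priori_snr_db(speech_power, noise_power, floor=1e-12):
    """Log a priori SNR in dB for one frequency bin."""
    return 10.0 * math.log10(max(speech_power, floor) / max(noise_power, floor))


def enhance_frame(noisy_magnitudes, speech_powers, noise_powers):
    """Scale each noisy magnitude bin by the gain from its power estimates."""
    return [
        wiener_gain(s, n) * y
        for y, s, n in zip(noisy_magnitudes, speech_powers, noise_powers)
    ]
```

With equal speech and noise power the gain is 0.5, and a bin whose estimated speech power is zero is suppressed entirely. Filtering with explicit power estimates, rather than predicting the clean spectrum directly, is the structural idea the abstract describes; the specific gain rule above remains an assumption.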

Original language: English
Title of host publication: 2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 41-45
Number of pages: 5
ISBN (Electronic): 9781509059256
DOIs: https://doi.org/10.1109/HSCMA.2017.7895558
Publication status: Published - 2017 Apr 10
Event: 2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - San Francisco, United States
Duration: 2017 Mar 1 → 2017 Mar 3

Publication series

Name: 2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - Proceedings

Other

Other: 2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017
Country: United States
City: San Francisco
Period: 17/3/1 → 17/3/3

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Acoustics and Ultrasonics
  • Instrumentation
  • Communication


  • Cite this

    Lee, J., Kim, K., Shabestary, T., & Kang, H. G. (2017). Deep bi-directional long short-term memory based speech enhancement for wind noise reduction. In 2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - Proceedings (pp. 41-45). [7895558] (2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/HSCMA.2017.7895558