In this paper, we propose a new recurrent neural network (RNN)-based single-channel speech enhancement framework for off-line wind noise reduction. To adequately represent the highly non-stationary characteristics of wind noise, we first adopt a deep bi-directional long short-term memory (DBLSTM) structure. However, its enhanced output becomes muffled due to the spectral over-smoothing effect. To overcome this problem, we propose a new DBLSTM-based speech enhancement structure that internally incorporates the speech and noise power estimation processes in the spectral filtering framework. Furthermore, we propose a structure with an additional internal constraint of minimizing log a priori SNR, which provides an efficient learning mechanism. Experimental results show that the proposed method improves source-to-distortion ratio (SDR) by 6.9 dB and perceptual evaluation of speech quality (PESQ) by 0.24 in comparison to the conventional DBLSTM-based system.
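To make the spectral filtering idea in the abstract concrete, the following is a minimal sketch of how per-bin speech and noise power estimates can be turned into a spectral gain and a log a priori SNR. It assumes a Wiener-type gain rule, which the abstract does not specify, and it uses made-up power values in place of network outputs; the function names `wiener_gain` and `log_a_priori_snr` and the `eps` floor are illustrative choices, not taken from the paper.

```python
import numpy as np


def wiener_gain(speech_power, noise_power, eps=1e-12):
    """Wiener-type gain xi / (1 + xi) from per-bin power estimates.

    Assumption: the paper's exact gain rule is not given in the abstract;
    a Wiener-type rule is used here purely for illustration.
    """
    xi = speech_power / (noise_power + eps)  # a priori SNR (linear scale)
    return xi / (1.0 + xi)


def log_a_priori_snr(speech_power, noise_power, eps=1e-12):
    """Natural-log a priori SNR; eps avoids log(0) for empty bins."""
    return np.log(speech_power + eps) - np.log(noise_power + eps)


# Toy example: three frequency bins of one frame (hypothetical values).
speech_power = np.array([4.0, 1.0, 0.0])
noise_power = np.array([1.0, 1.0, 2.0])

# Noisy magnitude under the assumption that speech and noise powers add.
noisy_magnitude = np.sqrt(speech_power + noise_power)

gain = wiener_gain(speech_power, noise_power)
enhanced_magnitude = gain * noisy_magnitude
log_snr = log_a_priori_snr(speech_power, noise_power)
```

In this sketch, a bin dominated by speech receives a gain close to 1, while a bin containing only noise receives a gain of 0, so the filter suppresses noise without applying a uniform smoothing across frequency.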
|Title of host publication||2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - Proceedings|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||5|
|Publication status||Published - 2017 Apr 10|
|Event||2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - San Francisco, United States (Duration: 2017 Mar 1 → 2017 Mar 3)|
Bibliographical note
Publisher Copyright: © 2017 IEEE.
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Computer Networks and Communications
- Acoustics and Ultrasonics