Two-Stage Refinement of Magnitude and Complex Spectra for Real-Time Speech Enhancement

Jinyoung Lee, Hong Goo Kang

Research output: Contribution to journal › Article › peer-review

Abstract

In this letter, we propose a two-stage network for performing speech enhancement that predicts magnitude spectra in the first stage and complex spectra in the second stage. To maximize the model's performance at each stage, we propose two convolutional modules: magnitude spectral masking (MSM) and complex spectra refinement (CSR). Each module is designed to take into account the specific characteristics of the signal type it handles. The MSM estimates multiplicative masks to remove noise in the magnitude component of the convolutional features, and the CSR refines the complex component of the convolutional features using additive features. By using these modules, our proposed two-stage enhancement model shows higher performance than previously proposed state-of-the-art algorithms. In addition, the number of parameters of our model is only 2.63 million, and it can operate in real time thanks to its causal characteristics and low computational complexity.
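The contrast between the two stages, multiplicative masking of magnitudes followed by additive correction of complex values, can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the abstract says the MSM and CSR modules act on convolutional features inside the network, whereas this sketch applies the same two operations directly to a toy spectrogram. The function names `magnitude_masking` and `complex_refinement` are hypothetical, and the `mask` and `residual` arrays are random stand-ins for what trained modules would predict.

```python
import numpy as np


def magnitude_masking(noisy_spec, mask):
    """Stage 1 (illustrative): scale the magnitude by a multiplicative
    mask and reuse the noisy phase unchanged."""
    magnitude = np.abs(noisy_spec)
    phase = np.angle(noisy_spec)
    return (mask * magnitude) * np.exp(1j * phase)


def complex_refinement(stage1_spec, residual):
    """Stage 2 (illustrative): add a complex-valued correction, which can
    change both the real and imaginary parts (and therefore the phase)."""
    return stage1_spec + residual


# Toy data: 4 frames x 5 frequency bins (shapes are arbitrary assumptions).
rng = np.random.default_rng(0)
frames, bins = 4, 5
noisy = rng.standard_normal((frames, bins)) + 1j * rng.standard_normal((frames, bins))

# Random stand-ins for network outputs (assumptions, not learned values).
mask = rng.uniform(0.0, 1.0, size=(frames, bins))
residual = 0.1 * (
    rng.standard_normal((frames, bins)) + 1j * rng.standard_normal((frames, bins))
)

stage1 = magnitude_masking(noisy, mask)
stage2 = complex_refinement(stage1, residual)
```

A real-valued mask can only rescale each time-frequency bin, so the first stage leaves the noisy phase untouched; the additive second stage is what allows the phase to be corrected.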

Original language: English
Pages (from-to): 2188-2192
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 29
Publication status: Published - 2022

Bibliographical note

Publisher Copyright:
© 1994-2012 IEEE.

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Applied Mathematics
  • Electrical and Electronic Engineering
