Cycle-consistent InfoGAN for speech enhancement in various background noises

Wonsup Shin, Sung Bae Cho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Speech enhancement is a crucial research topic with applications in many fields. With the progress of wireless communication technology, there is a growing need for speech enhancement methods that remove the varied background noise encountered in the real world. Recently, speech enhancement models based on generative adversarial learning, which learn an effective loss function by themselves, have been proposed and have outperformed conventional methods. However, these models assume parallel datasets for training, and their performance degrades on signals containing various kinds of noise. This paper proposes a novel speech enhancement model based on the generative adversarial network (GAN). The proposed method additionally uses a cycle-consistency loss to learn from non-parallel datasets, and uses the InfoGAN mechanism to cluster noise information in an unsupervised manner. Using the obtained clustering information, the model can form cluster-specific mappings. We quantitatively verify the speech enhancement performance of the proposed method with several metrics, such as MOS, SI-SNR, and PESQ, and achieve about 55% better MOS performance than previous GAN-based models.
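
As a rough illustration of two quantities named in the abstract, the sketch below computes SI-SNR (scale-invariant signal-to-noise ratio) and a cycle-consistency loss using only the Python standard library. The abstract does not give either formula. The SI-SNR code follows the commonly used definition, and the mean-absolute-error form of the cycle-consistency loss is assumed from the usual CycleGAN convention, not taken from the paper. The function names `si_snr` and `cycle_consistency_loss` are hypothetical.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length signals (higher is better)."""
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to isolate the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Whatever is left after the projection is treated as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def cycle_consistency_loss(signal, cycled):
    """Mean absolute error between a signal and its round-trip reconstruction.

    Assumption: an L1 penalty, as in CycleGAN; the paper's exact form may differ.
    """
    if len(signal) != len(cycled):
        raise ValueError("signals must have the same length")
    return sum(abs(a - b) for a, b in zip(signal, cycled)) / len(signal)


# Toy usage: a sine wave stands in for clean speech, a cosine for noise.
clean = [math.sin(0.1 * i) for i in range(200)]
noisy = [c + 0.1 * math.cos(0.37 * i) for i, c in enumerate(clean)]
score_db = si_snr(noisy, clean)
round_trip_error = cycle_consistency_loss(clean, noisy)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the SI-SNR score essentially unchanged, which is why the metric suits enhancement models that do not preserve absolute signal level.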

Original language: English
Title of host publication: Proceedings - 15th International Conference on Signal Image Technology and Internet Based Systems, SISITS 2019
Editors: Kokou Yetongnon, Albert Dipanda, Gabriella Sanniti di Baja, Luigi Gallo, Richard Chbeir
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 203-208
Number of pages: 6
ISBN (Electronic): 9781728156866
Publication status: Published - 2019 Nov
Event: 15th International Conference on Signal Image Technology and Internet Based Systems, SISITS 2019 - Sorrento, Italy
Duration: 2019 Nov 26 to 2019 Nov 29

Publication series

Name: Proceedings - 15th International Conference on Signal Image Technology and Internet Based Systems, SISITS 2019

Conference

Conference: 15th International Conference on Signal Image Technology and Internet Based Systems, SISITS 2019
Country: Italy
City: Sorrento
Period: 19/11/26 to 19/11/29

Bibliographical note

Funding Information:
This work was supported by a grant funded by the 2019 IT promotion fund (Development of AI based Precision Medicine Emergency System) of the Korean government (Ministry of Science and ICT).

Publisher Copyright:
© 2019 IEEE.

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Signal Processing
  • Media Technology
  • Computer Networks and Communications
