ADVERSARIAL AUDIO SYNTHESIS USING A HARMONIC-PERCUSSIVE DISCRIMINATOR

Jihyun Lee, Hyungseob Lim, Chanwoo Lee, Inseon Jang, Hong Goo Kang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper, we propose a discriminator design scheme for generative adversarial network-based audio signal generation. Unlike conventional discriminators that take an entire signal as input, our discriminator separates the audio signal into harmonic and percussive components and analyzes each component independently. The rationale behind this idea is that conventional discriminators cannot reliably capture subtle distortions in audio signals, which have complicated time-frequency characteristics. By considering the time-frequency resolution of audio signals, our proposed method encourages the generator to better reconstruct harmonic and percussive features, both of which are critical for the quality of the generated signals. Listening tests show that our framework significantly enhances the stability of pitches and generates clearer piano samples compared to a baseline.
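The abstract does not say which harmonic-percussive separation algorithm the discriminator uses. As an illustration only, the sketch below uses a common stand-in technique, median-filtering separation on a magnitude spectrogram: harmonic content forms horizontal ridges (stable over time), percussive content forms vertical ridges (broadband, short in time). All function names, the STFT settings, and the kernel size are assumptions made for this example and are not taken from the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def stft_magnitude(x, n_fft=512, hop=128):
    """Return a magnitude spectrogram of shape (freq_bins, frames)."""
    window = np.hanning(n_fft)
    frames = sliding_window_view(x, n_fft)[::hop]  # (frames, n_fft)
    return np.abs(np.fft.rfft(frames * window, axis=1)).T


def median_filter(mag, size, axis):
    """Median-filter a 2-D array along one axis (odd size, edge padding)."""
    pad = size // 2
    widths = [(0, 0), (0, 0)]
    widths[axis] = (pad, pad)
    padded = np.pad(mag, widths, mode="edge")
    # The sliding window is appended as the last dimension.
    return np.median(sliding_window_view(padded, size, axis=axis), axis=-1)


def harmonic_percussive_masks(mag, kernel=17, eps=1e-10):
    """Soft masks from median filtering (hypothetical helper, not the paper's)."""
    harm = median_filter(mag, kernel, axis=1)  # smooth over time: keeps tones
    perc = median_filter(mag, kernel, axis=0)  # smooth over frequency: keeps clicks
    total = harm**2 + perc**2 + eps
    return harm**2 / total, perc**2 / total


# Toy signal (assumed for illustration): a steady 1 kHz tone plus one click.
sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 1000 * t)
x[4000] += 5.0

mag = stft_magnitude(x)
h_mask, p_mask = harmonic_percussive_masks(mag)

# Each component could then be passed to its own discriminator branch.
harmonic_mag = h_mask * mag
percussive_mag = p_mask * mag
```

In a setup like the one the abstract describes, the two masked spectrograms (or signals resynthesized from them) would be scored separately, so that distortions in sustained pitches and in transients are each penalized on their own terms. How the paper combines the two branches is not stated in the abstract, so that step is left out here.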

Original language: English
Title of host publication: 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 961-965
Number of pages: 5
ISBN (Electronic): 9781665405409
Publication status: Published - 2022
Event: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Virtual, Online, Singapore
Duration: 2022 May 23 to 2022 May 27

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2022-May
ISSN (Print): 1520-6149

Conference

Conference: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Country/Territory: Singapore
City: Virtual, Online
Period: 22/5/23 to 22/5/27

Bibliographical note

Funding Information:
This work was supported by an Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government [21ZH1200, The research of the basic media contents technologies].

Publisher Copyright:
© 2022 IEEE

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
