A Study on Conditional Features for a Flow-based Neural Vocoder

Hyungseob Lim, Suhyeon Oh, Kyungguen Byun, Hong Goo Kang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we propose an effective way of providing conditional features for a flow-based neural vocoder. Most conventional approaches condition neural vocoders on mel-spectrograms, but the high dimensionality of these features significantly increases the size of the neural network. We show that the network size of a flow-based generative model can be reduced by conditioning instead on the acoustic parameters of a sinusoidal speech analysis-and-synthesis framework, such as the voiced/unvoiced flag, fundamental frequency, mel-cepstral coefficients, and energy of each analysis frame. We also find that training becomes much easier if the fundamental frequency is quantized with a small number of bits and fed to the network as an embedded vector representation. Experimental results verify that the performance of the proposed algorithm is comparable to that of flow-based neural vocoders conditioned on mel-spectrograms, while both the amount of information required for the feature representation and the network complexity for generating speech are lower.
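The abstract does not give the quantizer design or the embedding size, so the following is only a minimal sketch of the general idea of quantizing the fundamental frequency and looking up an embedding vector per frame. The bit depth (6), the F0 range (60 to 400 Hz), the log-frequency spacing, the use of code 0 for unvoiced frames, the embedding dimension (8), and the function names `quantize_f0` and `embed_f0` are all illustrative assumptions, not the authors' settings. In a trained vocoder the embedding table would be learned; here it is random.

```python
import numpy as np


def quantize_f0(f0_hz, n_bits=6, f0_min=60.0, f0_max=400.0):
    """Map per-frame F0 values (Hz) to integer codes in [0, 2**n_bits - 1].

    Assumptions (not from the paper): code 0 marks unvoiced frames
    (f0 <= 0), and voiced frames are quantized uniformly on a
    log-frequency scale after clipping to [f0_min, f0_max].
    """
    f0_hz = np.asarray(f0_hz, dtype=np.float64)
    n_voiced_levels = 2 ** n_bits - 1          # codes 1 .. 2**n_bits - 1
    voiced = f0_hz > 0
    # Clipping also lifts unvoiced zeros to f0_min so log() is defined;
    # those frames are overwritten with code 0 below.
    clipped = np.clip(f0_hz, f0_min, f0_max)
    pos = (np.log(clipped) - np.log(f0_min)) / (np.log(f0_max) - np.log(f0_min))
    level = np.minimum((pos * n_voiced_levels).astype(np.int64),
                       n_voiced_levels - 1)
    return np.where(voiced, level + 1, 0)


def embed_f0(codes, table):
    """Look up one embedding row per frame (table: [2**n_bits, emb_dim])."""
    return table[codes]


# Hypothetical usage with a random (untrained) embedding table.
n_bits, emb_dim = 6, 8
rng = np.random.default_rng(0)
table = rng.standard_normal((2 ** n_bits, emb_dim))

f0 = np.array([0.0, 100.0, 220.0, 500.0])  # Hz; 0 denotes an unvoiced frame
codes = quantize_f0(f0, n_bits=n_bits)
cond = embed_f0(codes, table)              # shape: (frames, emb_dim)
print(codes.shape, cond.shape)
```

Under these assumptions each frame's F0 is carried by a 6-bit index rather than a continuous value, and the network sees it as a dense vector that can be concatenated with the other conditioning features.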

Original language: English
Title of host publication: Conference Record of the 54th Asilomar Conference on Signals, Systems and Computers, ACSSC 2020
Editors: Michael B. Matthews
Publisher: IEEE Computer Society
Pages: 662-666
Number of pages: 5
ISBN (Electronic): 9780738131269
Publication status: Published - 2020 Nov 1
Event: 54th Asilomar Conference on Signals, Systems and Computers, ACSSC 2020 - Pacific Grove, United States
Duration: 2020 Nov 1 → 2020 Nov 5

Publication series

Name: Conference Record - Asilomar Conference on Signals, Systems and Computers
Volume: 2020-November
ISSN (Print): 1058-6393

Conference

Conference: 54th Asilomar Conference on Signals, Systems and Computers, ACSSC 2020
Country/Territory: United States
City: Pacific Grove
Period: 20/11/1 → 20/11/5

Bibliographical note

Publisher Copyright:
© 2020 IEEE.

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Computer Networks and Communications
