Abstract
In this paper, we propose an effective way of providing conditional features for a flow-based neural vocoder. Most conventional approaches condition neural vocoders on mel-spectrograms, but the high dimensionality of these features significantly increases the size of the neural network. We show that the network size of a flow-based generative model can be reduced by conditioning instead on the acoustic parameters of a sinusoidal speech analysis-and-synthesis framework, such as the voiced/unvoiced flag, fundamental frequency, mel-cepstral coefficients, and energy of each analysis frame. We also find that training becomes much easier when the fundamental frequency is quantized with a small number of bits and fed to the network as an embedded vector. Experimental results verify that the proposed algorithm performs comparably to flow-based neural vocoders conditioned on mel-spectrograms, while requiring less information for the feature representation and lower network complexity for generating speech.
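The idea of quantizing the fundamental frequency with a few bits and feeding it as an embedded vector can be illustrated with a minimal Python sketch. This is not the authors' implementation: the 6-bit resolution, the 60-400 Hz range, the log-frequency scale, the reserved index 0 for unvoiced frames, the embedding size of 8, and the function names `quantize_f0`, `make_embedding_table`, and `embed_f0_track` are all assumptions chosen for illustration. In a real vocoder the embedding table would be a trainable parameter rather than fixed random values.

```python
import math
import random


def quantize_f0(f0_hz, voiced, n_bits=6, f0_min=60.0, f0_max=400.0):
    """Map an F0 value in Hz to an integer index in [0, 2**n_bits - 1].

    Assumption: index 0 is reserved for unvoiced frames, and voiced frames
    are quantized uniformly on a log-frequency scale into the other levels.
    """
    n_levels = 2 ** n_bits
    if not voiced:
        return 0
    # Clip to the assumed F0 range before quantizing.
    f0 = min(max(f0_hz, f0_min), f0_max)
    pos = (math.log(f0) - math.log(f0_min)) / (math.log(f0_max) - math.log(f0_min))
    return 1 + int(round(pos * (n_levels - 2)))


def make_embedding_table(n_bits=6, dim=8, seed=0):
    """Create one vector per quantization level (random stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(2 ** n_bits)]


def embed_f0_track(f0_track, vuv_track, table, n_bits=6):
    """Replace each frame's F0 by the embedding vector of its quantized index."""
    return [table[quantize_f0(f0, v, n_bits)] for f0, v in zip(f0_track, vuv_track)]


if __name__ == "__main__":
    table = make_embedding_table()
    f0_track = [0.0, 110.0, 220.0]      # Hz, one value per analysis frame
    vuv_track = [False, True, True]     # voiced/unvoiced flag per frame
    for vec in embed_f0_track(f0_track, vuv_track, table):
        print([round(x, 3) for x in vec])
```

The embedded F0 vectors would then be concatenated with the remaining per-frame parameters (voiced/unvoiced flag, mel-cepstral coefficients, energy) to form the conditioning input; how the paper combines them is not stated in the abstract.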
Original language | English |
---|---|
Title of host publication | Conference Record of the 54th Asilomar Conference on Signals, Systems and Computers, ACSSC 2020 |
Editors | Michael B. Matthews |
Publisher | IEEE Computer Society |
Pages | 662-666 |
Number of pages | 5 |
ISBN (Electronic) | 9780738131269 |
Publication status | Published - 2020 Nov 1 |
Event | 54th Asilomar Conference on Signals, Systems and Computers, ACSSC 2020 - Pacific Grove, United States; Duration: 2020 Nov 1 → 2020 Nov 5 |
Publication series
Name | Conference Record - Asilomar Conference on Signals, Systems and Computers |
---|---|
Volume | 2020-November |
ISSN (Print) | 1058-6393 |
Conference
Conference | 54th Asilomar Conference on Signals, Systems and Computers, ACSSC 2020 |
---|---|
Country/Territory | United States |
City | Pacific Grove |
Period | 2020 Nov 1 → 2020 Nov 5 |
Bibliographical note
Publisher Copyright: © 2020 IEEE.
All Science Journal Classification (ASJC) codes
- Signal Processing
- Computer Networks and Communications