Abstract
This paper proposes a cascading deep neural network (DNN) structure for a speech synthesis system, consisting of a text-to-bottleneck (TTB) model and a bottleneck-to-speech (BTS) model. Unlike the conventional single structure, which requires a large database to learn the complicated mapping between linguistic and acoustic features, the proposed structure remains effective even when the available training database is inadequate. The bottleneck feature used in the proposed approach represents the characteristics of the linguistic features and their average acoustic features across several speakers. It is therefore more efficient to learn a mapping between bottleneck and acoustic features than to learn a mapping directly between linguistic and acoustic features. Experimental results show that the learning capability of the proposed structure is much higher than that of the conventional structures. Objective and subjective listening test results also verify the superiority of the proposed structure.
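The cascade described in the abstract can be illustrated with a minimal data-flow sketch. It is not the authors' implementation: the layer sizes, the tanh activation, the random untrained weights, and the names `init_mlp`, `dense`, `forward`, and `synthesize_features` are all assumptions made for illustration, and training is not shown. The sketch only demonstrates that a TTB network maps linguistic features to a low-dimensional bottleneck vector, which a separate BTS network then maps to acoustic features.

```python
import math
import random

random.seed(0)

# Hypothetical dimensions; the abstract does not specify any of these.
LINGUISTIC_DIM, BOTTLENECK_DIM, ACOUSTIC_DIM, HIDDEN_DIM = 20, 4, 6, 8


def init_mlp(sizes):
    """Create randomly initialised (untrained) fully connected layers."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[random.gauss(0.0, 0.1) for _ in range(n_out)]
                   for _ in range(n_in)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def dense(x, weights, biases):
    """Affine transform of one input vector: x @ W + b."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + biases[j]
            for j in range(len(biases))]


def forward(layers, x):
    """tanh on hidden layers, linear output layer (an assumed design)."""
    for weights, biases in layers[:-1]:
        x = [math.tanh(v) for v in dense(x, weights, biases)]
    weights, biases = layers[-1]
    return dense(x, weights, biases)


# Two separate networks, cascaded at synthesis time.
ttb = init_mlp([LINGUISTIC_DIM, HIDDEN_DIM, BOTTLENECK_DIM])  # text-to-bottleneck
bts = init_mlp([BOTTLENECK_DIM, HIDDEN_DIM, ACOUSTIC_DIM])    # bottleneck-to-speech


def synthesize_features(linguistic):
    """Run one frame of linguistic features through TTB, then BTS."""
    bottleneck = forward(ttb, linguistic)
    acoustic = forward(bts, bottleneck)
    return bottleneck, acoustic


frame = [random.random() for _ in range(LINGUISTIC_DIM)]
bottleneck, acoustic = synthesize_features(frame)
print(len(bottleneck), len(acoustic))
```

In this sketch the BTS network sees only the compact bottleneck vector, never the raw linguistic features, which mirrors the abstract's argument that the bottleneck-to-acoustic mapping is the easier one to learn from limited data.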
Original language | English |
---|---|
Title of host publication | 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9789881476821 |
Publication status | Published - 2017 Jan 17 |
Event | 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016 - Jeju, Korea, Republic of; Duration: 2016 Dec 13 → 2016 Dec 16 |
Publication series
Name | 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016 |
---|
Bibliographical note
Publisher Copyright: © 2016 Asia Pacific Signal and Information Processing Association.
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Computer Science Applications
- Information Systems
- Signal Processing