Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system

Eunwoo Song, Young Sun Joo, Hong Goo Kang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

This paper proposes an improved time-frequency trajectory excitation (TFTE) modeling method for a statistical parametric speech synthesis system. The proposed approach overcomes the dimensional variation problem of the training process caused by the inherent nature of the pitch-dependent analysis paradigm. By reducing the redundancies of the parameters using predicted average block coefficients (PABC), the proposed algorithm efficiently models excitation, even if its dimension is varied. Objective and subjective test results verify that the proposed algorithm provides not only robustness to the training process but also naturalness to the synthesized speech.
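
The abstract does not say how PABC is computed, so the sketch below only illustrates the dimensional variation problem it mentions and is not the authors' algorithm. It assumes that a pitch-dependent analysis yields one spectral vector per pitch cycle, so the vector length changes with the pitch period, and it uses plain block averaging as a stand-in for a fixed-dimension representation. The function names, the 16 kHz sampling rate, the synthetic waveform, and the choice of 8 blocks are all hypothetical.

```python
import numpy as np


def pitch_cycle_magnitudes(f0_hz, sample_rate=16000):
    """Hypothetical stand-in for pitch-dependent analysis.

    Builds one synthetic pitch cycle and returns its magnitude spectrum.
    The cycle length, and therefore the spectrum length, depends on f0.
    """
    period = int(round(sample_rate / f0_hz))
    n = np.arange(period)
    cycle = np.sin(2 * np.pi * n / period) + 0.5 * np.sin(4 * np.pi * n / period)
    return np.abs(np.fft.rfft(cycle))


def block_average(coeffs, num_blocks):
    """Average a variable-length vector into a fixed number of blocks."""
    coeffs = np.asarray(coeffs, dtype=float)
    if coeffs.size < num_blocks:
        raise ValueError("need at least one coefficient per block")
    # array_split tolerates lengths that are not a multiple of num_blocks.
    blocks = np.array_split(coeffs, num_blocks)
    return np.array([block.mean() for block in blocks])


# Three pitch values give three different raw dimensions ...
raw = [pitch_cycle_magnitudes(f0) for f0 in (100.0, 160.0, 220.0)]
raw_dims = [r.size for r in raw]

# ... while block averaging maps each of them to the same dimension.
fixed = np.stack([block_average(r, num_blocks=8) for r in raw])

print("raw dimensions:", raw_dims)
print("fixed shape:", fixed.shape)
```

Stacking the raw vectors directly would fail because their lengths differ, which is the kind of obstacle a statistical model with a fixed feature dimension faces during training. Block averaging is only one simple way to obtain a constant dimension.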

Original language: English
Title of host publication: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4949-4953
Number of pages: 5
ISBN (Electronic): 9781467369978
DOI: https://doi.org/10.1109/ICASSP.2015.7178912
Publication status: Published - 2015 Aug 4
Event: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: 2014 Apr 19 → 2014 Apr 24

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2015-August
ISSN (Print): 1520-6149

Other

Other: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country: Australia
City: Brisbane
Period: 14/4/19 → 14/4/24

Fingerprint

Speech synthesis
Trajectories
Redundancy

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Song, E., Joo, Y. S., & Kang, H. G. (2015). Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system. In 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings (pp. 4949-4953). [7178912] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2015-August). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2015.7178912
Song, Eunwoo ; Joo, Young Sun ; Kang, Hong Goo. / Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system. 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2015. pp. 4949-4953 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).
@inproceedings{5c7db542fb974abd880d99c381a16957,
title = "Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system",
abstract = "This paper proposes an improved time-frequency trajectory excitation (TFTE) modeling method for a statistical parametric speech synthesis system. The proposed approach overcomes the dimensional variation problem of the training process caused by the inherent nature of the pitch-dependent analysis paradigm. By reducing the redundancies of the parameters using predicted average block coefficients (PABC), the proposed algorithm efficiently models excitation, even if its dimension is varied. Objective and subjective test results verify that the proposed algorithm provides not only robustness to the training process but also naturalness to the synthesized speech.",
author = "Eunwoo Song and Joo, {Young Sun} and Kang, {Hong Goo}",
year = "2015",
month = "8",
day = "4",
doi = "10.1109/ICASSP.2015.7178912",
language = "English",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "4949--4953",
booktitle = "2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings",
address = "United States",

}

Song, E, Joo, YS & Kang, HG 2015, Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system. in 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings, 7178912, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2015-August, Institute of Electrical and Electronics Engineers Inc., pp. 4949-4953, 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015, Brisbane, Australia, 14/4/19. https://doi.org/10.1109/ICASSP.2015.7178912

Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system. / Song, Eunwoo; Joo, Young Sun; Kang, Hong Goo.

2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2015. p. 4949-4953 7178912 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2015-August).

TY - GEN

T1 - Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system

AU - Song, Eunwoo

AU - Joo, Young Sun

AU - Kang, Hong Goo

PY - 2015/8/4

Y1 - 2015/8/4

AB - This paper proposes an improved time-frequency trajectory excitation (TFTE) modeling method for a statistical parametric speech synthesis system. The proposed approach overcomes the dimensional variation problem of the training process caused by the inherent nature of the pitch-dependent analysis paradigm. By reducing the redundancies of the parameters using predicted average block coefficients (PABC), the proposed algorithm efficiently models excitation, even if its dimension is varied. Objective and subjective test results verify that the proposed algorithm provides not only robustness to the training process but also naturalness to the synthesized speech.

UR - http://www.scopus.com/inward/record.url?scp=84946032846&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946032846&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2015.7178912

DO - 10.1109/ICASSP.2015.7178912

M3 - Conference contribution

AN - SCOPUS:84946032846

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 4949

EP - 4953

BT - 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Song E, Joo YS, Kang HG. Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system. In 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2015. p. 4949-4953. 7178912. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2015.7178912