Classification of stop place in consonant-vowel contexts using feature extrapolation of acoustic-phonetic features in telephone speech

Jung Won Lee, Jeung Yoon Choi, Hong Goo Kang

Research output: Contribution to journal > Article

2 Citations (Scopus)

Abstract

Knowledge-based speech recognition systems extract acoustic cues from the signal to identify speech characteristics. For channel-deteriorated telephone speech, acoustic cues, especially those for stop consonant place, are expected to be degraded or absent. To investigate the use of knowledge-based methods in degraded environments, feature extrapolation of acoustic-phonetic features based on Gaussian mixture models is examined. This process is applied to a stop place detection module that uses burst release and vowel onset cues for consonant-vowel tokens of English. Results show that classification performance is enhanced in telephone channel-degraded speech, with extrapolated acoustic-phonetic features reaching or exceeding performance using estimated Mel-frequency cepstral coefficients (MFCCs). Results also show acoustic-phonetic features may be combined with MFCCs for best performance, suggesting these features provide information complementary to MFCCs.
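The abstract does not give the extrapolation formula, so the following is only a minimal sketch of one common way to do GMM-based feature estimation: a joint mixture is assumed over a degraded (telephone) feature x and its clean counterpart y, and the clean value is estimated as the conditional mean E[y | x]. The scalar features, the `JointComponent` type, the `extrapolate` function, and all parameter values are hypothetical illustrations, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class JointComponent:
    """One component of a hypothetical joint GMM over (degraded x, clean y)."""
    weight: float   # mixture weight
    mean_x: float   # mean of the degraded feature
    mean_y: float   # mean of the clean feature
    var_x: float    # variance of the degraded feature
    cov_xy: float   # covariance between degraded and clean feature


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a univariate Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def extrapolate(x: float, components: list[JointComponent]) -> float:
    """Estimate the clean feature as the conditional mean E[y | x]."""
    # Unnormalized posterior of each component given the degraded feature.
    scores = [c.weight * gaussian_pdf(x, c.mean_x, c.var_x) for c in components]
    total = sum(scores)
    if total == 0.0:
        # x is numerically far from every component: fall back to the priors.
        scores = [c.weight for c in components]
        total = sum(scores)
    posteriors = [s / total for s in scores]
    # Per-component linear regression from x to y, weighted by the posteriors.
    return sum(
        p * (c.mean_y + c.cov_xy / c.var_x * (x - c.mean_x))
        for p, c in zip(posteriors, components)
    )


# Illustrative parameters only; a real system would fit them on paired
# clean/telephone training data (for example with the EM algorithm).
example_gmm = [
    JointComponent(weight=0.5, mean_x=-1.0, mean_y=-2.0, var_x=1.0, cov_xy=0.5),
    JointComponent(weight=0.5, mean_x=1.0, mean_y=2.0, var_x=1.0, cov_xy=0.5),
]
estimated_clean = extrapolate(0.3, example_gmm)
```

In a setup like the one the abstract describes, the estimated features would then be passed to the stop place classifier in place of the degraded ones; real acoustic-phonetic features are vectors, so the scalar variances and covariances above would become matrices.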

Original language: English
Pages (from-to): 1536-1546
Number of pages: 11
Journal: Journal of the Acoustical Society of America
Volume: 131
Issue number: 2
DOI: 10.1121/1.3672706
Publication status: Published - 2012 Feb 1

Fingerprint

phonetics
telephones
vowels
extrapolation
cues
acoustics
coefficients
speech recognition
bursts
modules
Telephone
Acoustic Phonetics
Consonant
Phonetic Features
Extrapolation
Acoustic Cues

All Science Journal Classification (ASJC) codes

  • Arts and Humanities (miscellaneous)
  • Acoustics and Ultrasonics

Cite this

@article{702e83b5a1c54c2992234208b0a448d2,
title = "Classification of stop place in consonant-vowel contexts using feature extrapolation of acoustic-phonetic features in telephone speech",
abstract = "Knowledge-based speech recognition systems extract acoustic cues from the signal to identify speech characteristics. For channel-deteriorated telephone speech, acoustic cues, especially those for stop consonant place, are expected to be degraded or absent. To investigate the use of knowledge-based methods in degraded environments, feature extrapolation of acoustic-phonetic features based on Gaussian mixture models is examined. This process is applied to a stop place detection module that uses burst release and vowel onset cues for consonant-vowel tokens of English. Results show that classification performance is enhanced in telephone channel-degraded speech, with extrapolated acoustic-phonetic features reaching or exceeding performance using estimated Mel-frequency cepstral coefficients (MFCCs). Results also show acoustic-phonetic features may be combined with MFCCs for best performance, suggesting these features provide information complementary to MFCCs.",
author = "Lee, {Jung Won} and Choi, {Jeung Yoon} and Kang, {Hong Goo}",
year = "2012",
month = "2",
day = "1",
doi = "10.1121/1.3672706",
language = "English",
volume = "131",
pages = "1536--1546",
journal = "Journal of the Acoustical Society of America",
issn = "0001-4966",
publisher = "Acoustical Society of America",
number = "2",

}

Classification of stop place in consonant-vowel contexts using feature extrapolation of acoustic-phonetic features in telephone speech. / Lee, Jung Won; Choi, Jeung Yoon; Kang, Hong Goo.

In: Journal of the Acoustical Society of America, Vol. 131, No. 2, 01.02.2012, p. 1536-1546.


TY - JOUR

T1 - Classification of stop place in consonant-vowel contexts using feature extrapolation of acoustic-phonetic features in telephone speech

AU - Lee, Jung Won

AU - Choi, Jeung Yoon

AU - Kang, Hong Goo

PY - 2012/2/1

Y1 - 2012/2/1

AB - Knowledge-based speech recognition systems extract acoustic cues from the signal to identify speech characteristics. For channel-deteriorated telephone speech, acoustic cues, especially those for stop consonant place, are expected to be degraded or absent. To investigate the use of knowledge-based methods in degraded environments, feature extrapolation of acoustic-phonetic features based on Gaussian mixture models is examined. This process is applied to a stop place detection module that uses burst release and vowel onset cues for consonant-vowel tokens of English. Results show that classification performance is enhanced in telephone channel-degraded speech, with extrapolated acoustic-phonetic features reaching or exceeding performance using estimated Mel-frequency cepstral coefficients (MFCCs). Results also show acoustic-phonetic features may be combined with MFCCs for best performance, suggesting these features provide information complementary to MFCCs.

UR - http://www.scopus.com/inward/record.url?scp=84863147190&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863147190&partnerID=8YFLogxK

U2 - 10.1121/1.3672706

DO - 10.1121/1.3672706

M3 - Article

C2 - 22352523

AN - SCOPUS:84863147190

VL - 131

SP - 1536

EP - 1546

JO - Journal of the Acoustical Society of America

JF - Journal of the Acoustical Society of America

SN - 0001-4966

IS - 2

ER -