Abstract
Knowledge-based speech recognition systems extract acoustic cues from the signal to identify speech characteristics. For channel-deteriorated telephone speech, acoustic cues, especially those for stop consonant place, are expected to be degraded or absent. To investigate the use of knowledge-based methods in degraded environments, feature extrapolation of acoustic-phonetic features based on Gaussian mixture models is examined. This process is applied to a stop place detection module that uses burst release and vowel onset cues for consonant-vowel tokens of English. Results show that classification performance is enhanced in telephone channel-degraded speech, with extrapolated acoustic-phonetic features reaching or exceeding performance using estimated Mel-frequency cepstral coefficients (MFCCs). Results also show acoustic-phonetic features may be combined with MFCCs for best performance, suggesting these features provide information complementary to MFCCs.
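The abstract does not spell out how the Gaussian mixture model is used for feature extrapolation. As a rough illustration only, the sketch below assumes one common formulation: a joint GMM over stacked degraded and clean feature vectors, from which the clean features are estimated as the conditional mean given the degraded ones. The function name `gmm_conditional_mean`, its parameters, and the stacking order are hypothetical and are not taken from the paper.

```python
import numpy as np

def gmm_conditional_mean(x, weights, means, covs, dx):
    """Estimate E[y | x] under a joint GMM over z = [x, y].

    Assumption (not from the paper): x holds degraded features, y holds
    the clean features to recover, and each component's mean and
    covariance are laid out with the x part first (first dx dimensions).
    """
    x = np.asarray(x, dtype=float)
    log_post = []    # unnormalised log posterior of each component given x
    cond_means = []  # per-component conditional mean of y given x
    for w, mu, cov in zip(weights, means, covs):
        mu = np.asarray(mu, dtype=float)
        cov = np.asarray(cov, dtype=float)
        mu_x, mu_y = mu[:dx], mu[dx:]
        cov_xx = cov[:dx, :dx]
        cov_yx = cov[dx:, :dx]
        diff = x - mu_x
        solved = np.linalg.solve(cov_xx, diff)  # cov_xx^{-1} (x - mu_x)
        _, logdet = np.linalg.slogdet(cov_xx)
        # Log density of the marginal Gaussian over x for this component.
        log_pdf = -0.5 * (diff @ solved + logdet + dx * np.log(2.0 * np.pi))
        log_post.append(np.log(w) + log_pdf)
        cond_means.append(mu_y + cov_yx @ solved)
    log_post = np.array(log_post)
    post = np.exp(log_post - log_post.max())  # stabilised softmax
    post /= post.sum()
    # Posterior-weighted sum of the per-component conditional means.
    return post @ np.array(cond_means)
```

In such a setup the mixture parameters would be trained on paired clean and telephone-channel features, and the returned estimate would be passed to the stop place classifier in place of the degraded features.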
Original language | English |
---|---|
Pages (from-to) | 1536-1546 |
Number of pages | 11 |
Journal | Journal of the Acoustical Society of America |
Volume | 131 |
Issue number | 2 |
DOIs | 10.1121/1.3672706 |
Publication status | Published - 2012 Feb 1 |
All Science Journal Classification (ASJC) codes
- Arts and Humanities (miscellaneous)
- Acoustics and Ultrasonics
Cite this
Classification of stop place in consonant-vowel contexts using feature extrapolation of acoustic-phonetic features in telephone speech. / Lee, Jung Won; Choi, Jeung Yoon; Kang, Hong Goo.
In: Journal of the Acoustical Society of America, Vol. 131, No. 2, 01.02.2012, p. 1536-1546.
Research output: Contribution to journal › Article
TY - JOUR
T1 - Classification of stop place in consonant-vowel contexts using feature extrapolation of acoustic-phonetic features in telephone speech
AU - Lee, Jung Won
AU - Choi, Jeung Yoon
AU - Kang, Hong Goo
PY - 2012/2/1
Y1 - 2012/2/1
AB - Knowledge-based speech recognition systems extract acoustic cues from the signal to identify speech characteristics. For channel-deteriorated telephone speech, acoustic cues, especially those for stop consonant place, are expected to be degraded or absent. To investigate the use of knowledge-based methods in degraded environments, feature extrapolation of acoustic-phonetic features based on Gaussian mixture models is examined. This process is applied to a stop place detection module that uses burst release and vowel onset cues for consonant-vowel tokens of English. Results show that classification performance is enhanced in telephone channel-degraded speech, with extrapolated acoustic-phonetic features reaching or exceeding performance using estimated Mel-frequency cepstral coefficients (MFCCs). Results also show acoustic-phonetic features may be combined with MFCCs for best performance, suggesting these features provide information complementary to MFCCs.
UR - http://www.scopus.com/inward/record.url?scp=84863147190&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863147190&partnerID=8YFLogxK
U2 - 10.1121/1.3672706
DO - 10.1121/1.3672706
M3 - Article
C2 - 22352523
AN - SCOPUS:84863147190
VL - 131
SP - 1536
EP - 1546
JO - Journal of the Acoustical Society of America
JF - Journal of the Acoustical Society of America
SN - 0001-4966
IS - 2
ER -