Mean normalization of power function based cepstral coefficients for robust speech recognition in noisy environment

Soonho Baek, Hong Goo Kang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

This paper examines the effect of mean normalization on various types of cepstral coefficients for robust speech recognition in noisy environments. Although the cepstral mean normalization (CMN) technique was originally designed to compensate for channel distortion, it has also been shown to improve recognition accuracy in additive-noise environments. However, the interaction of CMN with the spectral mapping functions required for extracting cepstral features has not yet been considered. This paper investigates how the impact of CMN on a speech recognition system depends on the type of spectral mapping function, by mathematically analyzing the amount of spectral distortion between clean and noisy conditions. The analytic result is confirmed by comparing the recognition error patterns in automatic speech recognition experiments on the Aurora 2 database. Experimental results show that the performance improvement from adopting CMN becomes significant when the logarithmic function is replaced with an appropriately configured fractional power mapping function. In particular, deletion errors are dramatically reduced.
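
As a rough illustration of the two spectral mapping functions the abstract contrasts, the sketch below computes cepstral coefficients from filterbank energies with either a logarithmic or a fractional power mapping and then applies CMN. It is not the authors' implementation: the function names, the unnormalized DCT-II, the 13 coefficients, the exponent 0.1, and the synthetic energies are all assumptions chosen for illustration, since the abstract specifies none of them.

```python
import numpy as np


def dct_ii(x, n_ceps):
    """Unnormalized DCT-II along the last axis, keeping the first n_ceps terms."""
    n_bands = x.shape[-1]
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_bands)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n_bands)  # shape (n_ceps, n_bands)
    return x @ basis.T


def cepstra(filterbank_energies, mapping="log", power=0.1, n_ceps=13):
    """Map filterbank energies (frames x bands) to cepstral coefficients.

    mapping="log" is the conventional choice; mapping="power" raises the
    energies to a fractional exponent instead. The default exponent 0.1 is
    a hypothetical value, not one taken from the paper.
    """
    e = np.maximum(filterbank_energies, 1e-10)  # guard against log(0)
    if mapping == "log":
        mapped = np.log(e)
    elif mapping == "power":
        mapped = e ** power
    else:
        raise ValueError(f"unknown mapping: {mapping!r}")
    return dct_ii(mapped, n_ceps)


def cmn(coeffs):
    """Cepstral mean normalization: subtract the per-utterance mean of each coefficient."""
    return coeffs - coeffs.mean(axis=0, keepdims=True)


# Synthetic stand-in for filterbank energies: 200 frames, 23 bands (assumed sizes).
rng = np.random.default_rng(0)
energies = rng.uniform(0.1, 10.0, size=(200, 23))
channel = rng.uniform(0.5, 2.0, size=(1, 23))  # fixed per-band channel gain

log_clean = cmn(cepstra(energies, "log"))
log_channel = cmn(cepstra(energies * channel, "log"))
pow_clean = cmn(cepstra(energies, "power"))
pow_channel = cmn(cepstra(energies * channel, "power"))

print("log mapping, channel removed by CMN:  ", np.allclose(log_clean, log_channel))
print("power mapping, channel removed by CMN:", np.allclose(pow_clean, pow_channel))
```

With the logarithmic mapping, a fixed per-band channel gain becomes an additive constant that CMN subtracts exactly. With a power mapping the gain stays multiplicative, so CMN no longer cancels it exactly. The sketch shows only this structural difference and does not reproduce the additive-noise analysis or the Aurora 2 results described in the abstract.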

Original language: English
Title of host publication: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1735-1739
Number of pages: 5
ISBN (Print): 9781479928927
DOIs: https://doi.org/10.1109/ICASSP.2014.6853895
Publication status: Published - 2014
Event: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
Duration: 2014 May 4 to 2014 May 9

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Country: Italy
City: Florence
Period: 14/5/4 to 14/5/9

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering


Cite this

Baek, S., & Kang, H. G. (2014). Mean normalization of power function based cepstral coefficients for robust speech recognition in noisy environment. In 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 (pp. 1735-1739). [6853895] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2014.6853895