A maximum a Posterior-based reconstruction approach to speech bandwidth expansion in noise

Hyunson Seo, Hong Goo Kang, Frank Soong

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Citations (Scopus)

Abstract

We propose a novel bandwidth expansion algorithm that extends a narrowband speech signal to wideband by exploiting segment examples pre-stored in a speaker-independent database. Both narrowband and wideband representations of the speech signals are stored in the corpus and are dynamically chopped into variable-length segments. Narrowband segments are used to explain a given narrowband input sentence, while the wideband version of the input sentence is constructed from the corresponding wideband segments. The narrowband matching process uses a maximum a posteriori (MAP) criterion that favors longer segment patches. As a result, the number of competing choices in the matching process is significantly reduced during decoding. The approach is further generalized to handle noise-corrupted narrowband input by incorporating the well-known Vector Taylor Series (VTS) noise adaptation algorithm into the matching and bandwidth expansion process. A series of experiments validates the approach on both clean and noise-corrupted narrowband speech, with car noise and babble noise as the corrupting conditions.
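The abstract does not give the feature representation, the likelihood model, or the search procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes (hypothetically) that the corpus is a single sequence of time-aligned narrowband/wideband feature frames, that frames are scored with a unit-variance Gaussian, and that a fixed `switch_penalty` stands in for a prior that favors longer segments. The helper `vts_adapt_mean` assumes log-spectral features and no channel term, and evaluates only the zeroth-order term of the mismatch function that VTS linearizes. All function and parameter names are illustrative.

```python
import numpy as np


def vts_adapt_mean(clean_logspec, noise_logspec):
    """Zeroth-order VTS term: y = x + log(1 + exp(n - x)) in the log-spectral domain.

    Assumption: additive noise only, no channel term.
    """
    return np.logaddexp(clean_logspec, noise_logspec)


def expand_bandwidth(nb_input, nb_corpus, wb_corpus, switch_penalty=5.0):
    """Explain nb_input with contiguous corpus runs; return paired wideband frames.

    nb_input:  (T, D) narrowband input frames
    nb_corpus: (N, D) narrowband corpus frames
    wb_corpus: (N, E) wideband frames aligned one-to-one with nb_corpus
    switch_penalty: cost of starting a new segment (assumed stand-in for the prior)
    """
    T, N = len(nb_input), len(nb_corpus)
    # Frame log-likelihoods under a unit-variance Gaussian, up to a constant.
    diff = nb_input[:, None, :] - nb_corpus[None, :, :]
    loglik = -0.5 * np.sum(diff ** 2, axis=2)  # shape (T, N)

    score = loglik[0].copy()
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        # Option 1: jump from the best previous state and pay the penalty.
        best_prev = int(np.argmax(score))
        jump = score[best_prev] - switch_penalty
        # Option 2: continue the current segment (corpus frame j-1 -> j) for free.
        cont = np.full(N, -np.inf)
        cont[1:] = score[:-1]
        use_cont = cont >= jump
        back[t] = np.where(use_cont, np.arange(N) - 1, best_prev)
        score = np.where(use_cont, cont, jump) + loglik[t]

    # Backtrack the best-scoring sequence of corpus frame indices.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return wb_corpus[path], path
```

Under these assumptions, a noisy input could be handled by passing `vts_adapt_mean(nb_corpus, noise_mean)` as the narrowband corpus while keeping the clean `wb_corpus` for reconstruction. A larger `switch_penalty` yields fewer, longer segments; a real system would also prevent segments from running across utterance boundaries in the corpus, which this sketch ignores.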

Original language: English
Title of host publication: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6087-6091
Number of pages: 5
ISBN (Print): 9781479928927
DOI: https://doi.org/10.1109/ICASSP.2014.6854773
Publication status: Published - 2014 Jan 1
Event: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
Duration: 2014 May 4 to 2014 May 9

Other

Other: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Country: Italy
City: Florence
Period: 14/5/4 to 14/5/9

Fingerprint

Acoustic noise
Bandwidth
Taylor series
Decoding
Experiments

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Seo, H., Kang, H. G., & Soong, F. (2014). A maximum a Posterior-based reconstruction approach to speech bandwidth expansion in noise. In 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 (pp. 6087-6091). [6854773] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2014.6854773
@inproceedings{a0d4fcfbd7c44c8b8395ed30c8ef1337,
title = "A maximum a Posterior-based reconstruction approach to speech bandwidth expansion in noise",
author = "Hyunson Seo and Kang, {Hong Goo} and Frank Soong",
year = "2014",
month = "1",
day = "1",
doi = "10.1109/ICASSP.2014.6854773",
language = "English",
isbn = "9781479928927",
pages = "6087--6091",
booktitle = "2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - A maximum a Posterior-based reconstruction approach to speech bandwidth expansion in noise

AU - Seo, Hyunson

AU - Kang, Hong Goo

AU - Soong, Frank

PY - 2014/1/1

Y1 - 2014/1/1

UR - http://www.scopus.com/inward/record.url?scp=84905226964&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84905226964&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2014.6854773

DO - 10.1109/ICASSP.2014.6854773

M3 - Conference contribution

SN - 9781479928927

SP - 6087

EP - 6091

BT - 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014

PB - Institute of Electrical and Electronics Engineers Inc.

ER -
