On the study of noise allocation for speech signal in low bit-rate audio coding

Chang Heon Lee, Hyen O. Oh, Hong Goo Kang

Research output: Contribution to journal › Article

Abstract

This letter proposes a new masking threshold adjustment method that improves the quality of speech signals in low bit-rate audio coding. The Enhanced aacPlus (EAAC) audio codec raises the masking threshold of all frequency bands to suit the given encoding rate by considering only noise of equal loudness, which is a representative way of implementing the adjustment. The proposed method instead adjusts the masking threshold of each frequency band dynamically, based on the ratio of that band's energy to the average band energy. More quantization noise is allowed in formant regions, which have relatively large energy ratios, and less distortion is allowed in spectral valley regions, which enhances the perceptual quality of speech signals. The idea reflects the spectral weighting criterion used to search for optimal excitation codebooks in many speech coding algorithms. Simulation results confirm that the proposed method, implemented on the EAAC coder, improves the quality of speech input at the same bit rate while maintaining equivalent quality for music content.
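The contrast between the two adjustment strategies can be illustrated with a minimal sketch. The abstract does not give the letter's actual formula, so everything below is an assumption made for illustration: the function names `uniform_adjust` and `ratio_adjust`, the dB-domain thresholds, the global `offset_db`, and the scaling factor `alpha` are all hypothetical. The sketch only shows the stated direction of the effect, with thresholds raised further in high-energy (formant-like) bands and less in low-energy (valley-like) bands.

```python
import math

EPS = 1e-12  # guards against log(0) for silent bands


def uniform_adjust(thresholds_db, offset_db):
    """Raise every band's masking threshold by the same amount (in dB).

    Stands in for a uniform, all-band adjustment; not the EAAC reference code.
    """
    return [t + offset_db for t in thresholds_db]


def ratio_adjust(thresholds_db, band_energies, offset_db, alpha=0.5):
    """Hypothetical per-band adjustment driven by the band-to-average energy ratio.

    Bands above the average energy receive a larger threshold increase (more
    quantization noise allowed); bands below it receive a smaller one.
    `alpha` is an assumed tuning constant, not a value from the letter.
    """
    mean_energy = sum(band_energies) / len(band_energies)
    adjusted = []
    for t, e in zip(thresholds_db, band_energies):
        ratio_db = 10.0 * math.log10((e + EPS) / (mean_energy + EPS))
        adjusted.append(t + offset_db + alpha * ratio_db)
    return adjusted


if __name__ == "__main__":
    # Four made-up bands: a strong formant-like band, two valleys, one mid band.
    thresholds = [10.0, 10.0, 10.0, 10.0]
    energies = [100.0, 1.0, 10.0, 1.0]
    print("uniform:", uniform_adjust(thresholds, 6.0))
    print("ratio  :", ratio_adjust(thresholds, energies, 6.0))
```

In an actual encoder the global offset would presumably be searched so that the quantized frame meets the bit budget; that rate loop is omitted here.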

Original language: English
Article number: 2025982
Pages (from-to): 849-852
Number of pages: 4
Journal: IEEE Signal Processing Letters
Volume: 16
Issue number: 10
DOIs: 10.1109/LSP.2009.2025982
Publication status: Published - 2009 Dec 1

Fingerprint

Speech Signal
Masking
Coding
Frequency bands
Speech coding
Adjustment
Energy
Band structure
Codebook
Music
Weighting
Quantization
Encoding
Excitation
Simulation

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Electrical and Electronic Engineering
  • Applied Mathematics

Cite this

@article{40cbaf86a476497197f37eb5bbd469c2,
title = "On the study of noise allocation for speech signal in low bit-rate audio coding",
author = "Lee, {Chang Heon} and Oh, {Hyen O.} and Kang, {Hong Goo}",
year = "2009",
month = "12",
day = "1",
doi = "10.1109/LSP.2009.2025982",
language = "English",
volume = "16",
pages = "849--852",
journal = "IEEE Signal Processing Letters",
issn = "1070-9908",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "10",

}

On the study of noise allocation for speech signal in low bit-rate audio coding. / Lee, Chang Heon; Oh, Hyen O.; Kang, Hong Goo.

In: IEEE Signal Processing Letters, Vol. 16, No. 10, 2025982, 01.12.2009, p. 849-852.

TY - JOUR

T1 - On the study of noise allocation for speech signal in low bit-rate audio coding

AU - Lee, Chang Heon

AU - Oh, Hyen O.

AU - Kang, Hong Goo

PY - 2009/12/1

Y1 - 2009/12/1

N2 - This letter proposes a new masking threshold adjustment method to improve the quality for the speech signals in low bit-rate audio coding. The Enhanced aacPlus (EAAC) audio codec increases the masking threshold of all frequency bands to be suitable for the given encoding rate by considering equal loudness noises only, which is a representative way for implementing the adjustment technique. The proposed method, however, dynamically adjusts the masking threshold of each frequency band based on the energy ratio of each band to the average band energy. More quantization noises are added to formant regions that have relatively large energy ratio values, but less distortion is allowed in spectral valley regions, which eventually helps to enhance perceptual quality for speech signals. The proposed idea reflects the spectral weighting criterion in searching optimal excitation codebooks used in many speech coding algorithms. Simulation results confirm that the proposed method implemented on the EAAC coder improves quality for the speech input signals at the same bit-rate while keeping equivalent quality for music contents.

UR - http://www.scopus.com/inward/record.url?scp=79959420912&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959420912&partnerID=8YFLogxK

U2 - 10.1109/LSP.2009.2025982

DO - 10.1109/LSP.2009.2025982

M3 - Article

AN - SCOPUS:79959420912

VL - 16

SP - 849

EP - 852

JO - IEEE Signal Processing Letters

JF - IEEE Signal Processing Letters

SN - 1070-9908

IS - 10

M1 - 2025982

ER -