PROGRESSIVE MULTI-STAGE NEURAL AUDIO CODING WITH GUIDED REFERENCES

Chanwoo Lee, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong Goo Kang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we propose an effective multi-stage neural audio coding algorithm that encodes full-band audio signals (up to 20 kHz) using an end-to-end training criterion. By predefining several dyadic subband signals as training targets, we progressively encode input audio signals in each stage such that deeper stages of the network encode the residual error terms from the previous encoding stage. Our proposed audio codec successfully decodes full-band audio signals by using an effective multi-stage vector quantization scheme to represent key encoding features extracted in the latent space. Subjective listening tests show that the decoded outputs of the proposed audio codec achieve almost transparent quality at an average bitrate of 132 kbps.
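The residual idea behind multi-stage vector quantization can be illustrated with a small sketch. This is not the authors' implementation: the codebooks below are hypothetical hand-picked toy values (in a neural codec they would be learned, and the vectors would be latent features), and the function names are assumptions made for illustration. Each stage quantizes whatever error the previous stage left behind, and the decoder sums the selected codewords.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Return the index of the codeword closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], x)),
    )


def msvq_encode(x: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize x stage by stage; each stage codes the previous stage's residual."""
    residual = list(x)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # Subtract the chosen codeword so the next stage sees only the remaining error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def msvq_decode(indices: Sequence[int], codebooks: Sequence[Sequence[Vector]]) -> List[float]:
    """Reconstruct by summing the selected codeword from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def squared_error(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


# Hypothetical toy codebooks: a coarse first stage and a finer second stage.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0), (1.0, -1.0)],
    [(0.0, 0.0), (0.25, 0.0), (0.0, 0.25), (-0.25, -0.25)],
]

if __name__ == "__main__":
    x = (1.2, 0.9)
    codes = msvq_encode(x, CODEBOOKS)
    print("indices:", codes)
    print("one-stage error:", squared_error(x, msvq_decode(codes[:1], CODEBOOKS[:1])))
    print("two-stage error:", squared_error(x, msvq_decode(codes, CODEBOOKS)))
```

Only the per-stage indices need to be transmitted, so the bit cost of this toy scheme is the sum of the index sizes across stages (here 2 bits per stage for 4-entry codebooks).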

Original language: English
Title of host publication: 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 876-880
Number of pages: 5
ISBN (Electronic): 9781665405409
Publication status: Published - 2022
Event: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Virtual, Online, Singapore
Duration: 2022 May 23 to 2022 May 27

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2022-May
ISSN (Print): 1520-6149

Conference

Conference: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Country/Territory: Singapore
City: Virtual, Online
Period: 22/5/23 to 22/5/27

Bibliographical note

Funding Information:
This work was supported by Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government. [21ZH1200, The research of the basic media contents technologies]

Publisher Copyright:
© 2022 IEEE

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
