Parametric-based non-intrusive speech quality assessment by deep neural network

Haemin Yang, Kyungguen Byun, Hong Goo Kang, Youngsu Kwak

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

This paper proposes a deep neural network (DNN)-based non-intrusive speech quality estimation method for real-time voice communication systems. Because the proposed method uses only real-time transport control protocol (RTCP) information on the receiver side and does not need a reference signal, it can continuously monitor the quality of service (QoS). Unlike the conventional non-intrusive E-model, which predicts QoS from delay, jitter, and codec type with a rule-based method, the proposed method uses a DNN to estimate the non-linear relationship between multi-dimensional RTCP parameters and subjectively motivated reference scores. To select efficient features, the relationship between each RTCP parameter and the perceptual objective listening quality assessment (POLQA) score is investigated, and the DNN model is then trained while varying the number of layers and nodes. The proposed algorithm achieved a correlation of 0.8693 with 21,206 reference POLQA scores sampled from a real environment.
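The pipeline the abstract outlines (network-level parameters in, a quality score out, evaluated by correlation against reference scores) can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the three features (jitter, loss fraction, delay), the synthetic target formula that stands in for POLQA scores, the single 16-unit tanh hidden layer, and the training settings. It uses NumPy only and reports the Pearson correlation on held-out synthetic data.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))


def make_synthetic(n, rng):
    """Hypothetical RTCP-style features and a made-up quality score (not POLQA)."""
    jitter = rng.uniform(0.0, 50.0, n)    # ms
    loss = rng.uniform(0.0, 0.2, n)       # fraction of packets lost
    delay = rng.uniform(20.0, 400.0, n)   # ms
    X = np.stack([jitter, loss, delay], axis=1)
    # Arbitrary smooth non-linear mapping, chosen only so there is something to learn.
    y = 1.0 + 3.5 * np.exp(-8.0 * loss) * np.exp(-jitter / 60.0) / (1.0 + (delay / 300.0) ** 2)
    y = y + rng.normal(0.0, 0.1, n)       # measurement noise
    return X, y


def train_mlp(X, y, hidden=16, lr=0.05, epochs=3000, seed=0):
    """Fit a one-hidden-layer tanh regressor with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    mu, sd = X.mean(0), X.std(0)
    y_mu, y_sd = y.mean(), y.std()
    Xs, ys = (X - mu) / sd, (y - y_mu) / y_sd   # standardize inputs and target
    W1 = rng.normal(0.0, 0.3, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.3, hidden)
    b2 = 0.0
    n = len(ys)
    for _ in range(epochs):
        h = np.tanh(Xs @ W1 + b1)                    # hidden activations, (n, hidden)
        err = h @ W2 + b2 - ys                       # residuals, (n,)
        g_h = np.outer(err, W2) * (1.0 - h ** 2) / n  # gradient at hidden pre-activations
        # Gradients of 0.5 * mean(err**2) with respect to each parameter.
        W2 -= lr * (h.T @ err) / n
        b2 -= lr * err.mean()
        W1 -= lr * (Xs.T @ g_h)
        b1 -= lr * g_h.sum(0)

    def predict(X_new):
        h = np.tanh(((X_new - mu) / sd) @ W1 + b1)
        return (h @ W2 + b2) * y_sd + y_mu

    return predict


def run_demo():
    """Train on synthetic data and return the held-out Pearson correlation."""
    rng = np.random.default_rng(42)
    X_train, y_train = make_synthetic(2000, rng)
    X_test, y_test = make_synthetic(500, rng)
    predict = train_mlp(X_train, y_train)
    return pearson(predict(X_test), y_test)


if __name__ == "__main__":
    print(f"held-out Pearson correlation: {run_demo():.4f}")
```

The correlation printed by this sketch reflects only the synthetic data and says nothing about the 0.8693 figure reported in the abstract, which was measured against real POLQA scores.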

Original language: English
Title of host publication: Proceedings - 2016 IEEE International Conference on Digital Signal Processing, DSP 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 99-103
Number of pages: 5
ISBN (Electronic): 9781509041657
DOI: https://doi.org/10.1109/ICDSP.2016.7868524
Publication status: Published - 2016 Jul 2
Event: 2016 IEEE International Conference on Digital Signal Processing, DSP 2016 - Beijing, China
Duration: 2016 Oct 16 to 2016 Oct 18

Publication series

Name: International Conference on Digital Signal Processing, DSP
Volume: 0

Other

Other: 2016 IEEE International Conference on Digital Signal Processing, DSP 2016
Country: China
City: Beijing
Period: 16/10/16 to 16/10/18

Fingerprint

Real time control
Quality of service
Speech communication
Jitter
Communication systems
Deep neural networks

All Science Journal Classification (ASJC) codes

  • Signal Processing

Cite this

Yang, H., Byun, K., Kang, H. G., & Kwak, Y. (2016). Parametric-based non-intrusive speech quality assessment by deep neural network. In Proceedings - 2016 IEEE International Conference on Digital Signal Processing, DSP 2016 (pp. 99-103). [7868524] (International Conference on Digital Signal Processing, DSP; Vol. 0). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDSP.2016.7868524
@inproceedings{5fa7502f93e142ffb5d4aefd09aade1f,
title = "Parametric-based non-intrusive speech quality assessment by deep neural network",
author = "Haemin Yang and Kyungguen Byun and Kang, {Hong Goo} and Youngsu Kwak",
year = "2016",
month = "7",
day = "2",
doi = "10.1109/ICDSP.2016.7868524",
language = "English",
series = "International Conference on Digital Signal Processing, DSP",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "99--103",
booktitle = "Proceedings - 2016 IEEE International Conference on Digital Signal Processing, DSP 2016",
address = "United States",

}

TY - GEN

T1 - Parametric-based non-intrusive speech quality assessment by deep neural network

AU - Yang, Haemin

AU - Byun, Kyungguen

AU - Kang, Hong Goo

AU - Kwak, Youngsu

PY - 2016/7/2

Y1 - 2016/7/2

UR - http://www.scopus.com/inward/record.url?scp=85016172010&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85016172010&partnerID=8YFLogxK

U2 - 10.1109/ICDSP.2016.7868524

DO - 10.1109/ICDSP.2016.7868524

M3 - Conference contribution

AN - SCOPUS:85016172010

T3 - International Conference on Digital Signal Processing, DSP

SP - 99

EP - 103

BT - Proceedings - 2016 IEEE International Conference on Digital Signal Processing, DSP 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -