TY - GEN
T1 - Parametric-based non-intrusive speech quality assessment by deep neural network
AU - Yang, Haemin
AU - Byun, Kyungguen
AU - Kang, Hong Goo
AU - Kwak, Youngsu
PY - 2016/7/2
Y1 - 2016/7/2
N2 - This paper proposes a deep neural network (DNN) based non-intrusive speech quality estimation method for real-time voice communication systems. Since the proposed method only utilizes real-time control protocol (RTCP) information at the receiver side and does not need a reference signal, it is possible to continuously monitor the quality of service (QoS). Unlike the conventional non-intrusive E-model system that predicts QoS by utilizing delay, jitter, and type of codec with a rule-based method, the proposed method actively estimates the non-linear relationship between multi-dimensional parameters of RTCP and subjectively motivated reference scores using a DNN structure. In order to select efficient features, the relationship between each parameter of RTCP and perceptual objective listening quality assessment (POLQA) is thoroughly investigated, and the DNN model is then trained while varying the number of layers and nodes. The proposed algorithm achieved 0.8693 correlation with 21,206 reference POLQA scores that are sampled from a real environment.
AB - This paper proposes a deep neural network (DNN) based non-intrusive speech quality estimation method for real-time voice communication systems. Since the proposed method only utilizes real-time control protocol (RTCP) information at the receiver side and does not need a reference signal, it is possible to continuously monitor the quality of service (QoS). Unlike the conventional non-intrusive E-model system that predicts QoS by utilizing delay, jitter, and type of codec with a rule-based method, the proposed method actively estimates the non-linear relationship between multi-dimensional parameters of RTCP and subjectively motivated reference scores using a DNN structure. In order to select efficient features, the relationship between each parameter of RTCP and perceptual objective listening quality assessment (POLQA) is thoroughly investigated, and the DNN model is then trained while varying the number of layers and nodes. The proposed algorithm achieved 0.8693 correlation with 21,206 reference POLQA scores that are sampled from a real environment.
UR - http://www.scopus.com/inward/record.url?scp=85016172010&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85016172010&partnerID=8YFLogxK
U2 - 10.1109/ICDSP.2016.7868524
DO - 10.1109/ICDSP.2016.7868524
M3 - Conference contribution
T3 - International Conference on Digital Signal Processing, DSP
SP - 99
EP - 103
BT - Proceedings - 2016 IEEE International Conference on Digital Signal Processing, DSP 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 IEEE International Conference on Digital Signal Processing, DSP 2016
Y2 - 16 October 2016 through 18 October 2016
ER -