In this paper, we propose an effective active learning query strategy for an automatic speech recognition system with the aim of reducing the training cost. Generally, training a deep neural network with supervised learning requires a massive amount of labeled data to obtain excellent performance. However, labeling data is tedious and costly manual work. Active learning can solve this problem by choosing and only annotating informative instances, which presents better results even with less transcribed data. In this approach it is vitally important to accurately select informative samples. Based on the preliminary experiment results that true gradient length has the best performance in terms of measuring sample informativeness in ideal conditions, we propose utilizing both uncertainty and the expected gradient length criterion to approximate the true gradient length using a neural network. The experiment results show that our proposed method is superior to the conventional individual criterion when applied to a phoneme-based speech recognition system, and it has both a faster convergence speed and the greatest loss reduction in both clean and noisy conditions.
|Title of host publication||2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||5|
|Publication status||Published - 2019 May|
|Event||44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom|
Duration: 2019 May 12 → 2019 May 17
|Name||ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings|
|Conference||44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019|
|Period||19/5/12 → 19/5/17|
Bibliographical notePublisher Copyright:
© 2019 IEEE.
All Science Journal Classification (ASJC) codes
- Signal Processing
- Electrical and Electronic Engineering