Speaker adaptation based on a maximum observation probability criterion

Tae Young Yang, Chungyong Lee, Dae Hee Youn

Research output: Contribution to journal (Article)

Abstract

A speaker adaptation technique that maximizes the observation probability of the input speech is proposed and applied to semi-continuous hidden Markov model (SCHMM) speech recognizers. The proposed algorithm adapts the mean μ and the covariance Σ iteratively by a gradient search technique so that the features of the adaptation speech data achieve maximum observation probabilities. The mixture coefficients and the state transition probabilities are adapted by a model interpolation scheme. The main advantage of this scheme is that the means and the variances, which are common to all states in an SCHMM, are adapted independently of the other SCHMM parameters. This allows fast and precise adaptation, especially when there is a large acoustic mismatch between the reference model and a new speaker. The scheme could also be adopted in other areas that use a codebook. The proposed adaptation algorithm was evaluated with a male speaker-dependent, a female speaker-dependent, and a speaker-independent recognizer. Experimental results on isolated word recognition showed that the proposed algorithm achieved an average enhancement of 46.03% in the male speaker-dependent recognizer, 52.18% in the female speaker-dependent recognizer, and 9.84% in the speaker-independent recognizer.
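The abstract does not give the update equations, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes diagonal covariances, a fixed step size, a log-variance parametrization to keep variances positive, and a single set of mixture weights shared by all frames. All function names, the step size `step`, and the interpolation weight `lam` are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def posteriors(x, weights, means, variances):
    """Return (per-codeword posteriors, log observation probability) for one frame."""
    logs = [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # log-sum-exp for numerical stability
    total = top + math.log(sum(math.exp(l - top) for l in logs))
    return [math.exp(l - total) for l in logs], total

def log_likelihood(frames, weights, means, variances):
    """Sum of log observation probabilities over all adaptation frames."""
    return sum(posteriors(x, weights, means, variances)[1] for x in frames)

def adapt_codebook(frames, weights, means, variances, step=0.05, iters=50):
    """Gradient ascent on the average log observation probability (assumed form)."""
    means = [list(m) for m in means]
    variances = [list(v) for v in variances]
    n = len(frames)
    for _ in range(iters):
        g_mean = [[0.0] * len(m) for m in means]
        g_logvar = [[0.0] * len(m) for m in means]
        for x in frames:
            gamma, _ = posteriors(x, weights, means, variances)
            for k, g in enumerate(gamma):
                for d, xi in enumerate(x):
                    diff = xi - means[k][d]
                    # Gradients of log p(x) w.r.t. the mean and the log-variance.
                    g_mean[k][d] += g * diff / variances[k][d]
                    g_logvar[k][d] += g * 0.5 * (diff ** 2 / variances[k][d] - 1.0)
        for k in range(len(means)):
            for d in range(len(means[k])):
                means[k][d] += step * g_mean[k][d] / n
                variances[k][d] *= math.exp(step * g_logvar[k][d] / n)
    return means, variances

def interpolate(reference, adapted, lam=0.5):
    """Model interpolation: convex blend of two probability vectors."""
    return [lam * a + (1.0 - lam) * r for r, a in zip(reference, adapted)]

# Toy usage: a two-codeword reference codebook and frames shifted by about +1.
weights = [0.5, 0.5]
ref_means = [[0.0], [4.0]]
ref_vars = [[1.0], [1.0]]
frames = [[1.0], [1.5], [0.5], [5.0], [5.5], [4.5]]
new_means, new_vars = adapt_codebook(frames, weights, ref_means, ref_vars)
new_weights = interpolate(weights, [0.9, 0.1], lam=0.5)
```

Because the codebook means and variances are updated without touching the per-state mixture coefficients or transition probabilities, the two steps stay decoupled, which mirrors the independence the abstract describes.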

Original language: English
Pages (from-to): 286-288
Number of pages: 3
Journal: IEICE Transactions on Information and Systems
Volume: E84-D
Issue number: 2
Publication status: Published - 2001 Jan 1

Fingerprint

Hidden Markov models
Interpolation
Acoustics

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence

Cite this

@article{ac9a513d7fe34e358860993a5d80165f,
title = "Speaker adaptation based on a maximum observation probability criterion",
abstract = "A speaker adaptation technique that maximizes the observation probability of the input speech is proposed and applied to semi-continuous hidden Markov model (SCHMM) speech recognizers. The proposed algorithm adapts the mean $\mu$ and the covariance $\Sigma$ iteratively by a gradient search technique so that the features of the adaptation speech data achieve maximum observation probabilities. The mixture coefficients and the state transition probabilities are adapted by a model interpolation scheme. The main advantage of this scheme is that the means and the variances, which are common to all states in an SCHMM, are adapted independently of the other SCHMM parameters. This allows fast and precise adaptation, especially when there is a large acoustic mismatch between the reference model and a new speaker. The scheme could also be adopted in other areas that use a codebook. The proposed adaptation algorithm was evaluated with a male speaker-dependent, a female speaker-dependent, and a speaker-independent recognizer. Experimental results on isolated word recognition showed that the proposed algorithm achieved an average enhancement of 46.03{\%} in the male speaker-dependent recognizer, 52.18{\%} in the female speaker-dependent recognizer, and 9.84{\%} in the speaker-independent recognizer.",
author = "Yang, {Tae Young} and Chungyong Lee and Youn, {Dae Hee}",
year = "2001",
month = "1",
day = "1",
language = "English",
volume = "E84-D",
pages = "286--288",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "2",

}

Speaker adaptation based on a maximum observation probability criterion. / Yang, Tae Young; Lee, Chungyong; Youn, Dae Hee.

In: IEICE Transactions on Information and Systems, Vol. E84-D, No. 2, 01.01.2001, p. 286-288.

TY - JOUR

T1 - Speaker adaptation based on a maximum observation probability criterion

AU - Yang, Tae Young

AU - Lee, Chungyong

AU - Youn, Dae Hee

PY - 2001/1/1

Y1 - 2001/1/1

N2 - A speaker adaptation technique that maximizes the observation probability of the input speech is proposed and applied to semi-continuous hidden Markov model (SCHMM) speech recognizers. The proposed algorithm adapts the mean μ and the covariance Σ iteratively by a gradient search technique so that the features of the adaptation speech data achieve maximum observation probabilities. The mixture coefficients and the state transition probabilities are adapted by a model interpolation scheme. The main advantage of this scheme is that the means and the variances, which are common to all states in an SCHMM, are adapted independently of the other SCHMM parameters. This allows fast and precise adaptation, especially when there is a large acoustic mismatch between the reference model and a new speaker. The scheme could also be adopted in other areas that use a codebook. The proposed adaptation algorithm was evaluated with a male speaker-dependent, a female speaker-dependent, and a speaker-independent recognizer. Experimental results on isolated word recognition showed that the proposed algorithm achieved an average enhancement of 46.03% in the male speaker-dependent recognizer, 52.18% in the female speaker-dependent recognizer, and 9.84% in the speaker-independent recognizer.

UR - http://www.scopus.com/inward/record.url?scp=0034830454&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034830454&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0034830454

VL - E84-D

SP - 286

EP - 288

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

SN - 0916-8532

IS - 2

ER -