Selecting feature frames for automatic speaker recognition using mutual information

Chi Sang Jung, Moo Young Kim, Hong Goo Kang

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)


In this paper, an information theoretic approach to selecting feature frames for speaker recognition systems is proposed. A conventional approach in which the frame shift is fixed to around half of the frame length may not be the best choice, because the characteristics of the speech signal may rapidly change, especially at phonetic boundaries. Experimental results show that the recognition accuracy increases if the frame interval is directly controlled using phonetic information. By applying these results to the well-known fact that the recognition accuracy is directly correlated with the amount of mutual information, this paper suggests a novel feature frame selection method for speaker recognition. Specifically, feature frames are chosen to have minimum-redundancy within selected feature frames, but maximum-relevancy to speaker models. It is verified by experiments that the proposed method produces consistent improvement, especially in a speaker verification system. It is also robust against variations in acoustic environment.
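The selection criterion described in the abstract (low redundancy among the chosen frames, high relevance to the speaker model) can be illustrated with a small greedy sketch. This is not the authors' algorithm: the paper's criterion is based on mutual information, whereas the code below assumes that a per-frame relevance score is already available and uses absolute cosine similarity between feature vectors as a stand-in for redundancy. The names `select_frames`, `cosine_similarity`, `frames`, and `relevance` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_frames(frames, relevance, k):
    """Greedily pick up to k frame indices, trading relevance against redundancy.

    frames    -- list of feature vectors, one per candidate frame
    relevance -- assumed per-frame relevance scores (a proxy for relevance
                 to the speaker model; how to compute them is not shown here)
    k         -- number of frames to keep

    Redundancy is approximated by the mean absolute cosine similarity to the
    frames already selected. This is an assumption made for illustration,
    not the mutual-information measure used in the paper.
    """
    if len(frames) != len(relevance):
        raise ValueError("frames and relevance must have the same length")
    k = max(0, min(k, len(frames)))
    selected = []
    remaining = set(range(len(frames)))
    while len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in sorted(remaining):
            if selected:
                redundancy = sum(
                    abs(cosine_similarity(frames[i], frames[j])) for j in selected
                ) / len(selected)
            else:
                redundancy = 0.0
            score = relevance[i] - redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return sorted(selected)


# Toy example: frames 0 and 1 are nearly identical, frame 2 is different.
toy_frames = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
toy_relevance = [1.0, 0.9, 0.5]
chosen = select_frames(toy_frames, toy_relevance, 2)
```

In the toy example, frame 1 has a higher relevance score than frame 2 but is almost a copy of frame 0, so the redundancy penalty causes the sketch to keep frames 0 and 2 instead.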

Original language: English
Article number: 5276841
Pages (from-to): 1332-1340
Number of pages: 9
Journal: IEEE Transactions on Audio, Speech and Language Processing
Issue number: 6
Publication status: Published - 2010

Bibliographical note

Funding Information:
Manuscript received December 01, 2008; revised September 19, 2009. First published October 02, 2009; current version published July 14, 2010. This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the Biometrics Engineering Research Center (BERC) at Yonsei University (R112002105070040 (2009)). The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Haizhou Li.

All Science Journal Classification (ASJC) codes

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering

