Vision based speaker location detection

Jaehyun Lim, Jonggeun Park, Chulhee Lee

Research output: Contribution to journal › Conference article › peer-review


Speaker location detection in video conferencing is generally audio-based. However, the physical room environment, which is beyond the control of the speaker detection system, can severely alter room acoustics. Room acoustics introduce interference and can degrade the performance of audio-based speaker detection systems. In this paper, we propose a video-based speaker detection method that can be used independently or alongside audio-based detection systems. The speaker location information is intended for 3-dimensional audio reproduction, which makes a video conference more realistic. The proposed method detects moving lips in video sequences: we first detect lips using color information and then determine whether the lips are moving. Experiments with real videos provide promising results.
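The two steps named in the abstract (color-based lip detection, then a motion check) can be illustrated with a minimal sketch. The abstract does not describe the authors' color model or motion criterion, so everything below is an assumption made for illustration: the red-to-green ratio rule, the frame-difference test, both thresholds, and the function names `lip_mask` and `lips_moving` are hypothetical and are not taken from the paper. Frames are plain nested lists of (r, g, b) tuples.

```python
# Illustrative sketch only; not the method from the paper.
# Assumed color rule: a pixel is "lip-like" when r / (r + g) exceeds a threshold.
# Assumed motion rule: mean absolute intensity change inside the lip mask
# between two consecutive frames exceeds a threshold.

def lip_mask(frame, ratio_thresh=0.6):
    """Return a boolean grid marking pixels with a high red share (hypothetical rule)."""
    return [
        [(r / (r + g) > ratio_thresh) if (r + g) > 0 else False
         for (r, g, b) in row]
        for row in frame
    ]


def lips_moving(prev_frame, cur_frame, mask, diff_thresh=10.0):
    """Return True when the average intensity change inside the mask is large."""
    total, count = 0.0, 0
    for prev_row, cur_row, mask_row in zip(prev_frame, cur_frame, mask):
        for prev_px, cur_px, inside in zip(prev_row, cur_row, mask_row):
            if inside:
                # Intensity is approximated by the mean of the three channels.
                total += abs(sum(cur_px) - sum(prev_px)) / 3.0
                count += 1
    return count > 0 and total / count > diff_thresh


def union_mask(mask_a, mask_b):
    """Combine two masks so that pixels that are lip-like in either frame are checked."""
    return [[a or b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]
```

In use, one would compute `lip_mask` on two consecutive frames, combine the results with `union_mask`, and pass the frames and the combined mask to `lips_moving`. A real system would likely restrict the search to a detected face region and smooth the decision over several frames; those refinements are left out here.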

Original language: English
Article number: 102
Pages (from-to): 904-911
Number of pages: 8
Journal: Proceedings of SPIE - The International Society for Optical Engineering
Issue number: PART 2
Publication status: Published - 2005
Event: Proceedings of SPIE-IS and T Electronic Imaging - Image and Video Communications and Processing 2005 - San Jose, CA, United States
Duration: 2005 Jan 18 to 2005 Jan 20

All Science Journal Classification (ASJC) codes

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering


