Audio-visual synchronization recovery in multimedia content

Jong Seok Lee, Touradj Ebrahimi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

This paper proposes a method for recovering the audio-visual synchronization of multimedia content. The method exploits the correlation between the acoustic and visual signals to estimate the audio-visual drift present in the content. The audio signal is shifted relative to the visual signal, and the drift is estimated as the shift that produces the maximal audio-visual correlation. We consider two correlation measures, mutual information and canonical correlation, and compare their performance. Experimental results demonstrate that the method using canonical correlation is effective in recovering audio-visual synchronization for both speech and non-speech sequences.
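The shift-and-search idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which audio or visual features are used, so the sketch assumes one scalar feature per frame for each modality. Under that assumption the first canonical correlation reduces to the absolute Pearson correlation, which is what the code computes. The names `pearson`, `estimate_drift`, and `max_shift`, and the synthetic data, are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def estimate_drift(audio, visual, max_shift):
    """Search shifts d in [-max_shift, max_shift] and return (d, score) such that
    |corr(audio[t + d], visual[t])| is maximal.

    Assumption: scalar per-frame features, so the canonical correlation
    equals the absolute Pearson correlation.
    """
    n = min(len(audio), len(visual))
    best_shift, best_score = 0, -1.0
    for d in range(-max_shift, max_shift + 1):
        # Range of t for which both t and t + d are valid frame indices.
        lo, hi = max(0, -d), min(n, n - d)
        if hi - lo < 2:
            continue
        score = abs(pearson(audio[lo + d:hi + d], visual[lo:hi]))
        if score > best_score:
            best_shift, best_score = d, score
    return best_shift, best_score


# Synthetic example (hypothetical data): the audio feature lags the
# visual feature by 5 frames and carries a little additive noise.
rng = random.Random(0)
true_drift = 5
visual = [rng.gauss(0.0, 1.0) for _ in range(300)]
audio = [
    (visual[t - true_drift] if t >= true_drift else 0.0) + 0.1 * rng.gauss(0.0, 1.0)
    for t in range(300)
]

shift, score = estimate_drift(audio, visual, max_shift=20)
print(shift, round(score, 3))
```

On this synthetic input the search should recover the 5-frame lag. With multi-dimensional features the scalar correlation would have to be replaced by a full canonical correlation analysis, and the mutual information measure compared in the paper is not sketched here.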

Original language: English
Title of host publication: 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings
Pages: 2280-2283
Number of pages: 4
DOI: https://doi.org/10.1109/ICASSP.2011.5946937
Publication status: Published - 2011 Aug 18
Event: 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Prague, Czech Republic
Duration: 2011 May 22 → 2011 May 27

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Fingerprint

Synchronization
Recovery
Acoustics

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Lee, J. S., & Ebrahimi, T. (2011). Audio-visual synchronization recovery in multimedia content. In 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings (pp. 2280-2283). [5946937] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2011.5946937
@inproceedings{8cfe398e4ca54361994b09c55fe9f3ab,
title = "Audio-visual synchronization recovery in multimedia content",
author = "Lee, {Jong Seok} and Touradj Ebrahimi",
year = "2011",
month = "8",
day = "18",
doi = "10.1109/ICASSP.2011.5946937",
language = "English",
isbn = "9781457705397",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
pages = "2280--2283",
booktitle = "2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings",

}

TY - GEN

T1 - Audio-visual synchronization recovery in multimedia content

AU - Lee, Jong Seok

AU - Ebrahimi, Touradj

PY - 2011/8/18

Y1 - 2011/8/18

UR - http://www.scopus.com/inward/record.url?scp=80051618714&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80051618714&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2011.5946937

DO - 10.1109/ICASSP.2011.5946937

M3 - Conference contribution

AN - SCOPUS:80051618714

SN - 9781457705397

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 2280

EP - 2283

BT - 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings

ER -
