Saliency prediction on stereoscopic videos

Haksub Kim, Sanghoon Lee, Alan Conrad Bovik

Research output: Contribution to journal › Article

61 Citations (Scopus)

Abstract

We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis)comfort felt when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energy of both segmented objects and frames is weighted using models of human foveation and Panum's fusional area, yielding a single predictor of 3D saliency.
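
The abstract outlines the algorithm's pooling step: per-segment motion, disparity, texture, and predicted-discomfort measures are combined into a saliency strength, which is then weighted by models of foveation and Panum's fusional area. The Python sketch below is a hypothetical illustration of that step only, not the authors' code; the linear feature weights, the half-resolution eccentricity constant, the Gaussian fusional falloff, and every function and parameter name are assumptions made for this example.

import math

def foveation_weight(ecc_deg, half_res_ecc=2.3):
    # Relative visual resolution falls off with retinal eccentricity
    # (degrees); 2.3 deg is a common half-resolution constant in
    # foveation models, not a value taken from this paper.
    return half_res_ecc / (half_res_ecc + ecc_deg)

def panum_weight(disparity_deg, fusional_limit_deg=1.0):
    # Down-weight segments whose disparity approaches the fusional
    # limit; the Gaussian falloff and 1-degree limit are assumptions.
    return math.exp(-0.5 * (disparity_deg / fusional_limit_deg) ** 2)

def segment_saliency(motion, disparity_feat, texture, discomfort,
                     ecc_deg, disparity_deg,
                     feature_weights=(0.3, 0.3, 0.2, 0.2)):
    # Linear pooling of normalized (0-1) features; predicted discomfort
    # reduces saliency energy. Weights are illustrative placeholders.
    wm, wd, wt, wc = feature_weights
    base = wm * motion + wd * disparity_feat + wt * texture - wc * discomfort
    return max(base, 0.0) * foveation_weight(ecc_deg) * panum_weight(disparity_deg)

# Example: a fast-moving, mildly textured segment 5 degrees from the
# gaze point with a comfortable 0.4-degree disparity.
score = segment_saliency(motion=0.8, disparity_feat=0.5, texture=0.6,
                         discomfort=0.2, ecc_deg=5.0, disparity_deg=0.4)
print(round(score, 3))

In the paper these attributes are computed from segmented 3D space-time regions; here they are supplied as scalars purely to show how the two perceptual weights modulate the pooled score.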

Original language: English
Article number: 6728719
Pages (from-to): 1476-1490
Number of pages: 15
Journal: IEEE Transactions on Image Processing
Volume: 23
Issue number: 4
DOI: 10.1109/TIP.2014.2303640
Publication status: Published - 2014 Apr 1

Fingerprint

  • Luminance
  • Textures
  • Cameras
  • Experiments

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Graphics and Computer-Aided Design

Cite this

Kim, Haksub; Lee, Sanghoon; Bovik, Alan Conrad. / Saliency prediction on stereoscopic videos. In: IEEE Transactions on Image Processing. 2014; Vol. 23, No. 4. pp. 1476-1490.
@article{a4fc49c36ceb4c0f92d5ea1803146cda,
title = "Saliency prediction on stereoscopic videos",
abstract = "We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis) comfort felt, when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energy of both segmented objects and frames are weighted using models of human foveation and Panum's fusional area yielding a single predictor of 3D saliency.",
author = "Haksub Kim and Sanghoon Lee and Bovik, {Alan Conrad}",
year = "2014",
month = "4",
day = "1",
doi = "10.1109/TIP.2014.2303640",
language = "English",
volume = "23",
pages = "1476--1490",
journal = "IEEE Transactions on Image Processing",
issn = "1057-7149",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "4",

}

Saliency prediction on stereoscopic videos. / Kim, Haksub; Lee, Sanghoon; Bovik, Alan Conrad.

In: IEEE Transactions on Image Processing, Vol. 23, No. 4, 6728719, 01.04.2014, p. 1476-1490.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Saliency prediction on stereoscopic videos

AU - Kim, Haksub

AU - Lee, Sanghoon

AU - Bovik, Alan Conrad

PY - 2014/4/1

Y1 - 2014/4/1

N2 - We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis)comfort felt when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energy of both segmented objects and frames is weighted using models of human foveation and Panum's fusional area, yielding a single predictor of 3D saliency.

AB - We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis)comfort felt when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energy of both segmented objects and frames is weighted using models of human foveation and Panum's fusional area, yielding a single predictor of 3D saliency.

UR - http://www.scopus.com/inward/record.url?scp=84897678080&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84897678080&partnerID=8YFLogxK

U2 - 10.1109/TIP.2014.2303640

DO - 10.1109/TIP.2014.2303640

M3 - Article

VL - 23

SP - 1476

EP - 1490

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

SN - 1057-7149

IS - 4

M1 - 6728719

ER -