Audiovisual focus of attention and its application to Ultra High Definition video compression

Martin Rerabek, Hiromi Nemoto, Jong-Seok Lee, Touradj Ebrahimi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Using Focus of Attention (FoA) as a perceptual process in image and video compression is a well-known approach to increasing coding efficiency. It has been shown that foveated coding, in which compression quality varies across the image according to the region of interest, is more efficient than conventional coding, in which all regions are compressed in a similar way. However, widespread use of foveated compression has been held back by two conflicting factors: the complexity and the efficiency of algorithms for FoA detection. One way around this conflict is to use as much information from the scene as possible. Since most video sequences have associated audio, and in many cases the audio and the visual content are correlated, audiovisual FoA can improve the efficiency of the detection algorithm while keeping its complexity low. This paper discusses a simple yet efficient audiovisual FoA algorithm based on the correlation of dynamics between audio and video signal components. The results of the audiovisual FoA detection algorithm are subsequently used for foveated coding and compression. The approach is implemented in an H.265/HEVC encoder, producing a bitstream that can be decoded by any standard-compliant H.265/HEVC decoder. The influence of audiovisual FoA on the perceived quality of high and ultra-high definition audiovisual sequences is explored, and the gain in compression efficiency is analyzed.
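
The abstract does not say which audio and video features are correlated, nor how the FoA map drives the encoder, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a per-frame audio energy sequence and a per-block motion magnitude sequence, scores each block by the Pearson correlation between the two, and maps high scores to a lower quantization parameter (QP). The function names, the feature choice, and the `base_qp` and `max_offset` values are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def foa_scores(audio_energy, block_motion):
    """Score each block by how well its motion follows the audio dynamics.

    audio_energy: per-frame audio energy (assumed feature).
    block_motion: dict mapping block index -> per-frame motion magnitude (assumed feature).
    Negative correlations are clipped to 0, so scores lie in [0, 1].
    """
    return {b: max(0.0, pearson(audio_energy, m)) for b, m in block_motion.items()}


def qp_map(scores, base_qp=32, max_offset=6):
    """Turn FoA scores into per-block QPs: attended blocks get finer quantization.

    A score of 1 yields base_qp - max_offset, a score of 0 yields base_qp + max_offset.
    The result is clamped to 0..51, the QP range for 8-bit H.265/HEVC video.
    """
    qps = {}
    for b, s in scores.items():
        qp = base_qp + round(max_offset * (1 - 2 * s))
        qps[b] = min(51, max(0, qp))
    return qps


# Toy input: four frames, three blocks.
audio_energy = [0.1, 0.9, 0.2, 0.8]
block_motion = {
    (0, 0): [1.0, 9.0, 2.0, 8.0],  # moves with the sound
    (0, 1): [5.0, 5.0, 5.0, 5.0],  # static background
    (1, 0): [9.0, 1.0, 8.0, 2.0],  # moves against the sound
}
scores = foa_scores(audio_energy, block_motion)
qps = qp_map(scores)
```

In this toy input only block `(0, 0)` follows the audio, so it is the only block that receives a QP below `base_qp`. Because the change is confined to encoder-side QP selection, the resulting bitstream stays decodable by a standard decoder, which is consistent with the compliance claim in the abstract.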

Original language: English
Title of host publication: Proceedings of SPIE-IS&T Electronic Imaging - Human Vision and Electronic Imaging XIX
Publisher: SPIE
ISBN (Print): 9780819499318
DOIs: https://doi.org/10.1117/12.2047850
Publication status: Published - 2014 Jan 1
Event: Human Vision and Electronic Imaging XIX - San Francisco, CA, United States
Duration: 2014 Feb 3 to 2014 Feb 6

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 9014
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Other

Other: Human Vision and Electronic Imaging XIX
Country: United States
City: San Francisco, CA
Period: 14/2/3 to 14/2/6

Fingerprint

Video compression
Image compression
Coding
Compression
Video signals
Audio signals
Decoders
Coders
Region of interest
Encoder
Low complexity
Vary
Causes
Alternatives

All Science Journal Classification (ASJC) codes

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering

Cite this

Rerabek, M., Nemoto, H., Lee, J-S., & Ebrahimi, T. (2014). Audiovisual focus of attention and its application to Ultra High Definition video compression. In Proceedings of SPIE-IS&T Electronic Imaging - Human Vision and Electronic Imaging XIX [901407] (Proceedings of SPIE - The International Society for Optical Engineering; Vol. 9014). SPIE. https://doi.org/10.1117/12.2047850
@inproceedings{efe8786891e5450aad5b1717126f66ea,
title = "Audiovisual focus of attention and its application to Ultra High Definition video compression",
author = "Martin Rerabek and Hiromi Nemoto and Jong-Seok Lee and Touradj Ebrahimi",
year = "2014",
month = "1",
day = "1",
doi = "10.1117/12.2047850",
language = "English",
isbn = "9780819499318",
series = "Proceedings of SPIE - The International Society for Optical Engineering",
publisher = "SPIE",
booktitle = "Proceedings of SPIE-IS\&T Electronic Imaging - Human Vision and Electronic Imaging XIX",
address = "United States",

}

TY - GEN

T1 - Audiovisual focus of attention and its application to Ultra High Definition video compression

AU - Rerabek, Martin

AU - Nemoto, Hiromi

AU - Lee, Jong-Seok

AU - Ebrahimi, Touradj

PY - 2014/1/1

Y1 - 2014/1/1

UR - http://www.scopus.com/inward/record.url?scp=84897552131&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84897552131&partnerID=8YFLogxK

U2 - 10.1117/12.2047850

DO - 10.1117/12.2047850

M3 - Conference contribution

SN - 9780819499318

T3 - Proceedings of SPIE - The International Society for Optical Engineering

BT - Proceedings of SPIE-IS&T Electronic Imaging - Human Vision and Electronic Imaging XIX

PB - SPIE

ER -
