Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph

Minwoo Kim, Deokho Kim, Kyungah Kim, Won Woo Ro

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

This paper presents an optimized parallel algorithm for the next-generation video codec High Efficiency Video Coding (HEVC). The proposed method maximizes parallel scalability by exploiting two levels of parallelism: 1) frame level and 2) task level. Frame-level parallelism is exploited using a graph that efficiently provides a parallel coding order for frames with complex reference dependencies. This reference dependency graph is generated at runtime by a novel construction algorithm that dynamically analyzes the configuration of the HEVC codec. Task-level parallelism is exploited to add further scalability on top of frame-level parallelization: a single coding process is divided and categorized into multiple types of tasks, and independent tasks are executed in a pipelined manner. The proposed parallel encoder and decoder do not suffer any loss in coding efficiency, because neither constraints on nor modifications to the coding options are required. The proposed parallel methods achieve an average encoding speedup of 1.75, and the aggressive method that exploits additional frame-level parallelism achieves a speedup of 6.52 using eight physical cores.
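As a rough illustration of how a reference dependency graph can yield a parallel coding order, the sketch below groups frames into "waves" whose members do not depend on one another, using a standard topological levelling (Kahn's algorithm). This is not the authors' construction algorithm, which the abstract does not detail; the function name `parallel_coding_order`, the `references` mapping, and the hierarchical-B reference structure in the example are all assumptions made for illustration.

```python
from collections import defaultdict


def parallel_coding_order(references):
    """Group frames into waves that could be coded concurrently.

    `references` maps a frame's picture order count (POC) to the POCs
    of the frames it references (hypothetical input format).
    """
    # Count how many distinct reference frames each frame still waits on.
    pending = {frame: len(set(refs)) for frame, refs in references.items()}

    # Reverse edges: for each frame, which frames reference it.
    dependents = defaultdict(list)
    for frame, refs in references.items():
        for ref in set(refs):
            dependents[ref].append(frame)

    ready = sorted(f for f, n in pending.items() if n == 0)
    waves = []
    while ready:
        waves.append(ready)
        next_ready = []
        for done in ready:
            for frame in dependents[done]:
                pending[frame] -= 1
                if pending[frame] == 0:
                    next_ready.append(frame)
        ready = sorted(next_ready)

    # Frames left unscheduled indicate a cycle or a missing reference.
    if sum(len(wave) for wave in waves) != len(references):
        raise ValueError("cyclic or dangling reference")
    return waves


# Assumed hierarchical-B group of pictures with 8 frames (illustrative only).
gop = {
    0: [], 8: [0], 4: [0, 8], 2: [0, 4], 6: [4, 8],
    1: [0, 2], 3: [2, 4], 5: [4, 6], 7: [6, 8],
}
waves = parallel_coding_order(gop)
print(waves)  # [[0], [8], [4], [2, 6], [1, 3, 5, 7]]
```

In this assumed structure the first three frames must be coded one after another, which hints at why frame-level parallelism alone may leave cores idle and why the paper adds task-level pipelining on top of it. A real scheduler would likely dispatch each frame as soon as its own references finish rather than waiting for a whole wave.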

Original language: English
Article number: 7067394
Pages (from-to): 736-749
Number of pages: 14
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 26
Issue number: 4
DOI: 10.1109/TCSVT.2015.2416556
Publication status: Published - 2016 Apr 1

Fingerprint

Image coding
Scalability
Parallel algorithms

All Science Journal Classification (ASJC) codes

  • Media Technology
  • Electrical and Electronic Engineering

Cite this

@article{cc5346b1a7e24636b11173c8de2e3af1,
title = "Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph",
abstract = "This paper presents an optimized parallel algorithm for the next-generation video codec High Efficiency Video Coding (HEVC). The proposed method provides maximized parallel scalability by exploiting two levels of parallelism: 1) frame level and 2) task level. Frame-level parallelism is exploited using a graph that efficiently provides a parallel coding order of the frames with complex reference dependencies. The proposed reference dependency graph is generated at runtime by a novel construction algorithm that dynamically analyzes the configuration of the HEVC codec. Task-level parallelism is exploited to provide further scalability to frame-level parallelization. A pipelined execution is allowed for independent tasks, which are defined by dividing and categorizing a single coding process into multiple types of tasks. The proposed parallel encoder and decoder do not suffer from loss in coding efficiency because neither constraints nor modification in coding options are required. The proposed parallel methods result in an average encoding speedup of 1.75 and the aggressive method that exploits additional frame-level parallelism achieved 6.52 speedup using eight physical cores.",
author = "Kim, Minwoo and Kim, Deokho and Kim, Kyungah and Ro, Won Woo",
year = "2016",
month = "4",
day = "1",
doi = "10.1109/TCSVT.2015.2416556",
language = "English",
volume = "26",
pages = "736--749",
journal = "IEEE Transactions on Circuits and Systems for Video Technology",
issn = "1051-8215",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "4",
}

Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph. / Kim, Minwoo; Kim, Deokho; Kim, Kyungah; Ro, Won Woo.

In: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 26, No. 4, 7067394, 01.04.2016, p. 736-749.

TY - JOUR

T1 - Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph

AU - Kim, Minwoo

AU - Kim, Deokho

AU - Kim, Kyungah

AU - Ro, Won Woo

PY - 2016/4/1

Y1 - 2016/4/1

N2 - This paper presents an optimized parallel algorithm for the next-generation video codec High Efficiency Video Coding (HEVC). The proposed method provides maximized parallel scalability by exploiting two levels of parallelism: 1) frame level and 2) task level. Frame-level parallelism is exploited using a graph that efficiently provides a parallel coding order of the frames with complex reference dependencies. The proposed reference dependency graph is generated at runtime by a novel construction algorithm that dynamically analyzes the configuration of the HEVC codec. Task-level parallelism is exploited to provide further scalability to frame-level parallelization. A pipelined execution is allowed for independent tasks, which are defined by dividing and categorizing a single coding process into multiple types of tasks. The proposed parallel encoder and decoder do not suffer from loss in coding efficiency because neither constraints nor modification in coding options are required. The proposed parallel methods result in an average encoding speedup of 1.75 and the aggressive method that exploits additional frame-level parallelism achieved 6.52 speedup using eight physical cores.

UR - http://www.scopus.com/inward/record.url?scp=84963800740&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84963800740&partnerID=8YFLogxK

U2 - 10.1109/TCSVT.2015.2416556

DO - 10.1109/TCSVT.2015.2416556

M3 - Article

VL - 26

SP - 736

EP - 749

JO - IEEE Transactions on Circuits and Systems for Video Technology

JF - IEEE Transactions on Circuits and Systems for Video Technology

SN - 1051-8215

IS - 4

M1 - 7067394

ER -