SIMD defragmenter: Efficient ILP realization on data-parallel architectures

Yongjun Park, Sangwon Seo, Hyunchul Park, Hyoun Kyu Cho, Scott Mahlke

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

31 Citations (Scopus)

Abstract

Single-instruction multiple-data (SIMD) accelerators provide an energy-efficient platform to scale the performance of mobile systems while still retaining post-programmability. The central challenge is translating the parallel resources of the SIMD hardware into real application performance. In scientific applications, automatic vectorization techniques have proven quite effective at extracting large levels of data-level parallelism (DLP). However, vectorization is often much less effective for media applications due to low trip-count loops, complex control flow, and non-uniform execution behavior. As a result, SIMD lanes remain idle due to insufficient DLP. To attack this problem, this paper proposes a new vectorization pass called SIMD Defragmenter to uncover hidden DLP that lurks below the surface in the form of instruction-level parallelism (ILP). The difficulty is managing the data packing/unpacking overhead, which can easily exceed the benefits gained through SIMD execution. The SIMD defragmenter overcomes this problem by identifying groups of compatible instructions (subgraphs) that can be executed in parallel across the SIMD lanes. By SIMDizing in bulk at the subgraph level, packing/unpacking overhead is minimized. On a 16-lane SIMD processor, experimental results show that SIMD defragmentation achieves a mean 1.6x speedup over traditional loop vectorization and a 31% gain over prior research approaches for converting ILP to DLP.
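The packing/unpacking argument in the abstract can be illustrated with a toy cost model. The sketch below is not the paper's algorithm: the graph, the node names, the `transfer_cost` helper, and the rule of one pack per vector operand and one unpack per vector result are all assumptions made for illustration. It counts the scalar/vector transfers needed when three pairs of identical instructions are vectorized one pair at a time, and again when the two chains are vectorized together as a single subgraph.

```python
# Toy dataflow graph: two identical chains (mul -> add -> shr), one per SIMD lane.
# Node names, opcodes, and the cost model are illustrative assumptions.
OPS = {
    "m0": ("mul", ("x0", "k0")), "m1": ("mul", ("x1", "k1")),
    "a0": ("add", ("m0", "y0")), "a1": ("add", ("m1", "y1")),
    "s0": ("shr", ("a0", "z0")), "s1": ("shr", ("a1", "z1")),
}
LIVE_OUT = {"s0", "s1"}  # values needed in scalar form after the region


def transfer_cost(groups):
    """Return (packs, unpacks) when each tuple in `groups` runs as one vector op."""
    group_of = {node: g for g in groups for node in g}
    packs = 0
    unpacked = set()  # groups whose vector result must be split into scalars

    for g in groups:
        # Only instructions with the same opcode may share a vector op.
        assert len({OPS[n][0] for n in g}) == 1
        for i in range(len(OPS[g[0]][1])):
            srcs = tuple(OPS[n][1][i] for n in g)
            if srcs in groups:
                continue  # operand is already a vector in matching lane order
            packs += 1  # gather scalar operands into one vector register
            unpacked.update(group_of[s] for s in srcs if s in group_of)

    # Scalar consumers and live-out values force an unpack of the producer.
    for node, (_, srcs) in OPS.items():
        if node not in group_of:
            unpacked.update(group_of[s] for s in srcs if s in group_of)
    unpacked.update(group_of[n] for n in LIVE_OUT if n in group_of)
    return packs, len(unpacked)


levels = [("m0", "m1"), ("a0", "a1"), ("s0", "s1")]

# Each level vectorized in isolation, with scalar values in between.
isolated = [transfer_cost([g]) for g in levels]
isolated_total = sum(p + u for p, u in isolated)

# The whole pair of chains vectorized as one subgraph.
bulk_packs, bulk_unpacks = transfer_cost(levels)
bulk_total = bulk_packs + bulk_unpacks

print(isolated_total, bulk_total)
```

Under these assumptions the isolated groups need 9 transfers while the single subgraph needs 5, because intermediate results stay in vector registers between the grouped instructions. A real compiler pass would also have to weigh lane permutations, memory operations, and scheduling, none of which this model captures.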

Original language: English
Title of host publication: ASPLOS XVII - 17th International Conference on Architectural Support for Programming Languages and Operating Systems
Pages: 363-374
Number of pages: 12
Publication status: Published - 2012
Event: 17th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2012 - London, United Kingdom
Duration: 2012 Mar 3 → 2012 Mar 7

Publication series

Name: International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS

Conference

Conference: 17th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2012
Country/Territory: United Kingdom
City: London
Period: 12/3/3 → 12/3/7

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Hardware and Architecture
