Profile-guided deployment of stream programs on multicores

S. M. Farhad, Yousun Ko, Bernd Burgstaller, Bernhard Scholz

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

Because multicore architectures have become the industry standard, programming abstractions for concurrent programming are of key importance. Stream programming languages facilitate application domains characterized by regular sequences of data, such as multimedia, graphics, signal processing and networking. With stream programs, computations are expressed through independent actors that interact through FIFO data channels. A major challenge with stream programs is to load-balance actors among available processing cores. The workload of a stream program is determined by actor execution times and the communication overhead induced by data channels. Estimating communication costs on cache-coherent shared-memory multiprocessors is difficult, because data movements are abstracted away by the cache coherence protocol. Standard execution time profiling techniques cannot separate actor execution times from communication costs, because communication costs manifest in terms of execution time overhead. In this work we present a unified Integer Linear Programming (ILP) formulation that balances the workload of stream programs on cache-coherent multicore architectures. For estimating the communication costs of data channels, we devise a novel profiling scheme that minimizes the number of profiling steps. We conduct experiments across a range of StreamIt benchmarks and show that our method achieves a speedup of up to 4.02x on 6 processors. The number of profiling steps is on average only 17% of an exhaustive profiling run over all data channels of a stream program.
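To make the load-balancing problem in the abstract concrete, the following is a minimal sketch, not the authors' ILP formulation. It assumes a toy cost model in which a channel whose endpoints sit on different cores adds its communication cost to both cores, and it replaces the ILP solver with an exhaustive search that is only feasible for a handful of actors. The actor names, execution times, channel costs, and the functions `makespan` and `best_assignment` are all hypothetical.

```python
from itertools import product


def makespan(assignment, exec_time, channels, num_cores):
    """Return the load of the busiest core for one actor-to-core assignment."""
    load = [0.0] * num_cores
    # Each actor contributes its execution time to the core it is mapped to.
    for actor, core in assignment.items():
        load[core] += exec_time[actor]
    # Assumption: a channel that crosses cores charges its cost to both ends;
    # a channel between actors on the same core is treated as free.
    for (src, dst), cost in channels.items():
        if assignment[src] != assignment[dst]:
            load[assignment[src]] += cost
            load[assignment[dst]] += cost
    return max(load)


def best_assignment(exec_time, channels, num_cores):
    """Try every assignment and keep the first one with the smallest makespan."""
    actors = sorted(exec_time)
    best, best_cost = None, float("inf")
    for cores in product(range(num_cores), repeat=len(actors)):
        assignment = dict(zip(actors, cores))
        cost = makespan(assignment, exec_time, channels, num_cores)
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


# Hypothetical four-actor pipeline A -> B -> C -> D with one expensive channel.
exec_time = {"A": 4.0, "B": 3.0, "C": 3.0, "D": 2.0}
channels = {("A", "B"): 1.0, ("B", "C"): 5.0, ("C", "D"): 1.0}

mapping, cost = best_assignment(exec_time, channels, num_cores=2)
print(mapping, cost)
```

In this toy instance the search places B and C on the same core, because splitting the expensive B-to-C channel would outweigh the benefit of a more even split of execution times. An ILP formulation expresses the same trade-off with binary assignment variables and scales to larger stream graphs than brute force can handle.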

Original language: English
Pages (from-to): 79-88
Number of pages: 10
Journal: ACM SIGPLAN Notices
Volume: 47
Issue number: 5
Publication status: Published - 2012 May 1

Fingerprint

Communication
Costs
Computer programming
Computer programming languages
Linear programming
Signal processing
Network protocols
Data storage equipment
Processing
Industry
Experiments

All Science Journal Classification (ASJC) codes

  • Computer Science (all)

Cite this

Farhad, S. M.; Ko, Yousun; Burgstaller, Bernd; Scholz, Bernhard. / Profile-guided deployment of stream programs on multicores. In: ACM SIGPLAN Notices. 2012; Vol. 47, No. 5. pp. 79-88.
@article{23f7abc9c09c4baf85cfc4b410e8dc9f,
title = "Profile-guided deployment of stream programs on multicores",
abstract = "Because multicore architectures have become the industry standard, programming abstractions for concurrent programming are of key importance. Stream programming languages facilitate application domains characterized by regular sequences of data, such as multimedia, graphics, signal processing and networking. With stream programs, computations are expressed through independent actors that interact through FIFO data channels. A major challenge with stream programs is to load-balance actors among available processing cores. The workload of a stream program is determined by actor execution times and the communication overhead induced by data channels. Estimating communication costs on cache-coherent shared-memory multiprocessors is difficult, because data movements are abstracted away by the cache coherence protocol. Standard execution time profiling techniques cannot separate actor execution times from communication costs, because communication costs manifest in terms of execution time overhead. In this work we present a unified Integer Linear Programming (ILP) formulation that balances the workload of stream programs on cache-coherent multicore architectures. For estimating the communication costs of data channels, we devise a novel profiling scheme that minimizes the number of profiling steps. We conduct experiments across a range of StreamIt benchmarks and show that our method achieves a speedup of up to 4.02x on 6 processors. The number of profiling steps is on average only 17{\%} of an exhaustive profiling run over all data channels of a stream program.",
author = "Farhad, {S. M.} and Yousun Ko and Bernd Burgstaller and Bernhard Scholz",
year = "2012",
month = "5",
day = "1",
language = "English",
volume = "47",
pages = "79--88",
journal = "ACM SIGPLAN Notices",
issn = "1523-2867",
publisher = "Association for Computing Machinery (ACM)",
number = "5",

}

Farhad, SM, Ko, Y, Burgstaller, B & Scholz, B 2012, 'Profile-guided deployment of stream programs on multicores', ACM SIGPLAN Notices, vol. 47, no. 5, pp. 79-88.

Profile-guided deployment of stream programs on multicores. / Farhad, S. M.; Ko, Yousun; Burgstaller, Bernd; Scholz, Bernhard.

In: ACM SIGPLAN Notices, Vol. 47, No. 5, 01.05.2012, p. 79-88.


TY - JOUR

T1 - Profile-guided deployment of stream programs on multicores

AU - Farhad, S. M.

AU - Ko, Yousun

AU - Burgstaller, Bernd

AU - Scholz, Bernhard

PY - 2012/5/1

Y1 - 2012/5/1

N2 - Because multicore architectures have become the industry standard, programming abstractions for concurrent programming are of key importance. Stream programming languages facilitate application domains characterized by regular sequences of data, such as multimedia, graphics, signal processing and networking. With stream programs, computations are expressed through independent actors that interact through FIFO data channels. A major challenge with stream programs is to load-balance actors among available processing cores. The workload of a stream program is determined by actor execution times and the communication overhead induced by data channels. Estimating communication costs on cache-coherent shared-memory multiprocessors is difficult, because data movements are abstracted away by the cache coherence protocol. Standard execution time profiling techniques cannot separate actor execution times from communication costs, because communication costs manifest in terms of execution time overhead. In this work we present a unified Integer Linear Programming (ILP) formulation that balances the workload of stream programs on cache-coherent multicore architectures. For estimating the communication costs of data channels, we devise a novel profiling scheme that minimizes the number of profiling steps. We conduct experiments across a range of StreamIt benchmarks and show that our method achieves a speedup of up to 4.02x on 6 processors. The number of profiling steps is on average only 17% of an exhaustive profiling run over all data channels of a stream program.


UR - http://www.scopus.com/inward/record.url?scp=84866339152&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84866339152&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84866339152

VL - 47

SP - 79

EP - 88

JO - ACM SIGPLAN Notices

JF - ACM SIGPLAN Notices

SN - 1523-2867

IS - 5

ER -