Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs

Myung Kuk Yoon, Yunho Oh, Seung Hun Kim, Sangpil Lee, Deokho Kim, Won Woo Ro

Research output: Contribution to journal (Article)

Abstract

This paper conducts a detailed study of how the fetch group size of a GPU warp scheduler affects operation stalls. We show that the fetch group size strongly influences how well the scheduler hides three types of operation stalls: short-latency stalls, long-latency stalls, and Load/Store Unit (LSU) stalls. A scheduler with a small fetch group cannot hide short-latency stalls because each fetch group contains too few warps. In contrast, a scheduler with a large fetch group cannot hide long-latency stalls, because it has too few fetch groups, or LSU stalls, because of the lack of memory subsystem resources. To hide all of these stall types, this paper proposes the Dynamic Resizing on Active Warps (DRAW) scheduler, which adjusts the fetch group size dynamically according to the execution phase of the application. For applications that perform best with the loose round-robin scheduler (LRR, one fetch group), the DRAW scheduler matches the performance of LRR and outperforms the two-level scheduler (TL, multiple fetch groups) by 22.7 percent. For applications that perform best with TL, the DRAW scheduler achieves 11.0 and 5.5 percent better performance than LRR and TL, respectively.
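
The relationship between fetch group size and the two baseline schedulers can be illustrated with a short sketch. This is not the DRAW algorithm from the paper, whose details the abstract does not give. The function names `make_fetch_groups` and `resize_group`, the stall counters, and the doubling/halving rule are all hypothetical and only mirror the trade-off the abstract describes: grow the group when short-latency stalls dominate, shrink it when long-latency or LSU stalls dominate.

```python
from typing import List


def make_fetch_groups(num_warps: int, group_size: int) -> List[List[int]]:
    """Split warp IDs 0..num_warps-1 into consecutive fetch groups.

    A group_size equal to num_warps gives a single group (LRR-like);
    a smaller group_size gives several groups (TL-like).
    """
    if num_warps <= 0 or group_size <= 0:
        raise ValueError("num_warps and group_size must be positive")
    return [
        list(range(start, min(start + group_size, num_warps)))
        for start in range(0, num_warps, group_size)
    ]


def resize_group(group_size: int, num_warps: int,
                 short_stalls: int, long_or_lsu_stalls: int) -> int:
    """Hypothetical resizing rule (an assumption, not the paper's policy).

    Doubles the group when short-latency stalls dominate and halves it
    when long-latency or LSU stalls dominate, clamped to [1, num_warps].
    """
    if short_stalls > long_or_lsu_stalls:
        return min(group_size * 2, num_warps)
    if long_or_lsu_stalls > short_stalls:
        return max(group_size // 2, 1)
    return group_size


if __name__ == "__main__":
    # 48 warps is an illustrative number, not a figure from the paper.
    lrr_like = make_fetch_groups(48, 48)
    tl_like = make_fetch_groups(48, 8)
    print(len(lrr_like), "group(s) in the LRR-like configuration")
    print(len(tl_like), "group(s) in the TL-like configuration")
    print("next size:", resize_group(8, 48, short_stalls=10, long_or_lsu_stalls=2))
```

In this toy model, one fetch group containing every warp corresponds to the LRR end of the range and many small groups to the TL end; a dynamic scheduler moves between the two as the dominant stall type changes.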

Original language: English
Article number: 7927466
Pages (from-to): 3142-3156
Number of pages: 15
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 28
Issue number: 11
DOI: 10.1109/TPDS.2017.2704080
Publication status: Published - 2017 Nov 1

Fingerprint

Data storage equipment
Graphics processing unit

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics

Cite this

Yoon, Myung Kuk ; Oh, Yunho ; Kim, Seung Hun ; Lee, Sangpil ; Kim, Deokho ; Ro, Won Woo. / Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs. In: IEEE Transactions on Parallel and Distributed Systems. 2017 ; Vol. 28, No. 11. pp. 3142-3156.
@article{35865d0baba6477faffcabdbeb918ed9,
title = "Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs",
author = "Yoon, {Myung Kuk} and Yunho Oh and Kim, {Seung Hun} and Sangpil Lee and Deokho Kim and Ro, {Won Woo}",
year = "2017",
month = "11",
day = "1",
doi = "10.1109/TPDS.2017.2704080",
language = "English",
volume = "28",
pages = "3142--3156",
journal = "IEEE Transactions on Parallel and Distributed Systems",
issn = "1045-9219",
publisher = "IEEE Computer Society",
number = "11",

}

TY - JOUR

T1 - Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs

AU - Yoon, Myung Kuk

AU - Oh, Yunho

AU - Kim, Seung Hun

AU - Lee, Sangpil

AU - Kim, Deokho

AU - Ro, Won Woo

PY - 2017/11/1

Y1 - 2017/11/1

UR - http://www.scopus.com/inward/record.url?scp=85032452323&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85032452323&partnerID=8YFLogxK

U2 - 10.1109/TPDS.2017.2704080

DO - 10.1109/TPDS.2017.2704080

M3 - Article

AN - SCOPUS:85032452323

VL - 28

SP - 3142

EP - 3156

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

SN - 1045-9219

IS - 11

M1 - 7927466

ER -