Modern GPUs suffer from cache contention because a limited cache is shared across tens of concurrently running warps. To increase the per-warp cache size, prior techniques proposed warp throttling, which limits the number of active warps. Warp throttling leaves several registers dynamically unused whenever a warp is throttled. Given the stringent cache size limitation in GPUs, this work proposes a new cache management technique named Linebacker (LB) that improves GPU performance by utilizing idle register file space as victim cache space. Whenever a CTA becomes inactive, Linebacker backs up the registers of the throttled CTA to off-chip memory. Then, Linebacker utilizes the corresponding register file space as victim cache space. If a load instruction finds its data in a victim cache line, the data is copied directly to the destination register through a simple register-to-register move operation. To further improve the efficiency of the victim cache, Linebacker allocates victim cache space only to a select few load instructions that exhibit high data locality. Through a careful design of the victim cache indexing and management scheme, Linebacker provides a 29.0% speedup compared to the previously proposed warp throttling techniques.
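The flow described in the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the class name `ToyLinebacker`, the LRU replacement policy, the cache sizes, the use of load PCs to mark high-locality loads, and dropping victim contents when the CTA resumes are all hypothetical choices made for illustration. The register backup to off-chip memory is reduced to a flag.

```python
from collections import OrderedDict


class ToyLinebacker:
    """Toy model: a small LRU L1 cache plus a victim cache that is only
    available while a CTA is throttled (its register space is idle)."""

    def __init__(self, l1_lines, victim_lines, high_locality_pcs):
        self.l1 = OrderedDict()      # line address -> PC of the load that filled it
        self.victim = OrderedDict()  # line address -> True, kept in LRU order
        self.l1_lines = l1_lines
        self.victim_lines = victim_lines  # assumed capacity of the idle register space
        self.high_locality_pcs = set(high_locality_pcs)
        self.victim_enabled = False
        self.stats = {"l1_hit": 0, "victim_hit": 0, "miss": 0}

    def throttle_cta(self):
        # The real scheme first backs up the CTA's registers off-chip;
        # this model only records that the space is now usable.
        self.victim_enabled = True

    def resume_cta(self):
        # Assumption: victim contents are dropped when registers are restored.
        self.victim_enabled = False
        self.victim.clear()

    def load(self, pc, line_addr):
        if line_addr in self.l1:
            self.l1.move_to_end(line_addr)
            self.stats["l1_hit"] += 1
            return "l1_hit"
        if self.victim_enabled and line_addr in self.victim:
            # Stands in for the register-to-register move on a victim hit.
            self.victim.move_to_end(line_addr)
            self.stats["victim_hit"] += 1
            return "victim_hit"
        self.stats["miss"] += 1
        self._fill_l1(pc, line_addr)
        return "miss"

    def _fill_l1(self, pc, line_addr):
        if len(self.l1) >= self.l1_lines:
            evicted_addr, evicted_pc = self.l1.popitem(last=False)
            # Only lines brought in by selected high-locality loads are kept.
            if self.victim_enabled and evicted_pc in self.high_locality_pcs:
                if len(self.victim) >= self.victim_lines:
                    self.victim.popitem(last=False)
                self.victim[evicted_addr] = True
        self.l1[line_addr] = pc


cache = ToyLinebacker(l1_lines=2, victim_lines=4, high_locality_pcs={0x10})
cache.throttle_cta()
trace = [(0x10, 0), (0x10, 1), (0x10, 2), (0x10, 0), (0x10, 1)]
results = [cache.load(pc, addr) for pc, addr in trace]
print(results)
print(cache.stats)
```

In this trace the fourth access finds line 0 in the victim cache after it was evicted from the two-line L1. Without a throttled CTA, or with a load PC outside the high-locality set, the same trace misses on every access in this model.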
|Title of host publication|ISCA 2019 - Proceedings of the 2019 46th International Symposium on Computer Architecture|
|Publisher|Institute of Electrical and Electronics Engineers Inc.|
|Number of pages|14|
|Publication status|Published - 2019 Jun 22|
|Event|46th International Symposium on Computer Architecture, ISCA 2019 - Phoenix, United States (Duration: 2019 Jun 22 → 2019 Jun 26)|
|Name|Proceedings - International Symposium on Computer Architecture|
|Conference|46th International Symposium on Computer Architecture, ISCA 2019|
|Period|2019 Jun 22 → 2019 Jun 26|
Bibliographical note
Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2018R1A2A2A05018941), by the Institute of Information & Communication Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00533, Research on CPU vulnerability detection and validation), by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001117C0053, and by NSF grant 1719074. W. W. Ro is the corresponding author.
All Science Journal Classification (ASJC) codes
- Hardware and Architecture