We propose FlashGPU, a new GPU architecture that tightly integrates a new type of flash memory (Z-NAND) with a massive number of GPU cores. Specifically, we replace global memory with Z-NAND, which exhibits ultra-low latency. We also architect a flash core, placed underneath the L2 cache banks of the GPU cores, to manage request dispatches and address translations. Although Z-NAND is a hundred times faster than conventional 3D-stacked flash, its latency is still longer than that of DRAM. To address this shortcoming, we propose a dynamic page-placement and buffer manager for Z-NAND subsystems that is aware of the bulk and parallel memory access characteristics of GPU applications, thereby offering high throughput and low energy consumption.
|Title of host publication||Proceedings of the 56th Annual Design Automation Conference 2019, DAC 2019|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Publication status||Published - 2019 Jun 2|
|Event||56th Annual Design Automation Conference, DAC 2019 - Las Vegas, United States (Duration: 2019 Jun 2 → 2019 Jun 6)|
|Name||Proceedings - Design Automation Conference|
Bibliographical note
Funding Information:
This research is mainly supported by NRF 2016R1C1B2015312, DOE DEAC02-05CH11231, IITP-2018-2017-0-01015, NRF 2015M3C4A7065645, NRF-2017R1A4A1015498 and MemRay grant. The authors thank Samsung for its Z-SSD sample donations.
© 2019 Association for Computing Machinery.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Control and Systems Engineering
- Electrical and Electronic Engineering
- Modelling and Simulation