Abstract
We propose FlashGPU, a new GPU architecture that tightly integrates a new type of flash (Z-NAND) with massive GPU cores. Specifically, we replace the GPU's global memory with Z-NAND, which exhibits ultra-low latency. We also architect a flash core that manages request dispatches and address translations underneath the L2 cache banks of the GPU cores. Although Z-NAND is a hundred times faster than conventional 3D-stacked flash, its latency is still longer than that of DRAM. To address this shortcoming, we propose a dynamic page-placement and buffer manager for the Z-NAND subsystems that is aware of the bulk and parallel memory access characteristics of GPU applications, thereby offering high throughput and low energy consumption.
Original language | English |
---|---|
Title of host publication | Proceedings of the 56th Annual Design Automation Conference 2019, DAC 2019 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781450367257 |
Publication status | Published - 2019 Jun 2 |
Event | 56th Annual Design Automation Conference, DAC 2019 - Las Vegas, United States; Duration: 2019 Jun 2 → 2019 Jun 6 |
Publication series
Name | Proceedings - Design Automation Conference |
---|---|
ISSN (Print) | 0738-100X |
Conference
Conference | 56th Annual Design Automation Conference, DAC 2019 |
---|---|
Country/Territory | United States |
City | Las Vegas |
Period | 2019 Jun 2 → 2019 Jun 6 |
Bibliographical note
Funding Information: This research is mainly supported by NRF 2016R1C1B2015312, DOE DEAC02-05CH11231, IITP-2018-2017-0-01015, NRF 2015M3C4A7065645, NRF-2017R1A4A1015498 and a MemRay grant. The authors thank Samsung for its Z-SSD sample donations.
Publisher Copyright:
© 2019 Association for Computing Machinery.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Control and Systems Engineering
- Electrical and Electronic Engineering
- Modelling and Simulation