Thanks to massive parallelism in modern Graphics Processing Units (GPUs), emerging data processing applications in GPU computing exhibit ten-fold speedups compared to CPU-only systems. However, this GPU-based acceleration is limited in many cases by significant data movement overheads and inefficient memory management for host-side storage accesses. To address these shortcomings, this paper proposes a non-volatile memory management unit (NVMMU) that reduces file data movement overheads by directly connecting the Solid State Disk (SSD) to the GPU. We implemented the proposed NVMMU on real hardware with commercially available GPU and SSD devices, considering different types of storage interfaces and configurations. In this work, NVMMU unifies two discrete software stacks (one for the SSD and the other for the GPU) in two major ways. First, a new interface provided by NVMMU directly forwards file data between the GPU runtime library and the I/O runtime library. Second, NVMMU supports non-volatile direct memory access (NDMA), which pairs the GPU and SSD devices via physically shared system memory blocks. This unification can in turn eliminate unnecessary user/kernel-mode switching, improve memory management, and remove data copy overheads. Our evaluation results demonstrate that NVMMU can reduce the overheads of file data movement by 95% on average, improving overall system performance by 78% compared to a conventional IOMMU approach.
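The difference between a conventional transfer and one that goes through a physically shared system memory block can be illustrated with a small toy model. The sketch below is not the paper's implementation: the names `TransferLog`, `conventional_path`, and `shared_block_path` are hypothetical, the buffers are ordinary Python byte arrays standing in for device and system memory, and the assumption that the conventional path needs exactly one extra host-side copy is a simplification made only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TransferLog:
    """Tallies the work one file-to-GPU transfer performs in this toy model."""
    dma_transfers: int = 0  # device <-> system memory DMA operations
    host_copies: int = 0    # memory-to-memory copies inside system memory


def conventional_path(file_data: bytes) -> Tuple[bytes, TransferLog]:
    """Assumed conventional flow: two separate stacks, each with its own buffer."""
    log = TransferLog()
    # The SSD writes the file data into a buffer owned by the I/O stack.
    io_buffer = bytearray(file_data)
    log.dma_transfers += 1
    # The data is copied into a separate buffer owned by the GPU stack.
    gpu_staging = bytearray(io_buffer)
    log.host_copies += 1
    # The GPU reads the staging buffer into its own memory.
    gpu_memory = bytes(gpu_staging)
    log.dma_transfers += 1
    return gpu_memory, log


def shared_block_path(file_data: bytes) -> Tuple[bytes, TransferLog]:
    """Assumed direct flow: SSD and GPU are paired on one shared memory block."""
    log = TransferLog()
    # The SSD writes the file data into the shared system memory block.
    shared_block = bytearray(file_data)
    log.dma_transfers += 1
    # The GPU reads from the same block, so no host-side copy is needed.
    gpu_memory = bytes(shared_block)
    log.dma_transfers += 1
    return gpu_memory, log


if __name__ == "__main__":
    data = b"file contents destined for the GPU"
    for path in (conventional_path, shared_block_path):
        result, log = path(data)
        print(path.__name__, log, result == data)
```

In this model both paths deliver identical data to the GPU, and the only difference is the intermediate host-side copy, which is the kind of overhead the abstract says the unified stack removes. The model does not capture user/kernel-mode switching or any timing behavior.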
|Number of pages|12|
|Journal|Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT|
|Publication status|Published - 2015|
|Event|24th International Conference on Parallel Architecture and Compilation, PACT 2015 - San Francisco, United States|
|Duration|2015 Oct 18 → 2015 Oct 21|
Bibliographical note
Funding Information:
This research is supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the "IT Consilience Creative Program" (IITP-2015-R0346-15-1008) supervised by the NIPA (National IT Industry Promotion Agency). This work is also supported in part by DOE grant DE-AC02-05CH1123, NSF grants 1213052, 1205618, 1302557, 1526750, 1409095, and 1439021.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Hardware and Architecture