Graphics processors evolve rapidly and promise power-efficient, cost-effective, and scalable high-performance computing at differentiated price-performance points. MapReduce is a well-known distributed programming model that eases the development of applications for large-scale data processing on a large number of commodity CPUs. Compared to CPUs, GPUs are an order of magnitude faster in terms of computation power and memory bandwidth, but they are harder to program. Although several studies have implemented the MapReduce model on GPUs, most of them are based on a single-GPU model, are bounded by the memory of that GPU, and rely on inefficient atomic operations. This paper focuses on the development of MGMR, a standalone MapReduce system that uses multiple GPUs to manage large-scale data processing beyond the GPU memory limitation and to eliminate serial atomic operations. Experimental results demonstrate the effectiveness of MGMR in handling large data sets.
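To make the general idea concrete, the following is a minimal CPU-only sketch of a chunked MapReduce word count. It is not MGMR's implementation: the function names, the `num_devices` and `chunk_size` parameters, and the round-robin assignment are all illustrative assumptions. It only shows how input larger than one device's memory can be processed chunk by chunk, and how keeping a private partial result per (simulated) device avoids concurrent updates to a shared structure, which is the role atomic operations would otherwise play.

```python
from collections import defaultdict
from itertools import islice


def chunked(items, chunk_size):
    """Yield successive lists of at most chunk_size items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in the chunk."""
    return [(word, 1) for line in chunk for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def map_reduce(lines, num_devices=2, chunk_size=2):
    """Word count over chunks assigned round-robin to simulated devices.

    Hypothetical sketch: each "device" is just a private dictionary, so no
    two workers ever update the same structure during processing.
    """
    partials = [defaultdict(int) for _ in range(num_devices)]
    for i, chunk in enumerate(chunked(lines, chunk_size)):
        device = i % num_devices  # assumed round-robin placement
        local = reduce_phase(shuffle(map_phase(chunk)))
        for key, count in local.items():
            partials[device][key] += count

    # Final merge of the per-device partial results.
    total = defaultdict(int)
    for partial in partials:
        for key, count in partial.items():
            total[key] += count
    return dict(total)
```

For example, `map_reduce(["a b a", "b c", "a", "c c"])` counts three occurrences of `a`, two of `b`, and three of `c`, regardless of how many simulated devices or what chunk size is chosen.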
Bibliographical note
Funding Information:
Acknowledgements: Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. This research is based upon work partially supported by the National Science Foundation, USA (Awards No. 0918970 and CCF-1065448), the National Science Council (NSC), Taiwan, under grants NSC101-2221-E-126-002 and NSC101-2915-I-126-001, and NVIDIA.
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications