Enhancing network I/O performance for a virtualized Hadoop cluster

Jinkyu Jeong, Dong Hoon Choi, Heeseung Jo

    Research output: Contribution to journal > Article > peer-review

    Abstract

    A MapReduce programming model is proposed to process big data using Hadoop, one of the major cloud computing frameworks. With the increasing adoption of cloud computing, running a Hadoop framework on a virtualized cluster is a compelling approach to reducing costs and increasing efficiency. In this paper, we measure the performance of a virtualized network and analyze the impact of network performance on Hadoop workloads running on a virtualized cluster. Then, we propose a virtualized network I/O architecture as a novel optimization for a virtualized Hadoop cluster for a public/private cloud provider. The proposed network architecture combines traditional network configurations and achieves better performance for Hadoop workloads. We also show a better way to utilize the rack awareness feature of the Hadoop framework in the proposed computing environment. The evaluation demonstrates that the proposed network architecture and mechanisms improve performance by up to 4.1 times compared with a bridge network architecture. This novel architecture can even virtually match the performance of the expensive, hardware-based single root I/O virtualization network architecture.

    Original language: English
    Article number: e3974
    Journal: Concurrency Computation Practice and Experience
    Volume: 29
    Issue number: 8
    Publication status: Published - 2017 Apr 25

    Bibliographical note

    Funding Information:
    This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-0644, The research project on High Performance and Scalable Manycore Operating System) and the Institute for Information and Communications Technology Promotion (IITP) grant funded by the Korean government (MSIP) (no. B0101-15-0104, The Development of Supercomputing System for the Genome Analysis). This paper was supported by research funds of Chonbuk National University in 2015.

    †Correction added on 17 February 2017, after first online publication: this sentence was moved from the Conclusion and Future Work section to the Acknowledgements.

    Publisher Copyright:
    Copyright © 2016 John Wiley & Sons, Ltd.

    All Science Journal Classification (ASJC) codes

    • Theoretical Computer Science
    • Software
    • Computer Science Applications
    • Computer Networks and Communications
    • Computational Theory and Mathematics
