A Performance-Stable NUMA Management Scheme for Linux-Based HPC Systems

Jaehyun Song, Minwoo Ahn, Gyusun Lee, Euiseong Seo, Jinkyu Jeong

    Research output: Contribution to journal › Article › peer-review


    Linux is becoming the de facto standard operating system for today's high-performance computing (HPC) systems because it can satisfy the demands of many HPC systems for rich operating system (OS) features. However, owing to features intended for a general-purpose OS, Linux has many sources of OS noise, such as page faults and thread migrations, that can make the performance of HPC applications unstable. Furthermore, on non-uniform memory access (NUMA) architectures, where access latencies to local and remote memory nodes differ, OS noise can degrade the performance stability of an application even further. In this paper, we address the OS noise caused by Linux on NUMA architectures and propose a novel performance-stable NUMA management scheme called Stable-NUMA. Stable-NUMA comprises three techniques for improving performance stability: two-level thread clustering, state-based page placement, and selective page profiling. Compared with Linux, Stable-NUMA significantly alleviates OS noise and increases the local memory access ratio of the NUMA system. We implemented Stable-NUMA in Linux and evaluated it with various HPC workloads. The evaluation results demonstrate that Stable-NUMA outperforms Linux, with and without its NUMA-aware feature, by up to 25% in terms of average performance and 73% in terms of performance stability.
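
    The abstract refers to two quantities, the local memory access ratio and performance stability, without defining them. The sketch below shows one plausible way to compute such figures. It is an assumption for illustration, not the paper's definition: the function names are hypothetical, the ratio is taken as local accesses over all accesses, and stability is approximated by the coefficient of variation of repeated run times.

```python
from statistics import mean, pstdev


def local_access_ratio(local_accesses: int, remote_accesses: int) -> float:
    """Fraction of memory accesses served by the local NUMA node (assumed definition)."""
    total = local_accesses + remote_accesses
    if total == 0:
        raise ValueError("no memory accesses recorded")
    return local_accesses / total


def coefficient_of_variation(run_times: list[float]) -> float:
    """Population standard deviation divided by the mean.

    Used here as a stand-in stability measure: a lower value means
    the run times vary less from one run to the next.
    """
    avg = mean(run_times)
    if avg <= 0:
        raise ValueError("run times must have a positive mean")
    return pstdev(run_times) / avg
```

    Under these assumptions, a higher ratio means fewer remote-node accesses, and a lower coefficient of variation means more repeatable run times.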

    Original language: English
    Article number: 9391657
    Pages (from-to): 52987-53002
    Number of pages: 16
    Journal: IEEE Access
    Publication status: Published - 2021

    Bibliographical note

    Funding Information:
    This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government [Ministry of Science and ICT (MSIT)] under Grant NRF-2016M3C4A7952587 and Grant NRF-2020R1A2C2102406.

    Publisher Copyright:
    © 2013 IEEE.

    All Science Journal Classification (ASJC) codes

    • Computer Science (all)
    • Materials Science (all)
    • Engineering (all)

