Selective I/O bypass and load balancing method for write-through SSD caching in big data analytics

Jaehyung Kim, Hongchan Roh, Sanghyun Park

Research output: Contribution to journal (Article)

5 Citations (Scopus)

Abstract

Fast network quality analysis is important for providing quality service in the telecom industry. SK Telecom, based in South Korea, built a Hadoop-based analytical system consisting of a hundred nodes, each of which contains only hard disk drives (HDDs). Because the analysis process is a set of parallel, I/O-intensive jobs, adding solid state drives (SSDs) with appropriate settings is the most cost-efficient way to improve performance, as shown in previous studies. Therefore, we decided to configure SSDs as a write-through cache instead of increasing the number of HDDs. To improve the cost-performance of the SSD cache, we introduced a selective I/O bypass (SIB) method, which redirects an automatically calculated number of read I/O requests from the SSD cache to idle HDDs when the SSDs are I/O over-saturated, that is, when disk utilization is greater than 100 percent. To calculate disk utilization precisely, we also introduced a combinational approach for SSDs, since the method currently used for HDDs cannot be applied to SSDs because of their internal parallelism. In our experiments, the proposed approach performed up to 2x faster than other approaches.
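
The abstract does not give the formula SIB uses to pick the number of redirected reads, so the following Python sketch only illustrates the general idea under stated assumptions. It treats utilization as offered load divided by capacity (so it can exceed 100 percent), and it caps the redirected reads at the HDD's spare capacity. The function name `reads_to_bypass` and all four parameters are hypothetical, not taken from the paper. Redirecting reads is safe in principle because a write-through cache keeps every cached block on the backing HDD as well.

```python
def reads_to_bypass(ssd_offered_iops: float,
                    ssd_capacity_iops: float,
                    hdd_offered_iops: float,
                    hdd_capacity_iops: float) -> float:
    """Illustrative only: how many reads per second to send to the HDD.

    Assumption: utilization = offered load / capacity, so values above
    1.0 mean the SSD is over-saturated. This is not the paper's formula.
    """
    ssd_utilization = ssd_offered_iops / ssd_capacity_iops
    if ssd_utilization <= 1.0:
        # SSD is keeping up; serve every read from the cache.
        return 0.0

    # Reads beyond what the SSD can absorb.
    excess = ssd_offered_iops - ssd_capacity_iops

    # Only an idle (under-utilized) HDD can take extra work.
    hdd_spare = max(0.0, hdd_capacity_iops - hdd_offered_iops)

    return min(excess, hdd_spare)
```

In this sketch a saturated SSD sheds only its excess reads, and never more than the HDD can absorb, so the bypass cannot move the bottleneck from the SSD to the HDD.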

Original language: English
Pages (from-to): 589-595
Number of pages: 7
Journal: IEEE Transactions on Computers
Volume: 67
Issue number: 4
Publication status: Published - 2018 Apr 1

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics
