Understanding system characteristics of online erasure coding on scalable, distributed and large-scale SSD array systems

Sungjoon Koh, Jie Zhang, Miryeong Kwon, Jungyeon Yoon, David Donofrio, Nam Sung Kim, Myoungsoo Jung

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Large-scale systems with arrays of solid state disks (SSDs) have become increasingly common in many computing segments. To make such systems resilient, we can adopt erasure coding such as Reed-Solomon (RS) code as an alternative to replication because erasure coding can offer a significantly lower storage cost than replication. To understand the impact of using erasure coding on system performance and other system aspects such as CPU utilization and network traffic, we build a storage cluster consisting of approximately one hundred processor cores with more than fifty high-performance SSDs, and evaluate the cluster with a popular open-source distributed parallel file system, Ceph. Then we analyze behaviors of systems adopting erasure coding from the following five viewpoints, compared with those of systems using replication: (1) storage system I/O performance; (2) computing and software overheads; (3) I/O amplification; (4) network traffic among storage nodes; (5) the impact of physical data layout on performance of RS-coded SSD arrays. For all these analyses, we examine two representative RS configurations, which are used by Google and Facebook file systems, and compare them with triple replication that a typical parallel file system employs as a default fault tolerance mechanism. Lastly, we collect 54 block-level traces from the cluster and make them available for other researchers.
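The storage-cost argument in the abstract can be made concrete with a small calculation. The sketch below is illustrative only: the abstract does not state the RS parameters, so the (k, m) pairs used here, RS(6,3) and RS(10,4), are assumptions based on the configurations commonly attributed to Google and Facebook file systems. The function names are hypothetical. For an RS(k, m) code, k data chunks are stored together with m coding chunks, so the raw capacity consumed per byte of user data is (k + m) / k and up to m lost chunks can be tolerated. With n-way replication the cost is n and up to n - 1 lost copies can be tolerated.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per byte of user data under n-way replication."""
    return Fraction(copies, 1)


def rs_overhead(k: int, m: int) -> Fraction:
    """Raw bytes stored per byte of user data under an RS(k, m) code.

    k = number of data chunks, m = number of coding (parity) chunks.
    """
    return Fraction(k + m, k)


# Assumed configurations; the (k, m) values are not given in the abstract.
schemes = {
    "3-way replication": (replication_overhead(3), 2),  # tolerates 2 lost copies
    "RS(6,3)": (rs_overhead(6, 3), 3),                  # tolerates 3 lost chunks
    "RS(10,4)": (rs_overhead(10, 4), 4),                # tolerates 4 lost chunks
}

for name, (overhead, tolerated) in schemes.items():
    print(f"{name}: {float(overhead):.2f}x raw capacity, "
          f"tolerates {tolerated} failures")
```

Under these assumed parameters, both RS configurations use less than half the raw capacity of triple replication while tolerating at least as many failures. The calculation covers capacity only; it says nothing about the performance, CPU, I/O amplification, and network costs that the paper measures.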

Original language: English
Title of host publication: Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 76-86
Number of pages: 11
ISBN (Electronic): 9781538612323
DOIs: https://doi.org/10.1109/IISWC.2017.8167758
Publication status: Published - 2017 Dec 5
Event: 2017 IEEE International Symposium on Workload Characterization, IISWC 2017 - Seattle, United States
Duration: 2017 Oct 1 to 2017 Oct 3

Publication series

Name: Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017
Volume: 2017-January

Other

Other: 2017 IEEE International Symposium on Workload Characterization, IISWC 2017
Country: United States
City: Seattle
Period: 17/10/1 to 17/10/3

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Information Systems and Management

Cite this

Koh, S., Zhang, J., Kwon, M., Yoon, J., Donofrio, D., Kim, N. S., & Jung, M. (2017). Understanding system characteristics of online erasure coding on scalable, distributed and large-scale SSD array systems. In Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017 (pp. 76-86). (Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017; Vol. 2017-January). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IISWC.2017.8167758