Machine learning based performance modeling of flash SSDs

Jaehyung Kim, Jinuk Park, Sanghyun Park

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Flash memory based solid-state drives (SSDs) have alleviated the I/O bottleneck by exploiting their data-parallel design. In enterprise environments, flash SSDs are used as part of a hybrid storage architecture to achieve better performance at lower cost. In this architecture, I/O load balancing is an important factor. However, the internal parallelism of flash SSDs distorts their performance measures. Despite the criticality of load balancing in I/O-intensive environments, this problem has rarely been studied. In this paper, we examine the effectiveness of applying machine learning classification methods to I/O saturation estimation, using Linux kernel I/O statistics instead of the utilization measure currently used for HDDs. We conclude that the machine learning techniques we employed (Support Vector Machine and LASSO Generalized Linear Model) perform well compared to the existing utilization measure, even though we cannot collect the internal information of the flash SSDs.
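
As a rough illustration of why the HDD-style utilization measure can mislead on a device that serves requests in parallel, the sketch below derives utilization and two other per-interval quantities from Linux kernel I/O statistics in the `/proc/diskstats` layout. The abstract does not list the features the authors used, so the choice of features, the helper names (`parse_diskstats`, `interval_features`), and the sample counter values are all assumptions made for illustration only.

```python
from typing import Dict, NamedTuple


class DiskSample(NamedTuple):
    """Subset of the per-device counters exposed in /proc/diskstats."""
    reads_completed: int
    writes_completed: int
    io_ticks_ms: int      # milliseconds during which the device had I/O in flight
    weighted_io_ms: int   # io time weighted by the number of requests in flight


def parse_diskstats(text: str) -> Dict[str, DiskSample]:
    """Parse text in /proc/diskstats layout into one sample per device name."""
    samples: Dict[str, DiskSample] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        samples[fields[2]] = DiskSample(
            reads_completed=int(fields[3]),
            writes_completed=int(fields[7]),
            io_ticks_ms=int(fields[12]),
            weighted_io_ms=int(fields[13]),
        )
    return samples


def interval_features(before: DiskSample, after: DiskSample,
                      interval_ms: float) -> Dict[str, float]:
    """Turn two counter snapshots into per-interval features (hypothetical set)."""
    ios = ((after.reads_completed - before.reads_completed)
           + (after.writes_completed - before.writes_completed))
    return {
        # Fraction of the interval with at least one request outstanding.
        "utilization": (after.io_ticks_ms - before.io_ticks_ms) / interval_ms,
        "iops": ios * 1000.0 / interval_ms,
        # Average number of requests in flight over the interval.
        "avg_queue_depth": (after.weighted_io_ms - before.weighted_io_ms) / interval_ms,
    }


# Made-up snapshots taken one second apart, for illustration only.
BEFORE = "   8       0 sda 100 0 800 50 200 0 1600 150 0 400 200"
AFTER = "   8       0 sda 600 0 4800 300 700 0 5600 900 4 1400 8200"

features = interval_features(parse_diskstats(BEFORE)["sda"],
                             parse_diskstats(AFTER)["sda"],
                             interval_ms=1000.0)
print(features)
```

In this made-up interval the utilization is already 1.0, yet that figure alone does not say whether a device with internal parallelism could absorb more requests. Feature vectors of this kind are what a classifier such as a Support Vector Machine could be trained on to label an interval as saturated or not.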

Original language: English
Title of host publication: CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 2135-2138
Number of pages: 4
ISBN (Electronic): 9781450349185
DOIs: https://doi.org/10.1145/3132847.3133120
Publication status: Published - 2017 Nov 6
Event: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017 - Singapore, Singapore
Duration: 2017 Nov 6 to 2017 Nov 10

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings
Volume: Part F131841

Other

Other: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017
Country: Singapore
City: Singapore
Period: 17/11/6 to 17/11/10

Fingerprint

Modeling
Machine learning
Load balancing
Generalized linear model
Support vector machine
Statistics
An enterprise
Factors
Linux
Kernel
Criticality
Performance measures
Costs

All Science Journal Classification (ASJC) codes

  • Business, Management and Accounting(all)
  • Decision Sciences(all)

Cite this

Kim, J., Park, J., & Park, S. (2017). Machine learning based performance modeling of flash SSDs. In CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management (pp. 2135-2138). (International Conference on Information and Knowledge Management, Proceedings; Vol. Part F131841). Association for Computing Machinery. https://doi.org/10.1145/3132847.3133120