Optimizing hash partitioning for solid state drives

Mincheol Shin, Hongchan Roh, Wonmook Jung, Sang Hyun Park

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The use of flashSSDs has increased rapidly in a wide range of areas due to their superior energy efficiency, shorter access time, and higher bandwidth compared to HDDs. The internal parallelism created by the multiple flash memory packages embedded in a flashSSD is one of the unique features of flashSSDs. Many new DBMS technologies have been developed for flashSSDs, but query processing for flashSSDs has drawn less attention than other DBMS technologies. Hash partitioning is widely used in query processing algorithms to materialize their intermediate results efficiently. In this paper, we propose a novel hash partitioning algorithm that exploits the internal parallelism of flashSSDs. The devised hash partitioning method outperforms the traditional hash partitioning technique regardless of the amount of available main memory and independently of the buffer management strategy (blocked I/O vs. page-sized I/O). We implemented our method based on the source code of the PostgreSQL storage manager. PostgreSQL relation files created by the TPC-H workload were employed in the experiments. Our method was found to be up to 3.55 times faster than the traditional method with blocked I/O, and 2.36 times faster than the traditional method with page-sized I/O.
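
For readers unfamiliar with the baseline, the following is a minimal sketch of conventional hash partitioning with one output buffer per partition. It is not the algorithm proposed in the paper, whose details are not given in this abstract. The function name `hash_partition`, the `buffer_capacity` parameter, and the use of in-memory lists in place of partition files are all assumptions made for illustration. A small `buffer_capacity` loosely corresponds to page-sized I/O and a larger one to blocked I/O.

```python
def hash_partition(records, num_partitions, buffer_capacity, key=lambda r: r[0]):
    """Split records into partitions by hash of their key.

    Illustrative only: each partition has one in-memory output buffer
    that is flushed to its (simulated) partition file when it fills up.
    """
    buffers = [[] for _ in range(num_partitions)]
    # In-memory lists stand in for on-disk partition files (assumption).
    partitions = [[] for _ in range(num_partitions)]
    flushes = 0  # number of simulated write requests

    for rec in records:
        p = hash(key(rec)) % num_partitions
        buffers[p].append(rec)
        if len(buffers[p]) == buffer_capacity:
            # Buffer is full: write it out as one request.
            partitions[p].extend(buffers[p])
            buffers[p].clear()
            flushes += 1

    # Write out whatever remains in partially filled buffers.
    for p, buf in enumerate(buffers):
        if buf:
            partitions[p].extend(buf)
            buf.clear()
            flushes += 1

    return partitions, flushes


if __name__ == "__main__":
    rows = [(k, f"row{k}") for k in range(10)]
    parts, writes = hash_partition(rows, num_partitions=4, buffer_capacity=2)
    for i, part in enumerate(parts):
        print(i, [k for k, _ in part])
    print("write requests:", writes)
```

In this sketch each flush is issued one at a time, so the number of write requests grows as `buffer_capacity` shrinks. That trade-off between memory per partition and request count is what the blocked versus page-sized I/O comparison in the abstract refers to.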

Original language: English
Title of host publication: 2016 Symposium on Applied Computing, SAC 2016
Publisher: Association for Computing Machinery
Pages: 1000-1007
Number of pages: 8
ISBN (Electronic): 9781450337397
DOI: https://doi.org/10.1145/2851613.2851663
Publication status: Published - 2016 Apr 4
Event: 31st Annual ACM Symposium on Applied Computing, SAC 2016 - Pisa, Italy
Duration: 2016 Apr 4 to 2016 Apr 8

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing
Volume: 04-08-April-2016

Other

Other: 31st Annual ACM Symposium on Applied Computing, SAC 2016
Country: Italy
City: Pisa
Period: 16/4/4 to 16/4/8

Fingerprint

Query processing
Flash memory
Energy efficiency
Managers
Bandwidth
Data storage equipment
Experiments

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Shin, M., Roh, H., Jung, W., & Park, S. H. (2016). Optimizing hash partitioning for solid state drives. In 2016 Symposium on Applied Computing, SAC 2016 (pp. 1000-1007). (Proceedings of the ACM Symposium on Applied Computing; Vol. 04-08-April-2016). Association for Computing Machinery. https://doi.org/10.1145/2851613.2851663