External Mergesort for Flash-Based Solid State Drives

Joonhee Lee, Hongchan Roh, Sanghyun Park

Research output: Contribution to journal, Article

4 Citations (Scopus)

Abstract

Mergesort is the most widely-known external sorting algorithm, which is used when the data being sorted do not fit into the available main memory. There have been several attempts to improve mergesort by reducing I/O time, since mergesort is I/O intensive. However, these methods assumed that mergesort runs on hard disk drives (HDDs). Flash-based solid state drives (SSDs) are emerging as next generation storage devices and becoming alternatives to HDDs. SSDs outperform HDDs in access latency, because they have no physical arms to move. In addition, SSDs benefit from their inner structure by exploiting internal parallelism, resulting in high I/O bandwidth. Previous methods for improving mergesort focused on reducing random access cost, which is insignificant on SSDs. In this paper we propose an external mergesort algorithm for SSDs called FMsort. FMsort calculates a block read order which is the order of blocks needed in the merge phase. With a block read order, a number of blocks required during the merge phase are read into main memory via multiple asynchronous I/Os. Our experiments show that FMsort outperforms other mergesort algorithms, at an invisible cost of calculating a block read order.
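
The idea of a block read order can be illustrated with a minimal sketch. This is not the authors' FMsort implementation. It assumes that each sorted run is split into fixed-size blocks, that the first key of every block is known before the merge phase starts, and that a thread pool is an acceptable stand-in for asynchronous device reads. The function names (`split_into_blocks`, `block_read_order`, `merge_with_read_order`) and the `depth` parameter are hypothetical. Under these assumptions, sorting all blocks by their first key gives the order in which a k-way merge first needs them, so several blocks can be requested at once.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(run, block_size):
    """Split one sorted run into consecutive fixed-size blocks."""
    return [run[i:i + block_size] for i in range(0, len(run), block_size)]


def block_read_order(runs_blocks):
    """Return (run_id, block_id) pairs in the order the merge needs them.

    Assumption: a block is first needed when its first key becomes the
    smallest unread key, so sorting by first key yields the read order.
    """
    keys = [(blocks[b][0], r, b)
            for r, blocks in enumerate(runs_blocks)
            for b in range(len(blocks))]
    keys.sort()
    return [(r, b) for _, r, b in keys]


def merge_with_read_order(runs_blocks, depth=4):
    """Merge runs, fetching `depth` blocks at a time in block read order."""
    order = block_read_order(runs_blocks)

    def fetch(rb):
        # Stand-in for a device read of one block (hypothetical helper).
        return runs_blocks[rb[0]][rb[1]]

    heap, out = [], []
    with ThreadPoolExecutor(max_workers=depth) as pool:
        for i in range(0, len(order), depth):
            # Issue several reads concurrently, keeping read-order results.
            batch = list(pool.map(fetch, order[i:i + depth]))
            for block in batch:
                # Every unread record is >= this block's first key, so
                # buffered records up to that key are safe to emit.
                while heap and heap[0] <= block[0]:
                    out.append(heapq.heappop(heap))
                for rec in block:
                    heapq.heappush(heap, rec)
    while heap:
        out.append(heapq.heappop(heap))
    return out


if __name__ == "__main__":
    runs = [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
    blocks = [split_into_blocks(r, 2) for r in runs]
    print(block_read_order(blocks))
    print(merge_with_read_order(blocks, depth=2))
```

The sketch keeps already-read records in a heap rather than in per-run buffers, which simplifies the code but does not bound memory the way a real external merge would.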

Original language: English
Article number: 7145402
Pages (from-to): 1518-1527
Number of pages: 10
Journal: IEEE Transactions on Computers
Volume: 65
Issue number: 5
DOI: 10.1109/TC.2015.2451631
Publication status: Published - 2016 May 1

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics

Cite this

Lee, Joonhee ; Roh, Hongchan ; Park, Sanghyun. / External Mergesort for Flash-Based Solid State Drives. In: IEEE Transactions on Computers. 2016 ; Vol. 65, No. 5. pp. 1518-1527.
@article{3db4dbd6ea3c4ebf8d27f29a8b195a72,
title = "External Mergesort for Flash-Based Solid State Drives",
author = "Joonhee Lee and Hongchan Roh and Sanghyun Park",
year = "2016",
month = "5",
day = "1",
doi = "10.1109/TC.2015.2451631",
language = "English",
volume = "65",
pages = "1518--1527",
journal = "IEEE Transactions on Computers",
issn = "0018-9340",
publisher = "IEEE Computer Society",
number = "5",

}

TY - JOUR

T1 - External Mergesort for Flash-Based Solid State Drives

AU - Lee, Joonhee

AU - Roh, Hongchan

AU - Park, Sanghyun

PY - 2016/5/1

Y1 - 2016/5/1

UR - http://www.scopus.com/inward/record.url?scp=84963861037&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84963861037&partnerID=8YFLogxK

U2 - 10.1109/TC.2015.2451631

DO - 10.1109/TC.2015.2451631

M3 - Article

AN - SCOPUS:84963861037

VL - 65

SP - 1518

EP - 1527

JO - IEEE Transactions on Computers

JF - IEEE Transactions on Computers

SN - 0018-9340

IS - 5

M1 - 7145402

ER -