A benchmark test of spatial big data processing tools and a MapReduce application

Minh Hieu Nguyen, Sungha Ju, Jong Won Ma, Joon Heo

Research output: Contribution to journal › Article

Abstract

Spatial data processing often poses challenges because of the unique characteristics of spatial data, and these challenges grow in spatial big data processing. Some tools have been developed and made available to users; however, they are not widely used by regular users. This paper presents a benchmark test of two notable spatial big data processing tools: GIS Tools for Hadoop and SpatialHadoop. In addition, a MapReduce application is introduced as a baseline to evaluate the effectiveness of the two tools and to determine the impact of the number of maps/reduces on performance. Using these tools and New York taxi trajectory data, we perform a spatial processing task that filters the drop-off locations within the Manhattan area. The performance of the tools is then observed with respect to increasing data size and a changing number of worker nodes. The results of this study are as follows: 1) GIS Tools for Hadoop automatically creates a Quadtree index in each spatial processing operation; therefore, its performance is improved significantly. However, users should be familiar with Java to handle this tool conveniently. 2) SpatialHadoop does not automatically create a spatial index for the data. As a result, its performance is much lower than that of GIS Tools for Hadoop for the same spatial processing task. However, SpatialHadoop achieved the best result when performing a range query. 3) The performance of our MapReduce application increased fourfold after the number of reduces was changed from 1 to 12.
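As a rough illustration of the kind of filtering task the abstract describes, the following minimal sketch keeps only the drop-off points that fall inside a polygon and spreads the matches over a configurable number of reduce partitions. It is not the authors' application and does not use the API of either tool: the function names, the CSV layout (trip_id, dropoff_lon, dropoff_lat), and the coarse quadrilateral standing in for Manhattan are all assumptions made for the example.

```python
import zlib
from collections import defaultdict

# Hypothetical, very coarse quadrilateral (lon, lat) used as a stand-in
# for the Manhattan area. It is NOT the real borough boundary.
AREA = [(-74.02, 40.70), (-73.97, 40.70), (-73.91, 40.88), (-73.96, 40.88)]


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: count polygon edges crossed by a ray going east."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def map_record(line, num_reducers):
    """Map step: emit (partition, record) only for drop-offs inside AREA.

    Assumed input layout: trip_id,dropoff_lon,dropoff_lat
    """
    trip_id, lon, lat = line.strip().split(",")
    lon, lat = float(lon), float(lat)
    if point_in_polygon(lon, lat, AREA):
        # Deterministic hash partitioning over the chosen number of reduces.
        partition = zlib.crc32(trip_id.encode("utf-8")) % num_reducers
        yield partition, (trip_id, lon, lat)


def run_job(lines, num_reducers=1):
    """Simulate shuffle and reduce locally: count matches per partition."""
    partitions = defaultdict(list)
    for line in lines:
        for partition, record in map_record(line, num_reducers):
            partitions[partition].append(record)
    return {partition: len(records) for partition, records in partitions.items()}


if __name__ == "__main__":
    sample = ["a,-73.98,40.75", "b,-73.80,40.75", "c,-73.95,40.80"]
    print(run_job(sample, num_reducers=1))
    print(run_job(sample, num_reducers=12))
```

In this sketch the number of reduces only changes how the matching records are partitioned; the total number of matches stays the same. On a real cluster each partition would be handled by a separate reduce task, which is where a change such as 1 to 12 reduces could affect running time.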

Original language: English
Pages (from-to): 405-414
Number of pages: 10
Journal: Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
Volume: 35
Issue number: 5
Publication status: Published - 2017 Oct

All Science Journal Classification (ASJC) codes

  • Earth and Planetary Sciences(all)
