Robust distributed indexing for locality-skewed workloads

Mu Woong Lee, Seung Won Hwang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Citation (Scopus)

Abstract

Multidimensional indexing is crucial for enabling a fast search over large-scale data. Owing to the unprecedented scale of data, extending such indexing technology has recently gained attention in distributed environments. The goal of existing efforts in distributed indexing has been the localization of queries to data residing at a small number of nodes (i.e., locality-preserving indexing) to minimize communication cost. However, considering that workloads often correlate with data locality, such indexing often generates hotspots. Location-based queries are typically skewed to disaster areas during certain periods of time, e.g., during Hurricane Irene, search traffic increased by more than 2000%. To alleviate such hotspots, we propose workload-balancing as an optimization goal. A cost model analytically supporting the need for load balancing is first developed, then a distributed index that evenly distributes the workload is presented. Our empirical study suggests that hotspots degrading search performance can be effectively alleviated. Specifically, when deployed to Amazon EC2, our proposed scheme showed maximum speed-up of 127.7%. Even in hostile settings where workload is not at all correlated with the search criteria, the proposed scheme's performance is comparable to existing approaches optimized for such settings.

Original languageEnglish
Title of host publicationCIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management
Pages1342-1351
Number of pages10
DOIs
Publication statusPublished - 2012 Dec 19
Event21st ACM International Conference on Information and Knowledge Management, CIKM 2012 - Maui, HI, United States
Duration: 2012 Oct 292012 Nov 2

Other

Other21st ACM International Conference on Information and Knowledge Management, CIKM 2012
CountryUnited States
CityMaui, HI
Period12/10/2912/11/2

    Fingerprint

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Lee, M. W., & Hwang, S. W. (2012). Robust distributed indexing for locality-skewed workloads. In CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management (pp. 1342-1351) https://doi.org/10.1145/2396761.2398438