Delayed-dynamic-selective (DDS) prediction for reducing extreme tail latency in web search

Saehoon Kim, Sameh Elnikety, Yuxiong He, Seungjin Choi, Seung Won Hwang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

28 Citations (Scopus)

Abstract

A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches target improving responsiveness by reducing the tail latency of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel; otherwise, it runs sequentially. These approaches are, however, not accurate enough for reducing a high tail latency when responses are aggregated from many servers, because this requires each server to reduce a substantially higher tail latency (e.g., the 99.99th percentile), which we call extreme tail latency. We propose a prediction framework to reduce the extreme tail latency of search servers. The framework has a unique set of characteristics to predict long-running queries with high recall and improved precision. Specifically, prediction is delayed by a short duration to allow many short-running queries to complete without parallelization, and to allow the predictor to collect a set of dynamic features using runtime information. These features estimate query execution time with high accuracy. We also use them to estimate the prediction errors, so that an uncertain prediction can be overridden by selectively accelerating the query for a higher recall. We evaluate the proposed prediction framework to improve search engine performance in two scenarios using a simulation study: (1) query parallelization on a multicore processor, and (2) query scheduling on a heterogeneous processor. The results show that, for both scenarios, the proposed framework is effective in reducing the extreme tail latency compared to a state-of-the-art predictor because of its higher recall, and it improves server throughput by more than 70% because of its improved precision.
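
The reasoning in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the independence assumption between servers, the `Prediction` type, the function names, and every threshold value are hypothetical and chosen only for illustration. The first function shows why fan-out pushes each server toward a much higher percentile; the second shows one possible shape of a delayed, dynamic, selective decision.

```python
from dataclasses import dataclass


def fanout_tail_fraction(per_server_quantile: float, num_servers: int) -> float:
    """Fraction of queries for which at least one of num_servers exceeds its
    own latency quantile, assuming servers are independent (an assumption)."""
    return 1.0 - per_server_quantile ** num_servers


@dataclass
class Prediction:
    """Output of a hypothetical predictor fed with dynamic (runtime) features."""
    estimated_ms: float        # estimated total execution time
    estimated_error_ms: float  # estimated error of that estimate


def should_accelerate(
    elapsed_ms: float,
    prediction: Prediction,
    delay_ms: float = 10.0,            # hypothetical delay before predicting
    long_threshold_ms: float = 100.0,  # hypothetical "long-running" cutoff
    error_threshold_ms: float = 50.0,  # hypothetical uncertainty cutoff
) -> bool:
    # Delayed: make no decision until the query has run for delay_ms,
    # so short queries finish sequentially without being parallelized.
    if elapsed_ms < delay_ms:
        return False
    # Dynamic: `prediction` is assumed to come from runtime features
    # collected during the delay.
    if prediction.estimated_ms >= long_threshold_ms:
        return True
    # Selective: a "short" prediction with a large estimated error is
    # overridden and the query is accelerated anyway, favoring recall.
    return prediction.estimated_error_ms >= error_threshold_ms
```

With an assumed fan-out of 100 independent servers, `fanout_tail_fraction(0.99, 100)` is about 0.63, whereas `fanout_tail_fraction(0.9999, 100)` is about 0.01. Under that assumption, bounding the aggregated response at roughly its 99th percentile requires each server to control roughly its 99.99th percentile.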

Original language: English
Title of host publication: WSDM 2015 - Proceedings of the 8th ACM International Conference on Web Search and Data Mining
Publisher: Association for Computing Machinery, Inc
Pages: 7-16
Number of pages: 10
ISBN (Electronic): 9781450333177
DOI: https://doi.org/10.1145/2684822.2685289
Publication status: Published - 2015 Feb 2
Event: 8th ACM International Conference on Web Search and Data Mining, WSDM 2015 - Shanghai, China
Duration: 2015 Jan 31 to 2015 Feb 6

Publication series

Name: WSDM 2015 - Proceedings of the 8th ACM International Conference on Web Search and Data Mining

Other

Other: 8th ACM International Conference on Web Search and Data Mining, WSDM 2015
Country: China
City: Shanghai
Period: 15/1/31 to 15/2/6

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications

Cite this

Kim, S., Elnikety, S., He, Y., Choi, S., & Hwang, S. W. (2015). Delayed-dynamic-selective (DDS) prediction for reducing extreme tail latency in web search. In WSDM 2015 - Proceedings of the 8th ACM International Conference on Web Search and Data Mining (pp. 7-16). Association for Computing Machinery, Inc. https://doi.org/10.1145/2684822.2685289