Optimizing top-k queries for middleware access: A unified cost-based approach

Seung Won Hwang, Kevin Chen Chuan Chang

Research output: Contribution to journal › Article

25 Citations (Scopus)

Abstract

This article studies the optimization of top-k queries in middleware. While many assorted algorithms have been proposed, none is generally applicable to a wide range of possible scenarios: existing algorithms lack both the generality to support a wide range of access scenarios and the systematic adaptivity to account for runtime specifics. To fill this critical gap, we take a cost-based optimization approach: by searching at runtime over a space of algorithms, cost-based optimization is general across a wide range of access scenarios, yet adaptive to the specific access costs at runtime. While such optimization has been taken for granted for relational queries from early on, it has been clearly lacking for ranked queries. In this article, we thus identify and address the barriers to realizing such a unified framework. As the first barrier, we need to define a comprehensive space, encompassing all possibly optimal algorithms, to search over. As the second barrier and a conflicting goal, such a space should also be focused enough to enable efficient search. For SQL queries, which are explicitly composed of relational operators, such a space, by definition, consists of schedules of relational operators (or query plans). In contrast, top-k queries do not have logical tasks such as relational operators. We thus define the logical tasks of top-k queries as building blocks to identify a comprehensive and focused space for top-k queries. We then develop efficient search schemes over this space for identifying the optimal algorithm. Our study indicates that our framework not only unifies, but also outperforms, existing algorithms specifically designed for their scenarios.

Original language: English
Article number: 5
Journal: ACM Transactions on Database Systems
Volume: 32
Issue number: 1
DOI: 10.1145/1206049.1206054
Publication status: Published - 2007 Mar 1

Fingerprint

Middleware
Costs

All Science Journal Classification (ASJC) codes

  • Information Systems

Cite this

@article{8ece7fa8f8d341a5927ddb348892c64f,
title = "Optimizing top-k queries for middleware access: A unified cost-based approach",
abstract = "This article studies optimizing top-k queries in middlewares. While many assorted algorithms have been proposed, none is generally applicable to a wide range of possible scenarios. Existing algorithms lack both the generality to support a wide range of access scenarios and the systematic adaptivity to account for runtime specifics. To fulfill this critical lacking, we aim at taking a cost-based optimization approach: By runtime search over a space of algorithms, cost-based optimization is general across a wide range of access scenarios, yet adaptive to the specific access costs at runtime. While such optimization has been taken for granted for relational queries from early on, it has been clearly lacking for ranked queries. In this article, we thus identify and address the barriers of realizing such a unified framework. As the first barrier, we need to define a comprehensive space encompassing all possibly optimal algorithms to search over. As the second barrier and a conflicting goal, such a space should also be focused enough to enable efficient search. For SQL queries that are explicitly composed of relational operators, such a space, by definition, consists of schedules of relational operators (or query plans). In contrast, top-k queries do not have logical tasks, such as relational operators. We thus define the logical tasks of top-k queries as building blocks to identify a comprehensive and focused space for top-k queries. We then develop efficient search schemes over such space for identifying the optimal algorithm. Our study indicates that our framework not only unifies, but also outperforms existing algorithms specifically designed for their scenarios.",
author = "Hwang, {Seung Won} and Chang, {Kevin Chen Chuan}",
year = "2007",
month = "3",
day = "1",
doi = "10.1145/1206049.1206054",
language = "English",
volume = "32",
journal = "ACM Transactions on Database Systems",
issn = "0362-5915",
publisher = "Association for Computing Machinery (ACM)",
number = "1",

}

Optimizing top-k queries for middleware access: A unified cost-based approach. / Hwang, Seung Won; Chang, Kevin Chen Chuan.

In: ACM Transactions on Database Systems, Vol. 32, No. 1, 5, 01.03.2007.

TY - JOUR

T1 - Optimizing top-k queries for middleware access

T2 - A unified cost-based approach

AU - Hwang, Seung Won

AU - Chang, Kevin Chen Chuan

PY - 2007/3/1

Y1 - 2007/3/1

AB - This article studies optimizing top-k queries in middlewares. While many assorted algorithms have been proposed, none is generally applicable to a wide range of possible scenarios. Existing algorithms lack both the generality to support a wide range of access scenarios and the systematic adaptivity to account for runtime specifics. To fulfill this critical lacking, we aim at taking a cost-based optimization approach: By runtime search over a space of algorithms, cost-based optimization is general across a wide range of access scenarios, yet adaptive to the specific access costs at runtime. While such optimization has been taken for granted for relational queries from early on, it has been clearly lacking for ranked queries. In this article, we thus identify and address the barriers of realizing such a unified framework. As the first barrier, we need to define a comprehensive space encompassing all possibly optimal algorithms to search over. As the second barrier and a conflicting goal, such a space should also be focused enough to enable efficient search. For SQL queries that are explicitly composed of relational operators, such a space, by definition, consists of schedules of relational operators (or query plans). In contrast, top-k queries do not have logical tasks, such as relational operators. We thus define the logical tasks of top-k queries as building blocks to identify a comprehensive and focused space for top-k queries. We then develop efficient search schemes over such space for identifying the optimal algorithm. Our study indicates that our framework not only unifies, but also outperforms existing algorithms specifically designed for their scenarios.

UR - http://www.scopus.com/inward/record.url?scp=33947629301&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33947629301&partnerID=8YFLogxK

U2 - 10.1145/1206049.1206054

DO - 10.1145/1206049.1206054

M3 - Article

AN - SCOPUS:33947629301

VL - 32

JO - ACM Transactions on Database Systems

JF - ACM Transactions on Database Systems

SN - 0362-5915

IS - 1

M1 - 5

ER -