Personalized top-k skyline queries in high-dimensional space

Jongwuk Lee, Gae won You, Seung won Hwang

Research output: Contribution to journal (Article)

61 Citations (Scopus)

Abstract

As data of unprecedented scale become accessible, it is increasingly important to help each user identify ideal results of a manageable size. As one such mechanism, skyline queries have recently attracted considerable attention for their intuitive query formulation. This intuitiveness, however, has the side effect of retrieving too many results, especially for high-dimensional data. This paper supports personalized skyline queries that identify "truly interesting" objects based on a user-specific preference and retrieval size k. In particular, we abstract personalized skyline ranking as a dynamic search over skyline subspaces guided by the user-specific preference. We then develop a novel algorithm that navigates the compressed structure itself, reducing storage overhead. Furthermore, we develop novel techniques that interleave cube construction with navigation for scenarios without an a priori structure. Finally, we extend the proposed techniques to user-specific preferences that include equivalence preferences. Our extensive evaluation results validate the effectiveness and efficiency of the proposed algorithms on both real-life and synthetic data.
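
For readers unfamiliar with the term, the following is a minimal sketch of a plain skyline query, not the personalized top-k algorithm proposed in the paper. It assumes that smaller values are better in every dimension; the names `dominates` and `skyline` and the hotel data are hypothetical and chosen only for illustration. The example also hints at why the result set tends to grow as dimensions are added.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every dimension and
    strictly better in at least one (assumption: smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels described by (price, distance to beach).
hotels_2d = [(50, 8), (80, 2), (60, 5), (90, 6), (70, 9)]
print(skyline(hotels_2d))

# The same hotels with a third attribute (noise level) appended.
hotels_3d = [(50, 8, 3), (80, 2, 4), (60, 5, 5), (90, 6, 1), (70, 9, 2)]
print(len(skyline(hotels_3d)))
```

In this toy data, two hotels are dominated in two dimensions, but once the third attribute is added no hotel dominates another, so every hotel is returned. This is the kind of result-size growth that motivates ranking the skyline and returning only k objects.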

Original language: English
Pages (from-to): 45-61
Number of pages: 17
Journal: Information Systems
Volume: 34
Issue number: 1
DOI: 10.1016/j.is.2008.04.004
Publication status: Published - 2009 Mar 1

Fingerprint

Navigation

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Information Systems
  • Software

Cite this

Lee, Jongwuk ; You, Gae won ; Hwang, Seung won. / Personalized top-k skyline queries in high-dimensional space. In: Information Systems. 2009 ; Vol. 34, No. 1. pp. 45-61.
@article{22a6c07a1e5349c48a5e23f49e4999c4,
title = "Personalized top-k skyline queries in high-dimensional space",
abstract = "As data of unprecedented scale become accessible, it is increasingly important to help each user identify ideal results of a manageable size. As one such mechanism, skyline queries have recently attracted considerable attention for their intuitive query formulation. This intuitiveness, however, has the side effect of retrieving too many results, especially for high-dimensional data. This paper supports personalized skyline queries that identify {"}truly interesting{"} objects based on a user-specific preference and retrieval size k. In particular, we abstract personalized skyline ranking as a dynamic search over skyline subspaces guided by the user-specific preference. We then develop a novel algorithm that navigates the compressed structure itself, reducing storage overhead. Furthermore, we develop novel techniques that interleave cube construction with navigation for scenarios without an a priori structure. Finally, we extend the proposed techniques to user-specific preferences that include equivalence preferences. Our extensive evaluation results validate the effectiveness and efficiency of the proposed algorithms on both real-life and synthetic data.",
author = "Jongwuk Lee and You, {Gae won} and Hwang, {Seung won}",
year = "2009",
month = "3",
day = "1",
doi = "10.1016/j.is.2008.04.004",
language = "English",
volume = "34",
pages = "45--61",
journal = "Information Systems",
issn = "0306-4379",
publisher = "Elsevier Limited",
number = "1",

}

TY - JOUR

T1 - Personalized top-k skyline queries in high-dimensional space

AU - Lee, Jongwuk

AU - You, Gae won

AU - Hwang, Seung won

PY - 2009/3/1

Y1 - 2009/3/1

AB - As data of unprecedented scale become accessible, it is increasingly important to help each user identify ideal results of a manageable size. As one such mechanism, skyline queries have recently attracted considerable attention for their intuitive query formulation. This intuitiveness, however, has the side effect of retrieving too many results, especially for high-dimensional data. This paper supports personalized skyline queries that identify "truly interesting" objects based on a user-specific preference and retrieval size k. In particular, we abstract personalized skyline ranking as a dynamic search over skyline subspaces guided by the user-specific preference. We then develop a novel algorithm that navigates the compressed structure itself, reducing storage overhead. Furthermore, we develop novel techniques that interleave cube construction with navigation for scenarios without an a priori structure. Finally, we extend the proposed techniques to user-specific preferences that include equivalence preferences. Our extensive evaluation results validate the effectiveness and efficiency of the proposed algorithms on both real-life and synthetic data.

UR - http://www.scopus.com/inward/record.url?scp=55549137378&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=55549137378&partnerID=8YFLogxK

U2 - 10.1016/j.is.2008.04.004

DO - 10.1016/j.is.2008.04.004

M3 - Article

VL - 34

SP - 45

EP - 61

JO - Information Systems

JF - Information Systems

SN - 0306-4379

IS - 1

ER -