Automatic categorization of query results

Kaushik Chakrabarti, Surajit Chaudhuri, Seung Won Hwang

Research output: Contribution to journal › Conference article

53 Citations (Scopus)

Abstract

Exploratory ad-hoc queries can return too many answers, a phenomenon commonly referred to as "information overload". In this paper, we propose to automatically categorize the results of SQL queries to address this problem. We dynamically generate a labeled, hierarchical category structure: a user can determine whether a category is relevant simply by examining its label, and can then explore just the relevant categories and ignore the remaining ones, thereby reducing information overload. We first develop analytical models to estimate the information overload faced by a user for a given exploration. Based on those models, we formulate the categorization problem as a cost optimization problem and develop heuristic algorithms to compute the min-cost categorization.
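
The idea of treating categorization as cost minimization can be illustrated with a small sketch. This is not the cost model or the algorithm from the paper, which the abstract does not spell out. It is a toy stand-in under stated assumptions: a single-level partition on one attribute, a cost equal to the number of labels read plus the expected number of tuples read, and a hypothetical table `p_explore` giving the probability that the user opens each category. All function names and the sample data are invented for illustration.

```python
from collections import defaultdict


def partition(rows, attr):
    """Group result rows into labeled categories by the value of one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[f"{attr} = {row[attr]}"].append(row)
    return dict(groups)


def toy_cost(categories, p_explore):
    """Toy overload estimate (an assumption, not the paper's model):
    every label is read once, and the tuples of a category are read
    only with the probability that the user opens that category."""
    labels_read = len(categories)
    tuples_read = sum(
        p_explore.get(label, 1.0) * len(members)
        for label, members in categories.items()
    )
    return labels_read + tuples_read


def cheapest_attribute(rows, attrs, p_explore):
    """Pick the attribute whose one-level partition has the lowest toy cost."""
    return min(attrs, key=lambda a: toy_cost(partition(rows, a), p_explore))


# Hypothetical query result and exploration probabilities.
rows = [
    {"city": "Seattle", "beds": 2},
    {"city": "Seattle", "beds": 3},
    {"city": "Seattle", "beds": 3},
    {"city": "Bellevue", "beds": 2},
    {"city": "Bellevue", "beds": 4},
    {"city": "Redmond", "beds": 3},
]
p_explore = {
    "city = Seattle": 0.1, "city = Bellevue": 0.8, "city = Redmond": 0.1,
    "beds = 2": 0.5, "beds = 3": 0.5, "beds = 4": 0.5,
}

best = cheapest_attribute(rows, ["city", "beds"], p_explore)
```

With these made-up numbers, partitioning on `city` costs less than partitioning on `beds`, and less than scanning the flat six-row result, because the user is assumed to skip most city categories after reading their labels. A hierarchical version would apply the same choice recursively inside each category.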

Original language: English
Pages (from-to): 755-766
Number of pages: 12
Journal: Proceedings of the ACM SIGMOD International Conference on Management of Data
Publication status: Published - 2004
Event: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2004 - Paris, France
Duration: 2004 Jun 13 to 2004 Jun 18

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
