Grid-based subspace clustering over data streams

Nam Hun Park, Won Suk Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

21 Citations (Scopus)

Abstract

A real-life data stream usually contains many dimensions, and some dimensional values of its data elements may be missing. In order to effectively extract the on-going change of a data stream with respect to all the subsets of the dimensions of the data stream, a grid-based subspace clustering algorithm is proposed in this paper. Given an n-dimensional data stream, the on-going distribution statistics of data elements in each one-dimensional data space are first monitored by a list of grid-cells called a sibling list. Once a dense grid-cell of a first-level sibling list becomes a dense unit grid-cell, new second-level sibling lists are created as its child nodes in order to trace any cluster in all possible two-dimensional rectangular subspaces. In such a way, a sibling tree grows up to the nth level at most, and a k-dimensional subcluster can be found in the kth level of the sibling tree. The proposed method is comparatively analyzed by a series of experiments to identify its various characteristics.
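The sibling-tree idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `SiblingNode` and `SiblingTree`, the fixed `cell_width`, and the plain count threshold `min_count` are all assumptions made for illustration. The abstract does not say how cells are sized, how density is measured, or how old data is aged out, and it separates a dense grid-cell from a dense unit grid-cell, which this sketch merges into a single test. Child sibling lists here only cover higher-numbered dimensions so that each subspace is traced once, which is also an assumption.

```python
class SiblingNode:
    """One grid-cell of a sibling list (hypothetical structure)."""

    def __init__(self):
        self.count = 0        # data elements that fell into this cell
        self.children = {}    # dimension -> child sibling list {cell index -> SiblingNode}


class SiblingTree:
    """Simplified sibling tree: fixed-width cells and a plain count threshold."""

    def __init__(self, n_dims, cell_width=1.0, min_count=3):
        self.n_dims = n_dims
        self.cell_width = cell_width   # assumed uniform cell size
        self.min_count = min_count     # assumed density threshold
        # One first-level sibling list per dimension.
        self.root = {d: {} for d in range(n_dims)}

    def insert(self, point):
        """Consume one data element; None marks a missing dimensional value."""
        for d in range(self.n_dims):
            self._update(self.root[d], point, d)

    def _update(self, sibling_list, point, dim):
        value = point[dim]
        if value is None:              # missing value: skip this dimension
            return
        cell = int(value // self.cell_width)
        node = sibling_list.setdefault(cell, SiblingNode())
        node.count += 1
        if node.count >= self.min_count:
            # Dense cell: trace the next level in each higher dimension.
            for d in range(dim + 1, self.n_dims):
                self._update(node.children.setdefault(d, {}), point, d)

    def dense_units(self):
        """Return dense cells as tuples of (dimension, cell index) pairs."""
        found = []

        def walk(sibling_list, dim, prefix):
            for cell, node in sibling_list.items():
                if node.count >= self.min_count:
                    unit = prefix + ((dim, cell),)
                    found.append(unit)
                    for d, child in node.children.items():
                        walk(child, d, unit)

        for d, sibling_list in self.root.items():
            walk(sibling_list, d, ())
        return found


tree = SiblingTree(n_dims=3, cell_width=1.0, min_count=2)
for element in [(0.2, 5.1, None), (0.4, 5.3, 9.0), (0.7, 5.8, 2.0), (0.9, 5.5, 7.5)]:
    tree.insert(element)
for unit in tree.dense_units():
    print(len(unit), "dimensional unit:", unit)
```

A unit with k (dimension, cell) pairs sits at the kth level of the tree, which matches the abstract's statement that a k-dimensional subcluster is found at the kth level. Because a child list starts counting only after its parent cell becomes dense, counts at deeper levels lag behind those at the first level in this sketch.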

Original language: English
Title of host publication: CIKM 2007 - Proceedings of the 16th ACM Conference on Information and Knowledge Management
Pages: 801-810
Number of pages: 10
DOIs: https://doi.org/10.1145/1321440.1321551
Publication status: Published - 2007
Event: 16th ACM Conference on Information and Knowledge Management, CIKM 2007 - Lisboa, Portugal
Duration: 2007 Nov 6 → 2007 Nov 9

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 16th ACM Conference on Information and Knowledge Management, CIKM 2007
Country: Portugal
City: Lisboa
Period: 07/11/6 → 07/11/9

All Science Journal Classification (ASJC) codes

  • Decision Sciences(all)
  • Business, Management and Accounting(all)


  • Cite this

    Park, N. H., & Lee, W. S. (2007). Grid-based subspace clustering over data streams. In CIKM 2007 - Proceedings of the 16th ACM Conference on Information and Knowledge Management (pp. 801-810). (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.1145/1321440.1321551