A statistical μ-partitioning method for clustering data streams

Nam Hun Park, Won Suk Lee

Research output: Chapter in Book/Report/Conference proceeding > Chapter

Abstract

A data stream is a massive, unbounded sequence of data elements generated continuously at a rapid rate. For this reason, most algorithms for data streams trade some accuracy in their results for faster processing. This paper proposes a method for clustering a data stream based on statistical μ-partitioning. The multi-dimensional space of a data domain is divided into a set of mutually exclusive, equal-size initial cells. Each cell maintains the distribution statistics of the data elements that fall within its range. Based on these statistics, a dense cell is dynamically split into two mutually exclusive smaller cells, called intermediate cells. The dense sub-range of an initial cell is partitioned recursively in this way until it is reduced to the smallest cell, called a unit cell. A cluster of a data stream is a group of adjacent dense unit cells. The smaller the unit cell is set, the more accurately the resulting set of clusters is identified. The performance of the proposed algorithm is analyzed comparatively through a series of experiments.
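
The partitioning idea in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the class and parameter names (`MuPartition1D`, `unit_width`, `threshold`) are hypothetical, splitting a dense cell at the mean μ of its elements is an assumption suggested by the method's name, and restarting the statistics of the two new cells from zero is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    lo: float           # inclusive lower bound of the cell's range
    hi: float           # exclusive upper bound of the cell's range
    count: int = 0      # number of elements seen in this range
    total: float = 0.0  # running sum, used to derive the mean


class MuPartition1D:
    """Hypothetical 1-D illustration of splitting dense cells on a stream."""

    def __init__(self, lo, hi, n_initial, unit_width, threshold):
        step = (hi - lo) / n_initial
        # Mutually exclusive, equal-size initial cells.
        self.cells = [Cell(lo + i * step, lo + (i + 1) * step)
                      for i in range(n_initial)]
        self.unit_width = unit_width
        self.threshold = threshold  # assumed density criterion (a raw count)

    def is_unit(self, cell):
        # A cell too narrow to yield two children of at least unit width.
        return cell.hi - cell.lo < 2 * self.unit_width

    def add(self, x):
        for i, cell in enumerate(self.cells):
            if cell.lo <= x < cell.hi:
                break
        else:
            return  # element lies outside the data domain
        cell.count += 1
        cell.total += x
        if cell.count >= self.threshold and not self.is_unit(cell):
            mu = cell.total / cell.count
            # Assumption: cut at the mean, clamped so that both children
            # are at least unit_width wide.
            cut = min(max(mu, cell.lo + self.unit_width),
                      cell.hi - self.unit_width)
            # Simplification: the children restart with empty statistics.
            self.cells[i:i + 1] = [Cell(cell.lo, cut), Cell(cut, cell.hi)]

    def clusters(self):
        """Merge runs of adjacent dense unit cells into (lo, hi) ranges."""
        out, prev_dense = [], False
        for c in self.cells:
            dense = self.is_unit(c) and c.count >= self.threshold
            if dense and prev_dense:
                out[-1] = (out[-1][0], c.hi)
            elif dense:
                out.append((c.lo, c.hi))
            prev_dense = dense
        return out


# Usage: a stream concentrated between 5.2 and 5.8 on the domain [0, 16).
mp = MuPartition1D(0.0, 16.0, n_initial=4, unit_width=1.0, threshold=10)
for _ in range(200):
    for x in (5.2, 5.4, 5.6, 5.8):
        mp.add(x)
found = mp.clusters()
```

In this sketch only the cell containing the dense sub-range is refined, while sparse cells keep their initial size. A smaller `unit_width` yields tighter cluster boundaries at the cost of more cells to maintain.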

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: Adnan Yazici, Cevat Sener
Publisher: Springer Verlag
Pages: 292-299
Number of pages: 8
ISBN (Print): 3540204091, 9783540397373
Publication status: Published - 2003

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2869
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)
