estMax: Tracing maximal frequent itemsets over online data streams

Ho Jin Woo, Won Suk Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Citations (Scopus)

Abstract

In general, the number of frequent itemsets in a data set is very large. To represent them more compactly, closed or maximal frequent itemsets (MFIs) are used. However, the characteristics of a data stream make this task more difficult. This paper therefore proposes a method called estMax that can trace the set of MFIs over a data stream. The proposed method maintains the set of frequent itemsets in a prefix tree and extracts all MFIs without any additional superset/subset checking mechanism. Upon processing a newly generated transaction, its longest matched frequent itemsets are marked in the prefix tree as candidates for MFIs. At the same time, if any subset of these newly marked itemsets has already been marked as a candidate MFI, its mark is cleared. With this additional step, the set of MFIs can be extracted at any moment. The performance of the proposed method is analyzed comparatively through a series of experiments to identify its various characteristics.
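
The mark-and-clear step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Node`, `build_tree`, `matched`, `process`, and `candidates` are hypothetical, the set of frequent itemsets is assumed to be fixed and downward closed rather than maintained over a stream, and the longest matches are found here by a simple per-transaction comparison. A mark can also go stale if a superset was marked by an earlier transaction; how estMax deals with that is not stated in the abstract.

```python
from itertools import combinations

class Node:
    """One prefix-tree node; the path from the root spells an itemset."""
    def __init__(self):
        self.children = {}    # item -> Node
        self.marked = False   # True if this itemset is a candidate MFI

def build_tree(frequent_itemsets):
    """Insert each frequent itemset as a path of sorted items (assumed downward closed)."""
    root = Node()
    for itemset in frequent_itemsets:
        node = root
        for item in sorted(itemset):
            node = node.children.setdefault(item, Node())
    return root

def matched(node, items, prefix=()):
    """Yield (itemset, node) for every tree itemset contained in the sorted tuple `items`."""
    for i, item in enumerate(items):
        child = node.children.get(item)
        if child is not None:
            itemset = prefix + (item,)
            yield itemset, child
            yield from matched(child, items[i + 1:], itemset)

def process(root, transaction):
    """Mark the longest matched itemsets and clear marks on their proper subsets."""
    found = dict(matched(root, tuple(sorted(transaction))))
    # Longest matches: matched itemsets with no matched proper superset.
    longest = [x for x in found if not any(set(x) < set(y) for y in found)]
    for x in longest:
        found[x].marked = True
        for k in range(1, len(x)):
            for sub in combinations(x, k):
                node = found.get(sub)
                if node is not None:
                    node.marked = False

def candidates(node, prefix=()):
    """Collect all currently marked itemsets by walking the tree."""
    out = []
    for item, child in sorted(node.children.items()):
        itemset = prefix + (item,)
        if child.marked:
            out.append(itemset)
        out.extend(candidates(child, itemset))
    return out

# Toy data (invented for illustration): frequent itemsets and two transactions.
root = build_tree([("a",), ("b",), ("c",), ("a", "b")])
process(root, {"a", "c"})   # marks ("a",) and ("c",)
process(root, {"a", "b"})   # marks ("a", "b") and clears ("a",)
result = candidates(root)
print(result)  # [('a', 'b'), ('c',)]
```

In this toy run the marked itemsets coincide with the maximal frequent itemsets, and extraction is a single walk over the tree that only reads the marks.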

Original language: English
Title of host publication: Proceedings of the 7th IEEE International Conference on Data Mining, ICDM 2007
Pages: 709-714
Number of pages: 6
DOIs: https://doi.org/10.1109/ICDM.2007.70
Publication status: Published - 2007 Dec 1
Event: 7th IEEE International Conference on Data Mining, ICDM 2007 - Omaha, NE, United States
Duration: 2007 Oct 28 → 2007 Oct 31

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Other

Other: 7th IEEE International Conference on Data Mining, ICDM 2007
Country: United States
City: Omaha, NE
Period: 07/10/28 → 07/10/31

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

Woo, H. J., & Lee, W. S. (2007). estMax: Tracing maximal frequent itemsets over online data streams. In Proceedings of the 7th IEEE International Conference on Data Mining, ICDM 2007 (pp. 709-714). [4470315] (Proceedings - IEEE International Conference on Data Mining, ICDM). https://doi.org/10.1109/ICDM.2007.70
@inproceedings{04ed5391079f452d861ca4ba92a7e524,
title = "estMax: Tracing maximal frequent itemsets over online data streams",
author = "Woo, {Ho Jin} and Lee, {Won Suk}",
year = "2007",
month = "12",
day = "1",
doi = "10.1109/ICDM.2007.70",
language = "English",
isbn = "0769530184",
series = "Proceedings - IEEE International Conference on Data Mining, ICDM",
pages = "709--714",
booktitle = "Proceedings of the 7th IEEE International Conference on Data Mining, ICDM 2007",

}

TY - GEN

T1 - estMax

T2 - Tracing maximal frequent itemsets over online data streams

AU - Woo, Ho Jin

AU - Lee, Won Suk

PY - 2007/12/1

Y1 - 2007/12/1

UR - http://www.scopus.com/inward/record.url?scp=49749089668&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=49749089668&partnerID=8YFLogxK

U2 - 10.1109/ICDM.2007.70

DO - 10.1109/ICDM.2007.70

M3 - Conference contribution

AN - SCOPUS:49749089668

SN - 0769530184

SN - 9780769530185

T3 - Proceedings - IEEE International Conference on Data Mining, ICDM

SP - 709

EP - 714

BT - Proceedings of the 7th IEEE International Conference on Data Mining, ICDM 2007

ER -
