TY - GEN
T1 - estMax
T2 - 7th IEEE International Conference on Data Mining, ICDM 2007
AU - Woo, Ho Jin
AU - Lee, Won Suk
PY - 2007
Y1 - 2007
N2 - In general, the number of frequent itemsets in a data set is very large. To represent them in a more compact notation, closed or maximal frequent itemsets (MFIs) are used. However, the characteristics of a data stream make such a task more difficult. For this purpose, this paper proposes a method called estMax that can trace the set of MFIs over a data stream. The proposed method maintains the set of frequent itemsets in a prefix tree and extracts all MFIs without any additional superset/subset checking mechanism. Upon processing a newly generated transaction, its longest matched frequent itemsets are marked in the prefix tree as candidates for MFIs. At the same time, if any subset of these newly marked itemsets has already been marked as a candidate MFI, that mark is cleared. By employing this additional step, it is possible to extract the set of MFIs at any moment. The performance of the proposed method is analyzed comparatively in a series of experiments to identify its various characteristics.
AB - In general, the number of frequent itemsets in a data set is very large. To represent them in a more compact notation, closed or maximal frequent itemsets (MFIs) are used. However, the characteristics of a data stream make such a task more difficult. For this purpose, this paper proposes a method called estMax that can trace the set of MFIs over a data stream. The proposed method maintains the set of frequent itemsets in a prefix tree and extracts all MFIs without any additional superset/subset checking mechanism. Upon processing a newly generated transaction, its longest matched frequent itemsets are marked in the prefix tree as candidates for MFIs. At the same time, if any subset of these newly marked itemsets has already been marked as a candidate MFI, that mark is cleared. By employing this additional step, it is possible to extract the set of MFIs at any moment. The performance of the proposed method is analyzed comparatively in a series of experiments to identify its various characteristics.
UR - http://www.scopus.com/inward/record.url?scp=49749089668&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=49749089668&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2007.70
DO - 10.1109/ICDM.2007.70
M3 - Conference contribution
AN - SCOPUS:49749089668
SN - 0769530184
SN - 9780769530185
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 709
EP - 714
BT - Proceedings of the 7th IEEE International Conference on Data Mining, ICDM 2007
Y2 - 28 October 2007 through 31 October 2007
ER -