Finding recently frequent itemsets adaptively over online transactional data streams

Joong Hyuk Chang, Won Suk Lee

Research output: Contribution to journal (Article)

28 Citations (Scopus)

Abstract

A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Consequently, the knowledge embedded in a data stream is likely to change as time goes by. Identifying the recent change of a data stream, especially for an online data stream, can provide valuable information for the analysis of the data stream. However, most mining algorithms or frequency approximation algorithms over a data stream do not differentiate the information of recently generated data elements from the obsolete information of old data elements, which may be no longer useful or possibly invalid at present. Therefore, they are not able to extract the recent change of information in a data stream adaptively. This paper proposes a data mining method for finding recently frequent itemsets adaptively over an online transactional data stream. The effect of old transactions on the current mining result of a data stream is diminished by decaying the old occurrences of each itemset as time goes by. Furthermore, several optimization techniques are devised to minimize processing time as well as memory usage. Finally, the performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.
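
The decay idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simple exponential decay applied once per transaction, and the class name, the `decay` and `max_size` parameters, and the lazy-update bookkeeping are all hypothetical choices made for illustration. The paper's optimization techniques for processing time and memory usage are not reproduced here.

```python
from itertools import combinations


class DecayedItemsetCounter:
    """Illustrative decayed counting of itemsets over a transaction stream.

    Assumption: every past occurrence loses weight by a constant factor
    `decay` each time a new transaction arrives.
    """

    def __init__(self, decay=0.9, max_size=2):
        self.decay = decay        # hypothetical per-transaction decay factor in (0, 1]
        self.max_size = max_size  # largest itemset size tracked (kept small for brevity)
        self.tick = 0             # number of transactions seen so far
        self.total = 0.0          # decayed total number of transactions
        self.counts = {}          # itemset -> (decayed count, tick of last update)

    def add(self, transaction):
        """Process one transaction (any iterable of hashable, sortable items)."""
        self.tick += 1
        self.total = self.total * self.decay + 1.0
        items = sorted(set(transaction))
        for size in range(1, self.max_size + 1):
            for itemset in combinations(items, size):
                count, last = self.counts.get(itemset, (0.0, self.tick))
                # Lazily apply the decay accumulated since the last update,
                # then add the new occurrence with full weight.
                count = count * self.decay ** (self.tick - last) + 1.0
                self.counts[itemset] = (count, self.tick)

    def support(self, itemset):
        """Decayed count of the itemset divided by the decayed transaction total."""
        if self.total == 0:
            return 0.0
        key = tuple(sorted(set(itemset)))
        count, last = self.counts.get(key, (0.0, self.tick))
        return count * self.decay ** (self.tick - last) / self.total

    def frequent(self, min_support):
        """Itemsets whose decayed support currently reaches min_support."""
        return {k for k in self.counts if self.support(k) >= min_support}
```

With a decay factor below 1, an itemset that was common early in the stream but no longer occurs sees its support shrink toward zero, while itemsets in recent transactions dominate the result. Setting `decay=1.0` reduces the sketch to ordinary counting over the whole stream.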

Original language: English
Pages (from-to): 849-869
Number of pages: 21
Journal: Information Systems
Volume: 31
Issue number: 8
DOI: 10.1016/j.is.2005.04.001
Publication status: Published - 2006 Dec 1

Fingerprint

Approximation algorithms
Data mining
Steam
Data storage equipment
Processing
Experiments

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Hardware and Architecture

Cite this

@article{e8d3e4efa59a4c74bd06df56ec1cce65,
title = "Finding recently frequent itemsets adaptively over online transactional data streams",
abstract = "A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Consequently, the knowledge embedded in a data stream is likely to change as time goes by. Identifying the recent change of a data stream, especially for an online data stream, can provide valuable information for the analysis of the data stream. However, most mining algorithms or frequency approximation algorithms over a data stream do not differentiate the information of recently generated data elements from the obsolete information of old data elements, which may be no longer useful or possibly invalid at present. Therefore, they are not able to extract the recent change of information in a data stream adaptively. This paper proposes a data mining method for finding recently frequent itemsets adaptively over an online transactional data stream. The effect of old transactions on the current mining result of a data stream is diminished by decaying the old occurrences of each itemset as time goes by. Furthermore, several optimization techniques are devised to minimize processing time as well as memory usage. Finally, the performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.",
author = "Chang, {Joong Hyuk} and Lee, {Won Suk}",
year = "2006",
month = "12",
day = "1",
doi = "10.1016/j.is.2005.04.001",
language = "English",
volume = "31",
pages = "849--869",
journal = "Information Systems",
issn = "0306-4379",
publisher = "Elsevier Limited",
number = "8",

}

Finding recently frequent itemsets adaptively over online transactional data streams. / Chang, Joong Hyuk; Lee, Won Suk.

In: Information Systems, Vol. 31, No. 8, 01.12.2006, p. 849-869.

TY - JOUR

T1 - Finding recently frequent itemsets adaptively over online transactional data streams

AU - Chang, Joong Hyuk

AU - Lee, Won Suk

PY - 2006/12/1

Y1 - 2006/12/1

N2 - A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Consequently, the knowledge embedded in a data stream is likely to change as time goes by. Identifying the recent change of a data stream, especially for an online data stream, can provide valuable information for the analysis of the data stream. However, most mining algorithms or frequency approximation algorithms over a data stream do not differentiate the information of recently generated data elements from the obsolete information of old data elements, which may be no longer useful or possibly invalid at present. Therefore, they are not able to extract the recent change of information in a data stream adaptively. This paper proposes a data mining method for finding recently frequent itemsets adaptively over an online transactional data stream. The effect of old transactions on the current mining result of a data stream is diminished by decaying the old occurrences of each itemset as time goes by. Furthermore, several optimization techniques are devised to minimize processing time as well as memory usage. Finally, the performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.

AB - A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Consequently, the knowledge embedded in a data stream is likely to change as time goes by. Identifying the recent change of a data stream, especially for an online data stream, can provide valuable information for the analysis of the data stream. However, most mining algorithms or frequency approximation algorithms over a data stream do not differentiate the information of recently generated data elements from the obsolete information of old data elements, which may be no longer useful or possibly invalid at present. Therefore, they are not able to extract the recent change of information in a data stream adaptively. This paper proposes a data mining method for finding recently frequent itemsets adaptively over an online transactional data stream. The effect of old transactions on the current mining result of a data stream is diminished by decaying the old occurrences of each itemset as time goes by. Furthermore, several optimization techniques are devised to minimize processing time as well as memory usage. Finally, the performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.

UR - http://www.scopus.com/inward/record.url?scp=33748676847&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33748676847&partnerID=8YFLogxK

U2 - 10.1016/j.is.2005.04.001

DO - 10.1016/j.is.2005.04.001

M3 - Article

AN - SCOPUS:33748676847

VL - 31

SP - 849

EP - 869

JO - Information Systems

JF - Information Systems

SN - 0306-4379

IS - 8

ER -