Efficient bitmap-based indexing of time-based interval sequences

Jong Won Roh, Seung Won Hwang, Byoung Kee Yi

Research output: Contribution to journal, Article

4 Citations (Scopus)

Abstract

In this paper, we discuss similarity searches for time series data represented as interval sequences. For instance, the time series of phone call records can be represented by time-based interval sequences, or T-interval sequences, which consist of the start and end times of the call records. To support an efficient similarity search for such sequences, we address the desirable semantics for similarity measures for the T-interval sequences, observe how existing measures fail to address such semantics, and propose a new measure that satisfies all our semantics. We then propose approximate encoding methods for T-interval sequences. More specifically, we propose two bitmap-based feature extraction methods: (1) a bin-bitmap encoding method that transforms the T-interval sequences into bitmaps of fixed length, and (2) a segmented feature extraction method that takes the longest bitmap sequences of consecutive '1' elements. Finally, we propose two query processing schemes using these bitmap-based approximate representations. We validate the efficiency and effectiveness of our proposed solutions empirically.
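The two feature extraction steps named in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact encoding rules, so everything here is an assumption made for illustration: integer timestamps, half-open intervals `[start, end)`, equal-width bins, and a bit set to 1 whenever any interval overlaps the bin. The function names `bin_bitmap` and `longest_run` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed half-open [start, end), integer time units


def bin_bitmap(intervals: List[Interval], t_min: int,
               bin_width: int, n_bins: int) -> List[int]:
    """Encode intervals as a fixed-length bitmap (illustrative assumption:
    bit i is 1 when any interval overlaps bin i)."""
    bits = [0] * n_bins
    t_max = t_min + bin_width * n_bins
    for start, end in intervals:
        # Clip the interval to the encoded time range.
        s = max(start, t_min)
        e = min(end, t_max)
        if s >= e:
            continue  # interval lies entirely outside the range
        first = (s - t_min) // bin_width
        last = (e - 1 - t_min) // bin_width  # end is exclusive
        for i in range(first, last + 1):
            bits[i] = 1
    return bits


def longest_run(bits: List[int]) -> Tuple[int, int]:
    """Return (start_index, length) of the longest run of consecutive 1s.
    Returns (0, 0) when the bitmap contains no 1s."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, b in enumerate(bits):
        if b:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


# Hypothetical call records, in minutes from the start of the window.
calls = [(5, 20), (25, 40), (100, 110)]
bitmap = bin_bitmap(calls, t_min=0, bin_width=10, n_bins=12)
print(bitmap)               # [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
print(longest_run(bitmap))  # (0, 4)
```

In this sketch the first two calls fall into adjacent bins and merge into one run of four 1s, which shows why the bitmap is only an approximate representation: the gap between minutes 20 and 25 is lost at a bin width of 10.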

Original language: English
Pages (from-to): 38-56
Number of pages: 19
Journal: Information Sciences
Volume: 194
DOI: 10.1016/j.ins.2011.08.013
Publication status: Published - 2012 Jul 1

Fingerprint

Indexing
Semantics
Interval
Feature extraction
Time series
Query processing
Bins
Similarity Search
Feature Extraction
Encoding
Query Processing
Time Series Data
Similarity Measure
Consecutive
Transform

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence

Cite this

Roh, Jong Won ; Hwang, Seung Won ; Yi, Byoung Kee. / Efficient bitmap-based indexing of time-based interval sequences. In: Information Sciences. 2012 ; Vol. 194. pp. 38-56.
@article{102ccac0162849839d65568bbb0eb91c,
title = "Efficient bitmap-based indexing of time-based interval sequences",
author = "Roh, {Jong Won} and Hwang, {Seung Won} and Yi, {Byoung Kee}",
year = "2012",
month = "7",
day = "1",
doi = "10.1016/j.ins.2011.08.013",
language = "English",
volume = "194",
pages = "38--56",
journal = "Information Sciences",
issn = "0020-0255",
publisher = "Elsevier Inc.",

}

TY - JOUR

T1 - Efficient bitmap-based indexing of time-based interval sequences

AU - Roh, Jong Won

AU - Hwang, Seung Won

AU - Yi, Byoung Kee

PY - 2012/7/1

Y1 - 2012/7/1

UR - http://www.scopus.com/inward/record.url?scp=84859158859&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84859158859&partnerID=8YFLogxK

U2 - 10.1016/j.ins.2011.08.013

DO - 10.1016/j.ins.2011.08.013

M3 - Article

VL - 194

SP - 38

EP - 56

JO - Information Sciences

JF - Information Sciences

SN - 0020-0255

ER -