SBASS: Segment based approach for subsequence searches in sequence databases

Sanghyun Park, S. W. Kim, W. W. Chu

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

This paper investigates the subsequence searching problem under time warping in sequence databases. Time warping makes it possible to find sequences with similar changing patterns even when they are of different lengths. Our work is motivated by the observation that subsequence searches slow down quadratically as the total length of data sequences increases. To resolve this problem, we propose the Segment-Based Approach for Subsequence Searching Technique (SBASS), which modifies the similarity measure from time warping to piece-wise time warping and limits the number of possible subsequences to be compared with a query sequence. That is, SBASS divides a data sequence X and a query sequence q into piece-wise segments and compares q with only those subsequences that consist of n consecutive segments of X. Here, n is the number of segments in q. For efficient retrieval of similar subsequences, we extract feature vectors from all data segments, exploiting their monotonically changing properties, and build a multi-dimensional index. Using this index, queries are processed in four steps: (1) index filtering, (2) feature filtering, (3) successor filtering, and (4) post-processing. The effectiveness of our approach is verified through experiments on synthetic data sets.
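The abstract does not define piece-wise time warping, the feature vectors, or the index, so the following is only a minimal Python sketch of the two ideas it does state: splitting sequences into monotonically changing segments, and comparing the query only against runs of n consecutive data segments. The function names are hypothetical, the segmentation rule (maximal non-decreasing or non-increasing runs) is an assumption, and the distance is the classic time warping distance with an absolute-difference base cost rather than the paper's piece-wise variant.

```python
from typing import List, Sequence


def monotonic_segments(seq: Sequence[float]) -> List[List[float]]:
    """Split seq into maximal non-decreasing or non-increasing runs (assumed rule)."""
    segments: List[List[float]] = []
    current: List[float] = []
    direction = 0  # +1 rising, -1 falling, 0 not yet known
    for value in seq:
        if current:
            step = (value > current[-1]) - (value < current[-1])
            if direction and step and step != direction:
                # Direction flipped: close the current run and start a new one.
                segments.append(current)
                current, direction = [], 0
            elif step:
                direction = step
        current.append(value)
    if current:
        segments.append(current)
    return segments


def candidate_subsequences(data: Sequence[float], n: int) -> List[List[float]]:
    """Return only the subsequences made of n consecutive segments of data."""
    segs = monotonic_segments(data)
    return [
        [v for seg in segs[i:i + n] for v in seg]
        for i in range(len(segs) - n + 1)
    ]


def time_warping_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic time warping distance via dynamic programming, O(len(a) * len(b))."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cur.append(abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    data = [5, 1, 2, 3, 2, 1, 4]
    query = [2, 3, 3, 2, 1]
    n = len(monotonic_segments(query))  # number of segments in the query
    for cand in candidate_subsequences(data, n):
        print(cand, time_warping_distance(query, cand))
```

In this toy run the data sequence has 28 contiguous subsequences, but only three candidates consisting of two consecutive segments are compared with the query, which is the kind of reduction the abstract describes.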

Original language: English
Pages (from-to): 37-46
Number of pages: 10
Journal: Computer Systems Science and Engineering
Volume: 22
Issue number: 1-2
Publication status: Published - 2007 Jan 1

Fingerprint

Subsequence
Time Warping
Filtering
Query
Processing
Synthetic Data
Feature Vector
Similarity Measure
Post-processing
Divides
Experiments
Consecutive
Resolve
Retrieval
Experiment

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science(all)

Cite this

@article{80a8e3c3d88b45c9ad5cc43318b95a09,
title = "SBASS: Segment based approach for subsequence searches in sequence databases",
abstract = "This paper investigates the subsequence searching problem under time warping in sequence databases. Time warping makes it possible to find sequences with similar changing patterns even when they are of different lengths. Our work is motivated by the observation that subsequence searches slow down quadratically as the total length of data sequences increases. To resolve this problem, we propose the Segment-Based Approach for Subsequence Searching Technique (SBASS), which modifies the similarity measure from time warping to piece-wise time warping and limits the number of possible subsequences to be compared with a query sequence. That is, SBASS divides a data sequence $\vec{X}$ and a query sequence $\vec{q}$ into piece-wise segments and compares $\vec{q}$ with only those subsequences that consist of n consecutive segments of $\vec{X}$. Here, n is the number of segments in $\vec{q}$. For efficient retrieval of similar subsequences, we extract feature vectors from all data segments, exploiting their monotonically changing properties, and build a multi-dimensional index. Using this index, queries are processed in four steps: (1) index filtering, (2) feature filtering, (3) successor filtering, and (4) post-processing. The effectiveness of our approach is verified through experiments on synthetic data sets.",
author = "Park, Sanghyun and Kim, {S. W.} and Chu, {W. W.}",
year = "2007",
month = "1",
day = "1",
language = "English",
volume = "22",
pages = "37--46",
journal = "Computer Systems Science and Engineering",
issn = "0267-6192",
publisher = "CRL Publishing",
number = "1-2",

}

SBASS: Segment based approach for subsequence searches in sequence databases. / Park, Sanghyun; Kim, S. W.; Chu, W. W.

In: Computer Systems Science and Engineering, Vol. 22, No. 1-2, 01.01.2007, p. 37-46.

TY - JOUR

T1 - SBASS: Segment based approach for subsequence searches in sequence databases

AU - Park, Sanghyun

AU - Kim, S. W.

AU - Chu, W. W.

PY - 2007/1/1

Y1 - 2007/1/1

AB - This paper investigates the subsequence searching problem under time warping in sequence databases. Time warping makes it possible to find sequences with similar changing patterns even when they are of different lengths. Our work is motivated by the observation that subsequence searches slow down quadratically as the total length of data sequences increases. To resolve this problem, we propose the Segment-Based Approach for Subsequence Searching Technique (SBASS), which modifies the similarity measure from time warping to piece-wise time warping and limits the number of possible subsequences to be compared with a query sequence. That is, SBASS divides a data sequence X and a query sequence q into piece-wise segments and compares q with only those subsequences that consist of n consecutive segments of X. Here, n is the number of segments in q. For efficient retrieval of similar subsequences, we extract feature vectors from all data segments, exploiting their monotonically changing properties, and build a multi-dimensional index. Using this index, queries are processed in four steps: (1) index filtering, (2) feature filtering, (3) successor filtering, and (4) post-processing. The effectiveness of our approach is verified through experiments on synthetic data sets.

UR - http://www.scopus.com/inward/record.url?scp=34250822874&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34250822874&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:34250822874

VL - 22

SP - 37

EP - 46

JO - Computer Systems Science and Engineering

JF - Computer Systems Science and Engineering

SN - 0267-6192

IS - 1-2

ER -