Segment-based approach for subsequence searches in sequence databases

Sang Hyun Park, Sang Wook Kim, Wesley W. Chu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

69 Citations (Scopus)

Abstract

This paper investigates the subsequence searching problem under time warping in sequence databases. Time warping makes it possible to find sequences with similar changing patterns even when they are of different lengths. Our work is motivated by the observation that subsequence searches slow down quadratically as the total length of data sequences increases. To resolve this problem, we propose the Segment-Based Approach for Subsequence Searches (SBASS), which modifies the similarity measure from time warping to piece-wise time warping and limits the number of possible subsequences to be compared with a query sequence. For efficient retrieval of similar subsequences, we extract feature vectors from all data segments, exploiting their monotonically changing properties, and build a multi-dimensional index such as an R-tree or R*-tree. Using this index, queries are processed in four steps: 1) index filtering, 2) feature filtering, 3) successor filtering, and 4) post-processing. The effectiveness of our approach is verified through experiments on synthetic data sets.
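The ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SBASS implementation. It assumes the textbook dynamic-programming time warping distance with an absolute-difference base cost, a simple split into maximal monotonic runs, and a hypothetical feature triple (first value, last value, length). The function names are illustrative only, and the index and the four filtering steps are not modelled.

```python
from typing import List, Sequence, Tuple


def count_subsequences(total_length: int) -> int:
    """Number of contiguous subsequences of a sequence of the given length.

    Grows quadratically, which is the slowdown the abstract refers to.
    """
    return total_length * (total_length + 1) // 2


def time_warping_distance(s: Sequence[float], q: Sequence[float]) -> float:
    """Textbook time warping distance; base cost |a - b| is an assumption."""
    n, m = len(s), len(q)
    inf = float("inf")
    # dp[i][j] = cost of the best warping path aligning s[:i] with q[:j]
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - q[j - 1])
            dp[i][j] = cost + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp[n][m]


def monotonic_segments(seq: Sequence[float]) -> List[List[float]]:
    """Split seq into maximal non-decreasing or non-increasing runs."""
    segments: List[List[float]] = []
    current: List[float] = []
    direction = 0  # +1 rising, -1 falling, 0 not yet known
    for x in seq:
        if current:
            step = (x > current[-1]) - (x < current[-1])
            if direction != 0 and step != 0 and step != direction:
                # Direction changed: close the run and start a new one at x.
                segments.append(current)
                current = []
                direction = 0
            elif step != 0:
                direction = step
        current.append(x)
    if current:
        segments.append(current)
    return segments


def segment_feature(segment: Sequence[float]) -> Tuple[float, float, int]:
    """Hypothetical feature vector: (first value, last value, length).

    In a monotonic segment the first and last values are also its extremes.
    """
    return (segment[0], segment[-1], len(segment))
```

Under these assumptions, feature tuples like these are the kind of low-dimensional points that could be stored in a multi-dimensional index, leaving the quadratic-cost distance computation for the few candidates that survive filtering.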

Original language: English
Title of host publication: Proceedings of the 2001 ACM Symposium on Applied Computing, SAC 2001
Publisher: Association for Computing Machinery
Pages: 248-252
Number of pages: 5
ISBN (Print): 1581132875, 9781581132878
DOIs: https://doi.org/10.1145/372202.372334
Publication status: Published - 2001 Mar 1
Event: 2001 ACM Symposium on Applied Computing, SAC 2001 - Las Vegas, United States
Duration: 2001 Mar 11 to 2001 Mar 14

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing

Other

Other: 2001 ACM Symposium on Applied Computing, SAC 2001
Country: United States
City: Las Vegas
Period: 01/3/11 to 01/3/14

Fingerprint

Processing
Experiments

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Park, S. H., Kim, S. W., & Chu, W. W. (2001). Segment-based approach for subsequence searches in sequence databases. In Proceedings of the 2001 ACM Symposium on Applied Computing, SAC 2001 (pp. 248-252). (Proceedings of the ACM Symposium on Applied Computing). Association for Computing Machinery. https://doi.org/10.1145/372202.372334
Park, Sang Hyun ; Kim, Sang Wook ; Chu, Wesley W. / Segment-based approach for subsequence searches in sequence databases. Proceedings of the 2001 ACM Symposium on Applied Computing, SAC 2001. Association for Computing Machinery, 2001. pp. 248-252 (Proceedings of the ACM Symposium on Applied Computing).
@inproceedings{4f016f26d15347278fc801addebe111c,
title = "Segment-based approach for subsequence searches in sequence databases",
abstract = "This paper investigates the subsequence searching problem under time warping in sequence databases. Time warping makes it possible to find sequences with similar changing patterns even when they are of different lengths. Our work is motivated by the observation that subsequence searches slow down quadratically as the total length of data sequences increases. To resolve this problem, we propose the Segment-Based Approach for Subsequence Searches (SBASS), which modifies the similarity measure from time warping to piece-wise time warping and limits the number of possible subsequences to be compared with a query sequence. For efficient retrieval of similar subsequences, we extract feature vectors from all data segments, exploiting their monotonically changing properties, and build a multi-dimensional index such as an R-tree or R*-tree. Using this index, queries are processed in four steps: 1) index filtering, 2) feature filtering, 3) successor filtering, and 4) post-processing. The effectiveness of our approach is verified through experiments on synthetic data sets.",
author = "Park, {Sang Hyun} and Kim, {Sang Wook} and Chu, {Wesley W.}",
year = "2001",
month = "3",
day = "1",
doi = "10.1145/372202.372334",
language = "English",
isbn = "1581132875",
series = "Proceedings of the ACM Symposium on Applied Computing",
publisher = "Association for Computing Machinery",
pages = "248--252",
booktitle = "Proceedings of the 2001 ACM Symposium on Applied Computing, SAC 2001",
}

Park, SH, Kim, SW & Chu, WW 2001, Segment-based approach for subsequence searches in sequence databases. in Proceedings of the 2001 ACM Symposium on Applied Computing, SAC 2001. Proceedings of the ACM Symposium on Applied Computing, Association for Computing Machinery, pp. 248-252, 2001 ACM Symposium on Applied Computing, SAC 2001, Las Vegas, United States, 01/3/11. https://doi.org/10.1145/372202.372334

Segment-based approach for subsequence searches in sequence databases. / Park, Sang Hyun; Kim, Sang Wook; Chu, Wesley W.

Proceedings of the 2001 ACM Symposium on Applied Computing, SAC 2001. Association for Computing Machinery, 2001. p. 248-252 (Proceedings of the ACM Symposium on Applied Computing).

TY - GEN

T1 - Segment-based approach for subsequence searches in sequence databases

AU - Park, Sang Hyun

AU - Kim, Sang Wook

AU - Chu, Wesley W.

PY - 2001/3/1

Y1 - 2001/3/1

N2 - This paper investigates the subsequence searching problem under time warping in sequence databases. Time warping makes it possible to find sequences with similar changing patterns even when they are of different lengths. Our work is motivated by the observation that subsequence searches slow down quadratically as the total length of data sequences increases. To resolve this problem, we propose the Segment-Based Approach for Subsequence Searches (SBASS), which modifies the similarity measure from time warping to piece-wise time warping and limits the number of possible subsequences to be compared with a query sequence. For efficient retrieval of similar subsequences, we extract feature vectors from all data segments, exploiting their monotonically changing properties, and build a multi-dimensional index such as an R-tree or R*-tree. Using this index, queries are processed in four steps: 1) index filtering, 2) feature filtering, 3) successor filtering, and 4) post-processing. The effectiveness of our approach is verified through experiments on synthetic data sets.

AB - This paper investigates the subsequence searching problem under time warping in sequence databases. Time warping makes it possible to find sequences with similar changing patterns even when they are of different lengths. Our work is motivated by the observation that subsequence searches slow down quadratically as the total length of data sequences increases. To resolve this problem, we propose the Segment-Based Approach for Subsequence Searches (SBASS), which modifies the similarity measure from time warping to piece-wise time warping and limits the number of possible subsequences to be compared with a query sequence. For efficient retrieval of similar subsequences, we extract feature vectors from all data segments, exploiting their monotonically changing properties, and build a multi-dimensional index such as an R-tree or R*-tree. Using this index, queries are processed in four steps: 1) index filtering, 2) feature filtering, 3) successor filtering, and 4) post-processing. The effectiveness of our approach is verified through experiments on synthetic data sets.

UR - http://www.scopus.com/inward/record.url?scp=77952145570&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77952145570&partnerID=8YFLogxK

U2 - 10.1145/372202.372334

DO - 10.1145/372202.372334

M3 - Conference contribution

AN - SCOPUS:77952145570

SN - 1581132875

SN - 9781581132878

T3 - Proceedings of the ACM Symposium on Applied Computing

SP - 248

EP - 252

BT - Proceedings of the 2001 ACM Symposium on Applied Computing, SAC 2001

PB - Association for Computing Machinery

ER -

Park SH, Kim SW, Chu WW. Segment-based approach for subsequence searches in sequence databases. In Proceedings of the 2001 ACM Symposium on Applied Computing, SAC 2001. Association for Computing Machinery. 2001. p. 248-252. (Proceedings of the ACM Symposium on Applied Computing). https://doi.org/10.1145/372202.372334