Prefix-querying

An approach for effective subsequence matching under time warping in sequence databases

Sang Hyun Park, Sang Wook Kim, June Suh Cho, Sriram Padmanabhan

Research output: Contribution to conference › Paper

18 Citations (Scopus)

Abstract

This paper discusses an index-based subsequence matching method that supports time warping in large sequence databases. Time warping makes it possible to find sequences with similar patterns even when they are of different lengths. In our earlier work, we suggested an efficient method for whole matching under time warping. That method extracts feature vectors that are invariant to time warping from the data sequences and constructs a multidimensional index on them. For filtering in the feature space, it also applies a lower-bound function that consistently underestimates the time warping distance and satisfies the triangle inequality. In this paper, we incorporate the prefix-querying approach based on sliding windows into the earlier approach. For indexing, we extract a feature vector from every subsequence inside a sliding window and construct a multidimensional index using the feature vectors as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even on large databases. We also prove that our approach does not incur false dismissals. To verify the superiority of our method, we perform extensive experiments. The results reveal that our method achieves significant speedup on real-world S&P 500 stock data and on very large synthetic data.
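The filter-and-refine idea described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from this record: the time warping distance is accumulated with a maximum (L-infinity) rule, the warping-invariant feature vector is taken to be (first, last, greatest, smallest), the lower bound is the largest absolute difference between two feature vectors, every query prefix is treated as qualifying, and a linear scan stands in for the multidimensional index. The function names are hypothetical.

```python
def dtw_max(s, q):
    """Time warping distance by dynamic programming.
    Assumption: the cost of a warping path is its largest element-wise gap."""
    n, m = len(s), len(q)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = max(abs(s[i - 1] - q[j - 1]), best_prev)
    return d[n][m]


def features(seq):
    """Assumed warping-invariant feature vector: (first, last, greatest, smallest)."""
    return (seq[0], seq[-1], max(seq), min(seq))


def lower_bound(fa, fb):
    """Largest per-feature gap; never exceeds dtw_max of the underlying sequences."""
    return max(abs(a - b) for a, b in zip(fa, fb))


def candidate_windows(data, query, w, eps):
    """Yield start offsets of length-w windows that survive the filtering step.
    A linear scan replaces the multidimensional index (assumption)."""
    prefix_features = [features(query[:k]) for k in range(1, len(query) + 1)]
    for start in range(len(data) - w + 1):
        fw = features(data[start:start + w])
        if any(lower_bound(fw, fp) <= eps for fp in prefix_features):
            yield start


if __name__ == "__main__":
    data = [9, 9, 1, 2, 2, 3, 9, 9]
    query = [1, 2, 3]
    for start in candidate_windows(data, query, w=3, eps=0.5):
        # Refinement: compute the actual distance for a candidate region.
        print(start, dtw_max(data[start:start + 4], query))
```

Under these assumptions the filter cannot discard a true match: a warping path that aligns a data subsequence with the query also aligns the first w elements of that subsequence with some query prefix at no greater cost, and the lower bound never exceeds that cost. Candidates still have to be verified with the actual distance, since the filter may admit false alarms.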

Original language: English
Pages: 255-262
Number of pages: 8
Publication status: Published - 2001 Dec 1
Event: Proceedings of the 2001 ACM CIKM: 10th International Conference on Information and Knowledge Management - Atlanta, GA, United States
Duration: 2001 Nov 5 to 2001 Nov 10

Other

Other: Proceedings of the 2001 ACM CIKM: 10th International Conference on Information and Knowledge Management
Country: United States
City: Atlanta, GA
Period: 01/11/5 to 01/11/10

Fingerprint

Data base
Warping
Sliding window
Indexing
Query
Experiment
Query processing
Lower bounds

All Science Journal Classification (ASJC) codes

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Cite this

Park, S. H., Kim, S. W., Cho, J. S., & Padmanabhan, S. (2001). Prefix-querying: An approach for effective subsequence matching under time warping in sequence databases. 255-262. Paper presented at Proceedings of the 2001 ACM CIKM: 10th International Conference on Information and Knowledge Management, Atlanta, GA, United States.

