A practical method for approximate subsequence search in DNA databases

Jung Im Won, Sang Kyoon Hong, Jee Hee Yoon, Sanghyun Park, Sang Wook Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

In this paper, we propose an accurate and efficient method for approximate subsequence search in large DNA databases. The proposed method adopts a binary trie as its primary index structure and stores in it all the window subsequences extracted from a DNA sequence. For approximate subsequence search, it traverses the binary trie in a breadth-first fashion and uses dynamic programming to retrieve all matching subsequences along the traversed paths. However, because the method stores only window subsequences of a predetermined length, it incurs a large post-processing cost for long query sequences. To overcome this problem, we divide a query sequence into shorter pieces, search for each piece separately, and then merge the results.
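The search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 2-bit base encoding, the `Node` class, the function names `build_trie` and `search`, and the choice to record window start offsets at every base boundary are all assumptions made for illustration. The sketch indexes every window of length `w` in a binary trie, then walks the trie breadth-first while carrying one edit-distance row per node, reporting each path prefix whose distance to the whole query is at most `eps`.

```python
from collections import deque

# Assumed 2-bit encoding of bases; the abstract does not specify one.
BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}
BASES = {bits: base for base, bits in BITS.items()}


class Node:
    """Binary trie node: one child per bit (hypothetical layout)."""
    __slots__ = ("children", "starts")

    def __init__(self):
        self.children = [None, None]
        self.starts = []  # start offsets of windows passing through this node


def build_trie(seq, w):
    """Insert every length-w window of seq into a binary trie."""
    root = Node()
    for start in range(len(seq) - w + 1):
        node = root
        for base in seq[start:start + w]:
            for bit in BITS[base]:
                if node.children[bit] is None:
                    node.children[bit] = Node()
                node = node.children[bit]
            node.starts.append(start)  # recorded only at base boundaries
    return root


def search(root, query, eps):
    """Return (start, length) pairs whose subsequence is within edit
    distance eps of query, found by breadth-first traversal."""
    m = len(query)
    hits = set()
    # Row for the empty path prefix: distance to query[:j] is j.
    queue = deque([(root, list(range(m + 1)), 0)])
    while queue:
        node, row, depth = queue.popleft()
        for hi in (0, 1):
            mid = node.children[hi]
            if mid is None:
                continue
            for lo in (0, 1):
                child = mid.children[lo]
                if child is None:
                    continue
                base = BASES[(hi, lo)]
                # Standard edit-distance recurrence, one row per base.
                new_row = [row[0] + 1]
                for j in range(1, m + 1):
                    cost = 0 if query[j - 1] == base else 1
                    new_row.append(min(row[j] + 1,
                                       new_row[j - 1] + 1,
                                       row[j - 1] + cost))
                if new_row[m] <= eps:
                    hits.update((s, depth + 1) for s in child.starts)
                # Row minima never decrease, so this pruning is safe.
                if min(new_row) <= eps:
                    queue.append((child, new_row, depth + 1))
    return hits


if __name__ == "__main__":
    trie = build_trie("ACGTACGT", 4)
    print(sorted(search(trie, "ACG", 0)))
```

Because only windows of length `w` are indexed, a match can be at most `w` bases long in this sketch, which is the limitation the abstract addresses by splitting long queries. That splitting and merging step is not shown here, since the abstract does not describe how the partial results are combined.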

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 11th Pacific-Asia Conference, PAKDD 2007, Proceedings
Publisher: Springer Verlag
Pages: 921-931
Number of pages: 11
ISBN (Print): 9783540717003
DOI: https://doi.org/10.1007/978-3-540-71701-0_103
Publication status: Published - 2007
Event: 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007 - Nanjing, China
Duration: 2007 May 22 → 2007 May 25

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4426 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007
Country: China
City: Nanjing
Period: 07/5/22 → 07/5/25

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Won, J. I., Hong, S. K., Yoon, J. H., Park, S., & Kim, S. W. (2007). A practical method for approximate subsequence search in DNA databases. In Advances in Knowledge Discovery and Data Mining - 11th Pacific-Asia Conference, PAKDD 2007, Proceedings (pp. 921-931). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4426 LNAI). Springer Verlag. https://doi.org/10.1007/978-3-540-71701-0_103