OMPPM: Online multiple palindrome pattern matching

Hwee Kim, Yo-Sub Han

Research output: Contribution to journal › Article

Abstract

Motivation: A palindrome is a string that reads the same forward and backward. Finding palindromic substructures is important in DNA, RNA or protein sequence analysis. We say that two strings of the same length are pal-equivalent if, for each possible centre, they have the same length of the maximal palindrome. Given a text T of length n and a pattern P of length m, we study the palindrome pattern matching problem that finds all indices i such that P and T[i-m+1:i] are pal-equivalent.

Results: We first solve the online palindrome pattern matching problem in O(m^2) preprocessing time and O(mn) query time using O(m^2) space. We then extend the problem for multiple patterns and solve the online multiple palindrome pattern matching problem in O(m_k M) preprocessing time and O(m_k n + c) query time using O(m_k M) space, where M is the sum of all pattern lengths, m_k is the longest pattern length and c is the number of pattern occurrences.
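To make the definition of pal-equivalence concrete, the following is a minimal brute-force sketch in Python. It is not the authors' online algorithm and does not reach the bounds stated above; it takes roughly O(n m^2) time in the worst case. The function names (`max_pal_lengths`, `pal_equivalent`, `naive_pal_match`) are hypothetical, and the sketch assumes that "each possible centre" covers both character positions and the gaps between adjacent characters, giving 2m-1 centres for a string of length m.

```python
def max_pal_lengths(s):
    """Return the maximal palindrome length at each of the 2*len(s)-1 centres.

    Assumption: even centre indices sit on a character, odd ones sit in the
    gap between two adjacent characters.
    """
    n = len(s)
    lengths = []
    for c in range(2 * n - 1):
        lo, hi = c // 2, (c + 1) // 2
        # Expand outwards while the two ends match.
        while lo >= 0 and hi < n and s[lo] == s[hi]:
            lo -= 1
            hi += 1
        lengths.append(hi - lo - 1)
    return lengths


def pal_equivalent(a, b):
    """True if a and b have equal length and equal maximal palindromes per centre."""
    return len(a) == len(b) and max_pal_lengths(a) == max_pal_lengths(b)


def naive_pal_match(text, pattern):
    """Return all 1-based end indices i such that text[i-m+1 .. i] is
    pal-equivalent to pattern (brute force, not the paper's method)."""
    m, n = len(pattern), len(text)
    target = max_pal_lengths(pattern)
    matches = []
    for end in range(m, n + 1):
        window = text[end - m:end]
        if max_pal_lengths(window) == target:
            matches.append(end)
    return matches


if __name__ == "__main__":
    print(max_pal_lengths("ACA"))            # [1, 0, 3, 0, 1]
    print(pal_equivalent("AAB", "CCD"))      # True
    print(naive_pal_match("GTGACA", "ACA"))  # [3, 6]
```

Note that pal-equivalence compares palindromic structure rather than characters: "GTG" matches the pattern "ACA" even though the two strings share no symbols.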

Original language: English
Pages (from-to): 1151-1157
Number of pages: 7
Journal: Bioinformatics
Volume: 32
Issue number: 8
DOI: 10.1093/bioinformatics/btv738
Publication status: Published - 2016 Apr 15

Fingerprint

Palindrome
Pattern matching
Matching Problem
RNA Sequence Analysis
RNA
Preprocessing
Protein Sequence Analysis
DNA
Strings
DNA Sequence Analysis
Query
Proteins
Sequence Analysis
Protein Sequence
Substructure

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Medicine (all)
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Kim, Hwee; Han, Yo-Sub. / OMPPM: Online multiple palindrome pattern matching. In: Bioinformatics. 2016; Vol. 32, No. 8. pp. 1151-1157.
@article{9967b1059c1b474580ee65c5acba3af3,
title = "OMPPM: Online multiple palindrome pattern matching",
abstract = "Motivation: A palindrome is a string that reads the same forward and backward. Finding palindromic substructures is important in DNA, RNA or protein sequence analysis. We say that two strings of the same length are pal-equivalent if, for each possible centre, they have the same length of the maximal palindrome. Given a text T of length n and a pattern P of length m, we study the palindrome pattern matching problem that finds all indices i such that P and T[i-m+1:i] are pal-equivalent. Results: We first solve the online palindrome pattern matching problem in $O(m^2)$ preprocessing time and $O(mn)$ query time using $O(m^2)$ space. We then extend the problem for multiple patterns and solve the online multiple palindrome pattern matching problem in $O(m_k M)$ preprocessing time and $O(m_k n + c)$ query time using $O(m_k M)$ space, where $M$ is the sum of all pattern lengths, $m_k$ is the longest pattern length and $c$ is the number of pattern occurrences.",
author = "Hwee Kim and Yo-Sub Han",
year = "2016",
month = "4",
day = "15",
doi = "10.1093/bioinformatics/btv738",
language = "English",
volume = "32",
pages = "1151--1157",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "8",

}

