On the linear number of matching substrings

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

We study the number of matching substrings in the pattern matching problem. In general, the number of matching substrings can be quadratic in the size of a given text. A linearizing restriction limits the output to at most a linear number of matching substrings. We first explore two well-known linearizing restriction rules, the longest-match rule and the shortest-match substring search rule, and show that both rules give the same result when the pattern is an infix-free set, even though they have different semantics. Then we introduce a new linearizing restriction, the leftmost non-overlapping match rule, which is suitable for find-and-replace operations in text searching, and propose an efficient algorithm for the new rule when the pattern is described by a regular expression. We also examine the problem of obtaining the maximal number of non-overlapping matching substrings.
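
As a rough illustration of the counts mentioned in the abstract (not the paper's algorithm), the Python sketch below enumerates every matching substring by brute force. It then contrasts that with Python's `re.finditer`, which scans left to right and reports non-overlapping matches, and with the standard earliest-end greedy selection, which yields a largest set of non-overlapping spans. The function names and the pattern `a+` are hypothetical choices made for illustration. The abstract does not say how the paper chooses among matches that start at the same position, so the `re.finditer` behaviour is only an analogy.

```python
import re

Span = tuple[int, int]  # half-open interval [start, end) into the text


def all_matching_substrings(pattern: str, text: str) -> list[Span]:
    """Brute force: every non-empty substring that fully matches the pattern."""
    compiled = re.compile(pattern)
    return [
        (i, j)
        for i in range(len(text))
        for j in range(i + 1, len(text) + 1)
        if compiled.fullmatch(text, i, j)
    ]


def leftmost_non_overlapping(pattern: str, text: str) -> list[Span]:
    """Non-overlapping matches found by a left-to-right scan (re.finditer).

    Assumption: Python's own choice of match at each start position is used;
    this may differ from the rule defined in the paper.
    """
    return [m.span() for m in re.finditer(pattern, text) if m.end() > m.start()]


def max_non_overlapping(spans: list[Span]) -> list[Span]:
    """Earliest-end greedy: a largest subset of pairwise non-overlapping spans."""
    chosen: list[Span] = []
    last_end = 0
    for start, end in sorted(spans, key=lambda s: (s[1], s[0])):
        if start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen


if __name__ == "__main__":
    text, pattern = "aaaa", "a+"
    spans = all_matching_substrings(pattern, text)
    print(len(spans))                               # n(n+1)/2 spans for n = 4
    print(leftmost_non_overlapping(pattern, text))  # a single greedy match
    print(max_non_overlapping(spans))               # one span per character
```

For the text `aaaa` and the pattern `a+`, every one of the 10 non-empty substrings matches, which is the quadratic case. Any set of pairwise non-overlapping spans, by contrast, contains at most one span per text position.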

Original language: English
Pages (from-to): 715-728
Number of pages: 14
Journal: Journal of Universal Computer Science
Volume: 16
Issue number: 5
Publication status: Published - 2010 May 21

Fingerprint

Pattern matching
Semantics
Restriction
Regular Expressions
Matching Problem
Efficient Algorithms

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

@article{115c54bbc3044a1592b97dabcd9b1a97,
title = "On the linear number of matching substrings",
abstract = "We study the number of matching substrings in the pattern matching problem. In general, there can be a quadratic number of matching substrings in the size of a given text. The linearizing restriction enables to find at most a linear number of matching substrings. We first explore two well-known linearizing restriction rules, the longest-match rule and the shortest-match substring search rule, and show that both rules give the same result when a pattern is an infix-free set even though they have different semantics. Then, we introduce a new linearizing restriction, the leftmost non-overlapping match rule that is suitable for find-and-replace operations in text searching, and propose an efficient algorithm for the new rule when a pattern is described by a regular expression. We also examine the problem of obtaining the maximal number of non-overlapping matching substrings.",
author = "Han, {Yo Sub}",
year = "2010",
month = "5",
day = "21",
language = "English",
volume = "16",
pages = "715--728",
journal = "Journal of Universal Computer Science",
issn = "0948-695X",
publisher = "Springer Verlag",
number = "5",

}

On the linear number of matching substrings. / Han, Yo Sub.

In: Journal of Universal Computer Science, Vol. 16, No. 5, 21.05.2010, p. 715-728.

TY - JOUR

T1 - On the linear number of matching substrings

AU - Han, Yo Sub

PY - 2010/5/21

Y1 - 2010/5/21

N2 - We study the number of matching substrings in the pattern matching problem. In general, there can be a quadratic number of matching substrings in the size of a given text. The linearizing restriction enables to find at most a linear number of matching substrings. We first explore two well-known linearizing restriction rules, the longest-match rule and the shortest-match substring search rule, and show that both rules give the same result when a pattern is an infix-free set even though they have different semantics. Then, we introduce a new linearizing restriction, the leftmost non-overlapping match rule that is suitable for find-and-replace operations in text searching, and propose an efficient algorithm for the new rule when a pattern is described by a regular expression. We also examine the problem of obtaining the maximal number of non-overlapping matching substrings.

UR - http://www.scopus.com/inward/record.url?scp=77952362189&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77952362189&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:77952362189

VL - 16

SP - 715

EP - 728

JO - Journal of Universal Computer Science

JF - Journal of Universal Computer Science

SN - 0948-695X

IS - 5

ER -