Alignment distance of regular tree languages

Yo-Sub Han, Sang Ki Ko

Research output: Contribution to journal › Article

Abstract

We consider the tree alignment distance problem between a tree and a regular tree language. The tree alignment distance is an alternative to the tree edit distance: we construct an optimal alignment between two trees and compute its cost instead of directly computing the minimum cost of tree edits. The alignment distance is crucial for understanding the structural similarity between trees. In particular, we consider the following problem: given a tree t and a tree automaton recognizing a regular tree language L, find the tree in L most similar to t under the tree alignment metric. Regular tree languages are commonly used in practice, for example in XML schemas and bioinformatics. We propose an O(mn)-time algorithm for computing the (ordered) alignment distance between t and L when the maximum degree of t and of the trees in L is bounded by a constant, and an O(mn^2)-time algorithm when the maximum degree of the trees in L is not bounded, where m is the size of t and n is the size of a finite tree automaton for L. We also study the case where trees are not necessarily ordered, and show that the time complexity remains O(mn) if the maximum degree is bounded, whereas the problem is MAX SNP-hard otherwise.
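
The problem stated in the abstract can be written compactly as follows. This is only a notational sketch: the symbols d, A, and L(A) are assumed here for illustration and are not taken from the paper, and the table simply restates the four results listed in the abstract.

```latex
% Notation assumed for illustration (not from the paper):
%   d(t, t')  alignment distance between trees t and t'
%   A         finite tree automaton recognizing L, i.e. L = L(A)
%   m = |t|   size of the input tree
%   n = |A|   size of the tree automaton
\[
  d(t, L) \;=\; \min_{t' \in L} d(t, t'), \qquad L = L(A)
\]

% Results as stated in the abstract.
\begin{tabular}{lll}
  Trees     & Maximum degree                         & Stated result  \\ \hline
  ordered   & bounded by a constant ($t$ and $L$)    & $O(mn)$ time   \\
  ordered   & not bounded for trees in $L$           & $O(mn^2)$ time \\
  unordered & bounded                                & $O(mn)$ time   \\
  unordered & not bounded                            & MAX SNP-hard
\end{tabular}
```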

Original language: English
Pages (from-to): 127-137
Number of pages: 11
Journal: Theoretical Computer Science
Volume: 787
DOI: 10.1016/j.tcs.2019.06.022
Publication status: Published - 2019 Oct 1

Fingerprint

Alignment
Maximum Degree
Bioinformatics
XML
Tree Automata
Language
Costs
XML Schema
Edit Distance
Structural Similarity
Computing
Finite Automata
Time Complexity

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

@article{38c46843050549d3944c5865f583bae5,
title = "Alignment distance of regular tree languages",
author = "Han, Yo-Sub and Ko, {Sang Ki}",
year = "2019",
month = "10",
day = "1",
doi = "10.1016/j.tcs.2019.06.022",
language = "English",
volume = "787",
pages = "127--137",
journal = "Theoretical Computer Science",
issn = "0304-3975",
publisher = "Elsevier",
}

Alignment distance of regular tree languages. / Han, Yo-Sub; Ko, Sang Ki.

In: Theoretical Computer Science, Vol. 787, 01.10.2019, p. 127-137.

TY - JOUR

T1 - Alignment distance of regular tree languages

AU - Han, Yo-Sub

AU - Ko, Sang Ki

PY - 2019/10/1

Y1 - 2019/10/1

UR - http://www.scopus.com/inward/record.url?scp=85070659879&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070659879&partnerID=8YFLogxK

U2 - 10.1016/j.tcs.2019.06.022

DO - 10.1016/j.tcs.2019.06.022

M3 - Article

AN - SCOPUS:85070659879

VL - 787

SP - 127

EP - 137

JO - Theoretical Computer Science

JF - Theoretical Computer Science

SN - 0304-3975

ER -