Clustering of XML schemas for information integration

Tae Woo Rhim, Kyong H.O. Lee

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

As a prerequisite for information integration, this paper presents an efficient method for clustering XML schemas. The method first computes pairwise similarities among schemas. Similarity is defined by the size of the common structure shared by two schemas, under the assumption that schemas that cost less to integrate are more similar. Specifically, we extract the one-to-one matchings between paths that have the largest number of corresponding elements. Finally, a hierarchical clustering method is applied to the similarity values. Experimental results on many XML schemas show that the method outperforms previous work in terms of clustering accuracy, clustering rate, clustering quality, and time complexity.
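
The pipeline described in the abstract (path matching, similarity, hierarchical clustering) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes each schema is reduced to a list of root-to-leaf paths of element names, counts "corresponding elements" with a longest common subsequence, picks one-to-one path matchings greedily, and uses average-linkage clustering with a stop threshold. The function names, the normalization by the larger schema size, and the sample schemas are all hypothetical choices made for illustration.

```python
from itertools import combinations

def lcs_len(a, b):
    """Length of the longest common subsequence of two paths (lists of names)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def schema_similarity(s1, s2):
    """Greedy one-to-one path matching (an assumption, not the paper's procedure).

    Score = total matched elements / element count of the larger schema.
    """
    pairs = sorted(
        ((lcs_len(p, q), i, j) for i, p in enumerate(s1) for j, q in enumerate(s2)),
        key=lambda t: -t[0],
    )
    used1, used2, common = set(), set(), 0
    for score, i, j in pairs:
        # Take the best remaining pair whose paths are both still unmatched.
        if score and i not in used1 and j not in used2:
            used1.add(i)
            used2.add(j)
            common += score
    size = max(sum(map(len, s1)), sum(map(len, s2)))
    return common / size if size else 0.0

def cluster(schemas, threshold):
    """Average-linkage agglomerative clustering over pairwise similarities.

    Merging stops once the most similar pair of clusters falls below threshold.
    """
    names = list(schemas)
    sim = {frozenset((a, b)): schema_similarity(schemas[a], schemas[b])
           for a, b in combinations(names, 2)}
    clusters = [[n] for n in names]

    def link(c1, c2):
        return sum(sim[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, i, j = max((link(clusters[i], clusters[j]), i, j)
                         for i, j in combinations(range(len(clusters)), 2))
        if best < threshold:
            break
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    return clusters

# Hypothetical sample schemas, each given as root-to-leaf element paths.
SAMPLE = {
    "book": [["book", "title"], ["book", "author", "name"]],
    "novel": [["book", "title"], ["book", "author", "name"], ["book", "year"]],
    "order": [["order", "item", "price"], ["order", "customer"]],
}

if __name__ == "__main__":
    print(cluster(SAMPLE, threshold=0.5))
```

With the sample data, the two book-like schemas share most of their paths and merge into one cluster, while the order schema shares no elements with them and stays separate. Greedy matching is used here only to keep the sketch short; an optimal assignment method could be substituted.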

Original language: English
Pages (from-to): 3-13
Number of pages: 11
Journal: Journal of Computer Information Systems
Volume: 46
Issue number: 2
Publication status: Published - 2005 Dec 1

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Education
  • Computer Networks and Communications
