Designing and developing an automatic interactive keyphrase extraction system with Unified Modeling Language (UML)

Min Song, Il Yeol Song, Xiaohua Hu

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Designing and developing a system that helps users digest and understand available information has been a difficult challenge. In this paper, we discuss the design and development of an automatic interactive keyphrase extraction system, called KPSpotter, which can process data in various formats, such as XML, HTML, and plain text, over the Internet. KPSpotter combines the Information Gain data mining measure with several Natural Language Processing (NLP) techniques, such as the Part of Speech (POS) technique and First Occurrence of Term. To improve extraction accuracy, WordNet is incorporated into KPSpotter. In designing and developing KPSpotter, we used the Unified Modeling Language (UML). UML modeling helps formalize the preliminary analysis model and supports iterative system design and development. We also tested system performance experimentally by comparing keyphrases extracted by KPSpotter with those extracted by KEA, a well-known naïve Bayesian-based keyphrase extraction system. The experiments show that KPSpotter outperforms KEA in most test cases.

Original language: English
Pages (from-to): 367-372
Number of pages: 6
Journal: Proceedings of the ASIST Annual Meeting
Volume: 41
DOI: 10.1002/meet.1450410143
Publication status: Published - 2004 Nov 1

Fingerprint

Unified Modeling Language
HTML
Processing
XML
Data mining
Experiments
Systems analysis
Internet
formalization
experiment
model analysis
available information
Testing
language
performance

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Library and Information Sciences

Cite this

@article{07e79d8771c74e398adc148b837c0e25,
title = "Designing and developing an automatic interactive keyphrase extraction system with Unified Modeling Language (UML)",
abstract = "Designing and developing a system that helps users digest and understand available information has been a difficult challenge. In this paper, we discuss the design and development of an automatic interactive keyphrase extraction system, called KPSpotter, which can process data in various formats, such as XML, HTML, and plain text, over the Internet. KPSpotter combines the Information Gain data mining measure with several Natural Language Processing (NLP) techniques, such as the Part of Speech (POS) technique and First Occurrence of Term. To improve extraction accuracy, WordNet is incorporated into KPSpotter. In designing and developing KPSpotter, we used the Unified Modeling Language (UML). UML modeling helps formalize the preliminary analysis model and supports iterative system design and development. We also tested system performance experimentally by comparing keyphrases extracted by KPSpotter with those extracted by KEA, a well-known na{\"i}ve Bayesian-based keyphrase extraction system. The experiments show that KPSpotter outperforms KEA in most test cases.",
author = "Song, Min and Song, {Il Yeol} and Hu, Xiaohua",
year = "2004",
month = "11",
day = "1",
doi = "10.1002/meet.1450410143",
language = "English",
volume = "41",
pages = "367--372",
journal = "Proceedings of the ASIST Annual Meeting",
issn = "1550-8390",
publisher = "Learned Information",

}

Designing and developing an automatic interactive keyphrase extraction system with Unified Modeling Language (UML). / Song, Min; Song, Il Yeol; Hu, Xiaohua.

In: Proceedings of the ASIST Annual Meeting, Vol. 41, 01.11.2004, p. 367-372.

TY - JOUR

T1 - Designing and developing an automatic interactive keyphrase extraction system with Unified Modeling Language (UML)

AU - Song, Min

AU - Song, Il Yeol

AU - Hu, Xiaohua

PY - 2004/11/1

Y1 - 2004/11/1

N2 - Designing and developing a system that helps users digest and understand available information has been a difficult challenge. In this paper, we discuss the design and development of an automatic interactive keyphrase extraction system, called KPSpotter, which can process data in various formats, such as XML, HTML, and plain text, over the Internet. KPSpotter combines the Information Gain data mining measure with several Natural Language Processing (NLP) techniques, such as the Part of Speech (POS) technique and First Occurrence of Term. To improve extraction accuracy, WordNet is incorporated into KPSpotter. In designing and developing KPSpotter, we used the Unified Modeling Language (UML). UML modeling helps formalize the preliminary analysis model and supports iterative system design and development. We also tested system performance experimentally by comparing keyphrases extracted by KPSpotter with those extracted by KEA, a well-known naïve Bayesian-based keyphrase extraction system. The experiments show that KPSpotter outperforms KEA in most test cases.

UR - http://www.scopus.com/inward/record.url?scp=34247225709&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34247225709&partnerID=8YFLogxK

U2 - 10.1002/meet.1450410143

DO - 10.1002/meet.1450410143

M3 - Article

VL - 41

SP - 367

EP - 372

JO - Proceedings of the ASIST Annual Meeting

JF - Proceedings of the ASIST Annual Meeting

SN - 1550-8390

ER -