Designing and developing an automatic interactive keyphrase extraction system with Unified Modeling Language (UML)

Min Song, Il Yeol Song, Xiaohua Hu

Research output: Contribution to journal › Article

2 Citations (Scopus)


Designing and developing a system that assists users in digesting and understanding the available information has been a difficult challenge. In this paper, we discuss the design and development of an automatic interactive keyphrase extraction system, called KPSpotter, which is capable of processing data in various formats, such as XML, HTML, and plain text, over the Internet. KPSpotter combines the Information Gain data mining measure with several Natural Language Processing (NLP) techniques, such as the Part of Speech (POS) technique and First Occurrence of Term. To improve extraction accuracy, WordNet is incorporated into KPSpotter. In designing and developing KPSpotter, we utilized Unified Modeling Language (UML). UML modeling helps formalize the preliminary analysis model and supports iterative system design and development. We also tested system performance experimentally by comparing keyphrases extracted by KPSpotter with those extracted by KEA, a well-known naïve Bayesian-based keyphrase extraction system. The experiments show that KPSpotter outperforms KEA in most test cases.
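
The abstract names two of the signals KPSpotter combines, Information Gain and First Occurrence of Term, without giving formulas. The sketch below illustrates the textbook versions of both, purely as an assumption about what such features might look like: the function names (first_occurrence, entropy, information_gain), the whitespace tokenization, and the normalization by document length are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def first_occurrence(phrase, text):
    """Relative position (0..1) of the first appearance of phrase in text.

    Assumption: whitespace tokenization and normalization by word count.
    Returns 1.0 when the phrase does not occur.
    """
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            return i / len(words)
    return 1.0


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    gain = entropy(labels)
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain
```

Under these assumptions, a candidate phrase that appears early in a document gets a small first-occurrence value, and a feature that perfectly separates keyphrases from non-keyphrases in a labeled sample has an information gain equal to the label entropy.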

Original language: English
Pages (from-to): 367-372
Number of pages: 6
Journal: Proceedings of the ASIST Annual Meeting
Publication status: Published - 2004 Nov 1


All Science Journal Classification (ASJC) codes

  • Information Systems
  • Library and Information Sciences
