Extraction of key-phrases from biomedical full-text with supervised learning techniques

Yanliang Qi, Artun I. Yagci, Min Song

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Key-phrase extraction plays a useful role in Information Systems (IS) research areas such as digital libraries. Short metadata such as key phrases can help searchers understand the concepts covered by a document. This paper evaluates the effectiveness of different supervised learning techniques on biomedical full-text: Naïve Bayes, linear regression, and SVMs (reg1/2), all of which could be embedded inside an IS for document search. We use these techniques to extract key phrases from PubMed. We evaluate the performance of these systems using the well-established holdout validation method. The contributions of the paper are a comparison among different classifier techniques and a comparison of performance differences between full-text and abstract. We conducted experiments and found that SVMreg-1 improves the performance of key-phrase extraction from full-text, while Naïve Bayes improves it from abstracts. These techniques should be considered for use in information system search functionality. Additional research issues are also identified.
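
As an illustration of the holdout validation mentioned in the abstract, the following is a minimal sketch only. It assumes a single random train/test split and scores extracted phrases against reference phrases with precision, recall, and F1. The abstract does not state the split ratio or the metrics the authors used, so `test_fraction`, the document ids, and the example phrases are all hypothetical.

```python
import random


def holdout_split(items, test_fraction=0.3, seed=0):
    """Shuffle once, then reserve the last `test_fraction` of items for testing.

    The 30% default is an assumption for illustration, not a value from the paper.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[:-n_test], shuffled[-n_test:]


def precision_recall_f1(predicted, gold):
    """Score one document's extracted phrases against its reference phrases."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical document ids; a real run would use PubMed article identifiers.
doc_ids = [f"doc{i}" for i in range(10)]
train_ids, test_ids = holdout_split(doc_ids)

# Hypothetical extractor output and reference key phrases for one held-out document.
p, r, f = precision_recall_f1(
    predicted=["gene expression", "protein folding", "cell cycle"],
    gold=["gene expression", "cell cycle", "apoptosis", "dna repair"],
)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

A classifier such as Naïve Bayes or an SVM would be trained on candidate phrases from `train_ids` only and then scored on `test_ids`, so that the reported figures reflect documents the model has not seen.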

Original language: English
Title of host publication: 15th Americas Conference on Information Systems 2009, AMCIS 2009
Pages: 2992-3000
Number of pages: 9
Publication status: Published - 2009 Dec 1
Event: 15th Americas Conference on Information Systems 2009, AMCIS 2009 - San Francisco, CA, United States
Duration: 2009 Aug 6 to 2009 Aug 9

Publication series

Name: 15th Americas Conference on Information Systems 2009, AMCIS 2009
Volume: 5

Other

Other: 15th Americas Conference on Information Systems 2009, AMCIS 2009
Country: United States
City: San Francisco, CA
Period: 09/8/6 to 09/8/9

Fingerprint

Supervised learning
information system
Information systems
learning
performance
Digital libraries
Metadata
Linear regression
functionality
Classifiers
regression
experiment
Experiments

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Computer Networks and Communications
  • Information Systems
  • Library and Information Sciences

Cite this

Qi, Y., Yagci, A. I., & Song, M. (2009). Extraction of key-phrases from biomedical full-text with supervised learning techniques. In 15th Americas Conference on Information Systems 2009, AMCIS 2009 (pp. 2992-3000). (15th Americas Conference on Information Systems 2009, AMCIS 2009; Vol. 5).
@inproceedings{b194ba6a30f64fc6b98d7b5d5a00b0d1,
title = "Extraction of key-phrases from biomedical full-text with supervised learning techniques",
abstract = "Key-phrase extraction plays a useful role in Information Systems (IS) research areas such as digital libraries. Short metadata such as key phrases can help searchers understand the concepts covered by a document. This paper evaluates the effectiveness of different supervised learning techniques on biomedical full-text: Na{\"i}ve Bayes, linear regression, and SVMs (reg1/2), all of which could be embedded inside an IS for document search. We use these techniques to extract key phrases from PubMed. We evaluate the performance of these systems using the well-established holdout validation method. The contributions of the paper are a comparison among different classifier techniques and a comparison of performance differences between full-text and abstract. We conducted experiments and found that SVMreg-1 improves the performance of key-phrase extraction from full-text, while Na{\"i}ve Bayes improves it from abstracts. These techniques should be considered for use in information system search functionality. Additional research issues are also identified.",
author = "Yanliang Qi and Yagci, {Artun I.} and Min Song",
year = "2009",
month = "12",
day = "1",
language = "English",
isbn = "9781615675814",
series = "15th Americas Conference on Information Systems 2009, AMCIS 2009",
pages = "2992--3000",
booktitle = "15th Americas Conference on Information Systems 2009, AMCIS 2009",

}

TY - GEN

T1 - Extraction of key-phrases from biomedical full-text with supervised learning techniques

AU - Qi, Yanliang

AU - Yagci, Artun I.

AU - Song, Min

PY - 2009/12/1

Y1 - 2009/12/1

AB - Key-phrase extraction plays a useful role in Information Systems (IS) research areas such as digital libraries. Short metadata such as key phrases can help searchers understand the concepts covered by a document. This paper evaluates the effectiveness of different supervised learning techniques on biomedical full-text: Naïve Bayes, linear regression, and SVMs (reg1/2), all of which could be embedded inside an IS for document search. We use these techniques to extract key phrases from PubMed. We evaluate the performance of these systems using the well-established holdout validation method. The contributions of the paper are a comparison among different classifier techniques and a comparison of performance differences between full-text and abstract. We conducted experiments and found that SVMreg-1 improves the performance of key-phrase extraction from full-text, while Naïve Bayes improves it from abstracts. These techniques should be considered for use in information system search functionality. Additional research issues are also identified.

UR - http://www.scopus.com/inward/record.url?scp=79954623508&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79954623508&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:79954623508

SN - 9781615675814

T3 - 15th Americas Conference on Information Systems 2009, AMCIS 2009

SP - 2992

EP - 3000

BT - 15th Americas Conference on Information Systems 2009, AMCIS 2009

ER -
