A method for obtaining rich data from PubMed using SVM

Junbum Cha, Jeongwoo Kim, Yunku Yeu, Sanghyun Park

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

As text mining advances rapidly in the biomedical field, the importance of text data is increasing. Most text data is obtained through a Medical Subjects Headings (MeSH) term search; in this process, a large amount of valuable data is missed because the data is not indexed yet with MeSH terms. In this paper, we propose a method for obtaining additional text data in addition to that obtained using a conventional MeSH term search. In order to obtain additional data, we used the Support Vector Machine (SVM) as the data mining method for classifying documents to related or unrelated. We evaluated the results using a frequency-based text mining approach measuring the quality of data in study of lung cancer. This was confirmed that the data extracted using our method provided as much valuable information as searching using MeSH terms. Further, we found that the amount of information found was increased by 40% using additional extracted data.

Original languageEnglish
Title of host publication2016 Symposium on Applied Computing, SAC 2016
PublisherAssociation for Computing Machinery
Pages37-39
Number of pages3
ISBN (Electronic)9781450337397
DOIs
Publication statusPublished - 2016 Apr 4
Event31st Annual ACM Symposium on Applied Computing, SAC 2016 - Pisa, Italy
Duration: 2016 Apr 42016 Apr 8

Publication series

NameProceedings of the ACM Symposium on Applied Computing
Volume04-08-April-2016

Other

Other31st Annual ACM Symposium on Applied Computing, SAC 2016
Country/TerritoryItaly
CityPisa
Period16/4/416/4/8

Bibliographical note

Funding Information:
This work was supported by the National Research Foundation of Korea(NRF) grant funded by the Korea government(MSIP) (NRF-2015R1A2A1A05001845).

Publisher Copyright:
© 2016 ACM.

All Science Journal Classification (ASJC) codes

  • Software

Fingerprint

Dive into the research topics of 'A method for obtaining rich data from PubMed using SVM'. Together they form a unique fingerprint.

Cite this