A method for obtaining rich data from PubMed using SVM

Junbum Cha, Jeongwoo Kim, Yunku Yeu, Sang Hyun Park

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

As text mining advances rapidly in the biomedical field, the importance of text data is increasing. Most text data is obtained through a Medical Subject Headings (MeSH) term search; in this process, a large amount of valuable data is missed because it has not yet been indexed with MeSH terms. In this paper, we propose a method for obtaining text data beyond that returned by a conventional MeSH term search. To obtain the additional data, we used a Support Vector Machine (SVM) to classify documents as related or unrelated. We evaluated the results in a study of lung cancer, using a frequency-based text mining approach to measure the quality of the data. We confirmed that the data extracted using our method provided as much valuable information as the data found by searching with MeSH terms. Further, we found that the amount of information found increased by 40% when the additionally extracted data was used.
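
The classification step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the features, kernel, or training data the authors used, so everything below is an assumption made for illustration: bag-of-words counts as features, a linear SVM trained by plain subgradient descent on the regularized hinge loss, and four made-up example sentences standing in for PubMed abstracts. The function names (featurize, train_linear_svm, predict) and the hyperparameter values are hypothetical and are not taken from the paper.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words term counts; lowercases and splits on non-letters."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return Counter(cleaned.split())


def score(w, b, x):
    """Linear decision value w.x + b for a sparse feature dict x."""
    return sum(w.get(term, 0.0) * count for term, count in x.items()) + b


def train_linear_svm(texts, labels, epochs=20, eta=0.1, lam=0.01):
    """Subgradient descent on the L2-regularized hinge loss.

    labels are +1 (related) or -1 (unrelated). eta and lam are
    illustrative values, not tuned and not from the paper.
    """
    w, b = {}, 0.0
    examples = [(featurize(t), y) for t, y in zip(texts, labels)]
    for _ in range(epochs):
        for x, y in examples:
            margin = y * score(w, b, x)
            # Regularization step: shrink every weight slightly.
            for term in w:
                w[term] *= 1.0 - eta * lam
            # Hinge-loss step: only examples inside the margin move w and b.
            if margin < 1.0:
                for term, count in x.items():
                    w[term] = w.get(term, 0.0) + eta * y * count
                b += eta * y
    return w, b


def predict(w, b, text):
    """Return +1 (related) or -1 (unrelated) for a new document."""
    return 1 if score(w, b, featurize(text)) >= 0.0 else -1


# Made-up stand-ins for MeSH-indexed (labeled) abstracts.
texts = [
    "lung cancer tumor growth in smokers",
    "protein folding dynamics in yeast",
    "lung carcinoma tumor metastasis study",
    "yeast gene expression protein analysis",
]
labels = [1, -1, 1, -1]

w, b = train_linear_svm(texts, labels)
print(predict(w, b, "tumor metastasis in lung tissue"))
print(predict(w, b, "yeast protein expression"))
```

In the setting the abstract describes, the labeled examples would presumably come from documents already indexed with MeSH terms, and the trained classifier would then be applied to documents not yet indexed; that workflow is inferred from the abstract rather than confirmed by it.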

Original language: English
Title of host publication: 2016 Symposium on Applied Computing, SAC 2016
Publisher: Association for Computing Machinery
Pages: 37-39
Number of pages: 3
ISBN (Electronic): 9781450337397
DOIs: https://doi.org/10.1145/2851613.2851866
Publication status: Published - 2016 Apr 4
Event: 31st Annual ACM Symposium on Applied Computing, SAC 2016 - Pisa, Italy
Duration: 2016 Apr 4 to 2016 Apr 8

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing
Volume: 04-08-April-2016

Other

Other: 31st Annual ACM Symposium on Applied Computing, SAC 2016
Country: Italy
City: Pisa
Period: 16/4/4 to 16/4/8

Fingerprint

Support vector machines
Data mining

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Cha, J., Kim, J., Yeu, Y., & Park, S. H. (2016). A method for obtaining rich data from PubMed using SVM. In 2016 Symposium on Applied Computing, SAC 2016 (pp. 37-39). (Proceedings of the ACM Symposium on Applied Computing; Vol. 04-08-April-2016). Association for Computing Machinery. https://doi.org/10.1145/2851613.2851866
Cha, Junbum ; Kim, Jeongwoo ; Yeu, Yunku ; Park, Sang Hyun. / A method for obtaining rich data from PubMed using SVM. 2016 Symposium on Applied Computing, SAC 2016. Association for Computing Machinery, 2016. pp. 37-39 (Proceedings of the ACM Symposium on Applied Computing).
@inproceedings{899141ae231c42c98980a837fa17dc10,
title = "A method for obtaining rich data from {PubMed} using {SVM}",
abstract = "As text mining advances rapidly in the biomedical field, the importance of text data is increasing. Most text data is obtained through a Medical Subject Headings (MeSH) term search; in this process, a large amount of valuable data is missed because it has not yet been indexed with MeSH terms. In this paper, we propose a method for obtaining text data beyond that returned by a conventional MeSH term search. To obtain the additional data, we used a Support Vector Machine (SVM) to classify documents as related or unrelated. We evaluated the results in a study of lung cancer, using a frequency-based text mining approach to measure the quality of the data. We confirmed that the data extracted using our method provided as much valuable information as the data found by searching with MeSH terms. Further, we found that the amount of information found increased by 40{\%} when the additionally extracted data was used.",
author = "Cha, Junbum and Kim, Jeongwoo and Yeu, Yunku and Park, {Sang Hyun}",
year = "2016",
month = "4",
day = "4",
doi = "10.1145/2851613.2851866",
language = "English",
series = "Proceedings of the ACM Symposium on Applied Computing",
publisher = "Association for Computing Machinery",
pages = "37--39",
booktitle = "2016 Symposium on Applied Computing, SAC 2016",

}

Cha, J, Kim, J, Yeu, Y & Park, SH 2016, A method for obtaining rich data from PubMed using SVM. in 2016 Symposium on Applied Computing, SAC 2016. Proceedings of the ACM Symposium on Applied Computing, vol. 04-08-April-2016, Association for Computing Machinery, pp. 37-39, 31st Annual ACM Symposium on Applied Computing, SAC 2016, Pisa, Italy, 16/4/4. https://doi.org/10.1145/2851613.2851866

A method for obtaining rich data from PubMed using SVM. / Cha, Junbum; Kim, Jeongwoo; Yeu, Yunku; Park, Sang Hyun.

2016 Symposium on Applied Computing, SAC 2016. Association for Computing Machinery, 2016. p. 37-39 (Proceedings of the ACM Symposium on Applied Computing; Vol. 04-08-April-2016).

TY  - GEN
T1  - A method for obtaining rich data from PubMed using SVM
AU  - Cha, Junbum
AU  - Kim, Jeongwoo
AU  - Yeu, Yunku
AU  - Park, Sang Hyun
PY  - 2016/4/4
Y1  - 2016/4/4
AB  - As text mining advances rapidly in the biomedical field, the importance of text data is increasing. Most text data is obtained through a Medical Subject Headings (MeSH) term search; in this process, a large amount of valuable data is missed because it has not yet been indexed with MeSH terms. In this paper, we propose a method for obtaining text data beyond that returned by a conventional MeSH term search. To obtain the additional data, we used a Support Vector Machine (SVM) to classify documents as related or unrelated. We evaluated the results in a study of lung cancer, using a frequency-based text mining approach to measure the quality of the data. We confirmed that the data extracted using our method provided as much valuable information as the data found by searching with MeSH terms. Further, we found that the amount of information found increased by 40% when the additionally extracted data was used.
UR  - http://www.scopus.com/inward/record.url?scp=84975789761&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84975789761&partnerID=8YFLogxK
U2  - 10.1145/2851613.2851866
DO  - 10.1145/2851613.2851866
M3  - Conference contribution
T3  - Proceedings of the ACM Symposium on Applied Computing
SP  - 37
EP  - 39
BT  - 2016 Symposium on Applied Computing, SAC 2016
PB  - Association for Computing Machinery
ER  -

Cha J, Kim J, Yeu Y, Park SH. A method for obtaining rich data from PubMed using SVM. In 2016 Symposium on Applied Computing, SAC 2016. Association for Computing Machinery. 2016. p. 37-39. (Proceedings of the ACM Symposium on Applied Computing). https://doi.org/10.1145/2851613.2851866