As text mining advances rapidly in the biomedical field, text data is becoming increasingly important. Most text data is obtained through a Medical Subject Headings (MeSH) term search; in this process, a large amount of valuable data is missed because it has not yet been indexed with MeSH terms. In this paper, we propose a method for obtaining text data beyond that returned by a conventional MeSH term search. To obtain the additional data, we used a Support Vector Machine (SVM) to classify documents as related or unrelated. We evaluated the results with a frequency-based text mining approach that measures the quality of the data in a study of lung cancer. We confirmed that the data extracted using our method provided as much valuable information as the data obtained by searching with MeSH terms. Further, we found that the amount of information found increased by 40% when the additionally extracted data was used.
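The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words features, the bias-free linear SVM trained by subgradient descent on the hinge loss, the hyperparameters, and the six toy training documents are all assumptions made for illustration, since the abstract does not specify the features, kernel, or training data.

```python
import re
from collections import Counter


def featurize(text):
    """Bag-of-words term counts (an assumed stand-in for the paper's features)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def train_linear_svm(docs, labels, epochs=20, eta=0.1, lam=0.01):
    """Linear SVM without a bias term, trained by subgradient descent on the
    L2-regularized hinge loss. Labels are +1 (related) or -1 (unrelated)."""
    w = Counter()  # sparse weight vector: term -> weight
    for _ in range(epochs):
        for text, y in zip(docs, labels):
            x = featurize(text)
            margin = y * sum(w[t] * c for t, c in x.items())
            for t in w:
                w[t] *= 1.0 - eta * lam  # shrink step from the L2 penalty
            if margin < 1.0:  # hinge loss is active: move toward the example
                for t, c in x.items():
                    w[t] += eta * y * c
    return w


def score(w, text):
    """Signed decision value; terms never seen in training contribute zero."""
    return sum(w[t] * c for t, c in featurize(text).items())


def is_related(w, text):
    return score(w, text) > 0


# Hypothetical toy corpus: +1 = related to lung cancer, -1 = unrelated.
train_docs = [
    "lung cancer tumor chemotherapy",
    "carcinoma tumor biopsy lung",
    "lung adenocarcinoma mutation",
    "soil bacteria nitrogen crops",
    "crops irrigation yield soil",
    "bacteria fermentation yeast",
]
train_labels = [1, 1, 1, -1, -1, -1]

weights = train_linear_svm(train_docs, train_labels)

# Documents without MeSH terms would be screened like this.
for doc in ["lung tumor resection", "soil crops drought"]:
    print(doc, "->", "related" if is_related(weights, doc) else "unrelated")
```

In a setting like the one the abstract describes, the training labels would presumably come from documents already indexed with MeSH terms, and the trained classifier would then be applied to documents that are not yet indexed.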
|Title of host publication||2016 Symposium on Applied Computing, SAC 2016|
|Publisher||Association for Computing Machinery|
|Number of pages||3|
|Publication status||Published - 2016 Apr 4|
|Event||31st Annual ACM Symposium on Applied Computing, SAC 2016 - Pisa, Italy|
|Duration||2016 Apr 4 → 2016 Apr 8|
|Name||Proceedings of the ACM Symposium on Applied Computing|
Bibliographical note
Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (NRF-2015R1A2A1A05001845).