Abstract
Medical Subject Headings (MeSH) term search is a typical data-gathering method in biomedical text mining. However, it has two problems: delay in MeSH term allocation and missed valuable literature sources. Because MeSH terms are allocated by humans, the allocation process is delayed. In addition, even after a literature source has been allocated MeSH terms, valuable sources can still be missed during data gathering: some sources are not indexed under the MeSH term of a keyword even though they contain valuable information related to that term, so a MeSH term search does not retrieve them. To resolve these problems, we propose a novel method for gathering rich data using a one-class support vector machine (SVM) and a relevance rule. Term frequency-inverse document frequency (TF-IDF) and paragraph vectors are examined as text vectorization methods with various parameters and relevance factors. We apply our method to lung cancer, prostate cancer, breast cancer, and Alzheimer's disease. As a result, up to 26% of keyword data and 35% of target data are gathered with high quality (a C-score of at least 0.948).
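As a rough illustration of the general idea of TF-IDF vectorization followed by a one-class SVM, the sketch below may help. It is not the authors' implementation: it assumes scikit-learn, uses made-up toy texts, and picks the kernel and `nu` arbitrarily. The relevance rule and the paragraph-vector variant are omitted because the abstract does not specify them.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import OneClassSVM

# Hypothetical toy texts standing in for literature already indexed
# under a MeSH term (the only class seen during training).
indexed_docs = [
    "lung cancer tumor growth in smokers",
    "lung cancer chemotherapy response and tumor size",
    "tumor markers for early lung cancer detection",
    "lung cancer patients treated with radiation therapy",
]

# Hypothetical unindexed candidates to be screened.
candidates = [
    "novel tumor marker predicts lung cancer therapy response",
    "stock market volatility and interest rates",
]

# Fit TF-IDF on the indexed texts only, then reuse the vocabulary.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(indexed_docs)
X_cand = vectorizer.transform(candidates)

# One-class SVM learns a boundary around the indexed texts.
# kernel and nu are illustrative assumptions, not values from the paper.
model = OneClassSVM(kernel="linear", nu=0.1)
model.fit(X_train)

labels = model.predict(X_cand)            # +1 = inlier, -1 = outlier
scores = model.decision_function(X_cand)  # higher = closer to the indexed class

for text, label, score in zip(candidates, labels, scores):
    print(f"{label:+d}  {score:.3f}  {text}")
```

In this sketch, candidates that share vocabulary with the indexed texts receive higher decision scores than unrelated ones; how such scores would be combined with a relevance rule is left open here.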
Original language | English |
---|---|
Title of host publication | 2016 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2016 - Conference Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 4325-4331 |
Number of pages | 7 |
ISBN (Electronic) | 9781509018970 |
Publication status | Published - 2017 Feb 6 |
Event | 2016 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2016 - Budapest, Hungary; Duration: 2016 Oct 9 → 2016 Oct 12 |
Publication series
Name | 2016 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2016 - Conference Proceedings |
---|
Other
Other | 2016 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2016 |
---|---|
Country/Territory | Hungary |
City | Budapest |
Period | 2016 Oct 9 → 2016 Oct 12 |
Bibliographical note
Publisher Copyright: © 2016 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition
- Artificial Intelligence
- Control and Optimization
- Human-Computer Interaction