Text mining is widely used to infer relationships between biological entities. Most text-mining algorithms use a co-occurrence-based approach, in which two entities of interest are taken to be related if they appear in the same sentence. Using such approaches, current studies have extracted relationships between biological entities, such as disease-gene relationships. However, these approaches cannot provide specific information about the inferred relationships, such as the role of the gene in the disease. To overcome this limitation, we propose a novel approach for inferring disease-gene relationships that provides specific knowledge about the inferred relationships. To implement this method, we first built terms based on text analysis to extract opinion sentences that include disease-gene relationships. We then extracted these opinion sentences and inferred disease-gene relationships by using the disease-related and gene-related terms in the opinion sentences. Using these extracted relationships and terms, we inferred disease-related genes and constructed a disease-specific gene network. To validate our approach, we investigated the top k (k = 20) inferred genes for prostate cancer and analyzed the constructed gene network using three network analysis measures. Our approach found more disease-gene relationships than a comparable method and inferred describable disease-gene relationships.
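The sentence-level co-occurrence baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' method or code: the function name `cooccurrence_counts`, the naive regular-expression sentence splitter, the dictionary-style matching, and the placeholder names `GENEA`, `GENEB`, and `GENEC` are all assumptions made for illustration, and the example text is invented.

```python
import re
from collections import Counter
from itertools import product


def cooccurrence_counts(text, diseases, genes):
    """Count how often each (disease, gene) pair shares a sentence.

    Hypothetical helper: matching is case-insensitive, disease names are
    matched as substrings (they may span several words), and gene symbols
    are matched as whole tokens.
    """
    counts = Counter()
    # Naive sentence split after ., ! or ? followed by whitespace (assumption).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sentence in sentences:
        lowered = sentence.lower()
        tokens = set(re.findall(r"[a-z0-9\-]+", lowered))
        found_diseases = [d for d in diseases if d.lower() in lowered]
        found_genes = [g for g in genes if g.lower() in tokens]
        # Every disease-gene pair in the same sentence counts once.
        for disease, gene in product(found_diseases, found_genes):
            counts[(disease, gene)] += 1
    return counts


# Invented toy text with placeholder gene symbols.
toy_text = (
    "GENEA is discussed together with prostate cancer. "
    "Prostate cancer samples also mention GENEB. "
    "GENEA and GENEB appear here without any disease. "
    "GENEC is mentioned on its own."
)
result = cooccurrence_counts(
    toy_text, ["prostate cancer"], ["GENEA", "GENEB", "GENEC"]
)
```

A count of this kind only records that a pair was mentioned together; it carries no information about the role of the gene in the disease, which is the limitation the abstract sets out to address.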
|Title of host publication||2016 Symposium on Applied Computing, SAC 2016|
|Publisher||Association for Computing Machinery|
|Number of pages||8|
|Publication status||Published - 2016 Apr 4|
|Event||31st Annual ACM Symposium on Applied Computing, SAC 2016 - Pisa, Italy (Duration: 2016 Apr 4 → 2016 Apr 8)|
|Name||Proceedings of the ACM Symposium on Applied Computing|
|Other||31st Annual ACM Symposium on Applied Computing, SAC 2016|
|Period||16/4/4 → 16/4/8|
Bibliographical note
Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (NRF-2015R1A2A1A05001845).