Background: The Swanson's ABC model is powerful to infer hidden relationships buried in biological literature. However, the model is inadequate to infer relations with context information. In addition, the model generates a very large amount of candidates from biological text, and it is a semi-automatic, labor-intensive technique requiring human expert's manual input. To tackle these problems, we incorporate context terms to infer relations between AB interactions and BC interactions. Methods. We propose 3 steps to discover meaningful hidden relationships between drugs and diseases: 1) multi-level (gene, drug, disease, symptom) entity recognition, 2) interaction extraction (drug-gene, gene-disease) from literature, 3) context vector based similarity score calculation. Subsequently, we evaluate our hypothesis with the datasets of the "Alzheimer's disease" related 77,711 PubMed abstracts. As golden standards, PharmGKB and CTD databases are used. Evaluation is conducted in 2 ways: first, comparing precision of the proposed method and the previous method and second, analysing top 10 ranked results to examine whether highly ranked interactions are truly meaningful or not. Results: The results indicate that context-based relation inference achieved better precision than the previous ABC model approach. The literature analysis also shows that interactions inferred by the context-based approach are more meaningful than interactions by the previous ABC model. Conclusions: We propose a novel interaction inference technique that incorporates context term vectors into the ABC model to discover meaningful hidden relationships. By utilizing multi-level context terms, our model shows better performance than the previous ABC model.
Bibliographical noteFunding Information:
This work was supported by the National Research Foundation of Korea funded by the Korea Government, Ministry of Education, Science and Technology through the WCU (World Class University) program (R32-2008-000-10218-0). It was also supported by the KISTI (Korea Institute of Science and Technology Information). DL was also supported by LG Yonam Foundation. This article has been published as part of BMC Medical Informatics and Decision Making Volume 12 Supplement 1, 2012: Proceedings of the ACM Fifth International Workshop on Data and Text Mining in Biomedical Informatics (DTMBio 2011). The full contents of the supplement are available online at http://www.biomedcentral.com/bmcmedinformdecismak/ supplements/12/S1.
All Science Journal Classification (ASJC) codes
- Health Policy
- Health Informatics