TY - GEN
T1 - SSL
T2 - 2017 IEEE International Conference on Big Data and Smart Computing, BigComp 2017
AU - Kim, Jeongwoo
AU - Choi, Won Gi
AU - Kim, Jungrim
AU - Park, Sanghyun
N1 - Publisher Copyright: © 2017 IEEE. Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2017/3/17
Y1 - 2017/3/17
AB - Text mining is widely applied in biology to infer relationships between biological entities. Disease-gene relationships are important for discovering the causes of disease. We therefore propose a method called SSL, which infers disease-related genes using sentence structure and literature data. By using sentence structure, the proposed method reduces the number of candidate disease-related genes and infers more meaningful disease-related genes than comparable methods. Furthermore, our method extracts sentences that carry information on the relationship between specific diseases and genes; analyzing the structure of these sentences yields useful knowledge about disease-gene relationships. We applied our method to five diseases: Alzheimer's disease, prostate cancer, gastric cancer, colorectal cancer, and lung cancer. For validation, we investigated the top 10 inferred genes for the five diseases. Our method demonstrated up to 50% higher precision than existing methods and showed 98% accuracy in inferring disease-related genes.
UR - http://www.scopus.com/inward/record.url?scp=85017586725&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85017586725&partnerID=8YFLogxK
DO - 10.1109/BIGCOMP.2017.7881723
M3 - Conference contribution
AN - SCOPUS:85017586725
T3 - 2017 IEEE International Conference on Big Data and Smart Computing, BigComp 2017
SP - 100
EP - 107
BT - 2017 IEEE International Conference on Big Data and Smart Computing, BigComp 2017
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 13 February 2017 through 16 February 2017
ER -