A literature-driven method to calculate similarities among diseases

Hyunjin Kim, Youngmi Yoon, Jaegyoon Ahn, Sang Hyun Park

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Background: "Our lives are connected by a thousand invisible threads and along these sympathetic fibers, our actions run as causes and return to us as results". This is Herman Melville's famous quote describing the connections among human lives. To paraphrase Melville's quote, diseases are connected by many functional threads, and along these fibers diseases run as causes and return as results. The quote captures the reason for studying disease-disease similarity and disease networks. Measuring similarities between diseases and constructing a disease network can play an important role in research on disease function and in disease treatment. To estimate disease-disease similarities, we proposed a novel literature-based method. Methods and results: The proposed method extracted disease-gene relations and disease-drug relations from the literature and used the frequencies of occurrence of these relations as features to calculate similarities among diseases. We also constructed a disease network from the top-ranking disease pairs produced by our method. The proposed method discovered a larger number of answer disease pairs than other comparable methods and showed the lowest p-value. Conclusions: We presume that our method showed good results because it used literature data, used all possible gene symbols and drug names as features of a disease, and determined the feature values of diseases from the frequencies of co-occurrence of two entities. The disease-disease similarities from the proposed method can be used in computational biology research that relies on similarities among diseases.
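The feature construction described in the abstract can be illustrated with a minimal sketch. It assumes a toy corpus in which each document has already been reduced to the diseases and the gene or drug names it mentions; the entity names, the `documents` list, and the helper functions are hypothetical. The abstract does not state which similarity measure was used, so cosine similarity is an assumption made here for illustration only, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy corpus: one entry per document, listing the diseases and
# the gene/drug names mentioned together in it. All names are placeholders.
documents = [
    {"diseases": {"disease_A"}, "features": {"GENE_1", "DRUG_1"}},
    {"diseases": {"disease_A"}, "features": {"GENE_1", "GENE_2"}},
    {"diseases": {"disease_B"}, "features": {"GENE_2", "DRUG_1"}},
    {"diseases": {"disease_B"}, "features": {"DRUG_1"}},
    {"diseases": {"disease_C"}, "features": {"GENE_3", "DRUG_2"}},
]


def build_profiles(docs):
    """Count, for each disease, how often each gene/drug co-occurs with it."""
    profiles = {}
    for doc in docs:
        for disease in doc["diseases"]:
            counts = profiles.setdefault(disease, Counter())
            counts.update(doc["features"])  # +1 per co-occurring feature
    return profiles


def cosine(a, b):
    """Cosine similarity of two sparse frequency vectors (an assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_pairs(profiles):
    """Score every disease pair and return (score, disease, disease), best first."""
    scored = [
        (cosine(profiles[d1], profiles[d2]), d1, d2)
        for d1, d2 in combinations(sorted(profiles), 2)
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, d1, d2 in rank_pairs(build_profiles(documents)):
        print(f"{d1} - {d2}: {score:.3f}")
```

Keeping only the highest-scoring pairs from `rank_pairs` as edges would give a small disease network in the spirit of the one the abstract mentions; the cutoff for "top-ranking" is not given in the abstract and would have to be chosen separately.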

Original language: English
Pages (from-to): 108-122
Number of pages: 15
Journal: Computer Methods and Programs in Biomedicine
Volume: 122
Issue number: 2
DOIs: 10.1016/j.cmpb.2015.07.001
Publication status: Published - 2015 Nov 1

Fingerprint

Adrenergic Fibers
Genes
Fibers
Computational Biology
Research
Pharmaceutical Preparations
Names

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Health Informatics

Cite this

Kim, Hyunjin ; Yoon, Youngmi ; Ahn, Jaegyoon ; Park, Sang Hyun. / A literature-driven method to calculate similarities among diseases. In: Computer Methods and Programs in Biomedicine. 2015 ; Vol. 122, No. 2. pp. 108-122.
@article{07e1c65c037340028142f29098edd5fb,
title = "A literature-driven method to calculate similarities among diseases",
author = "Hyunjin Kim and Youngmi Yoon and Jaegyoon Ahn and Park, {Sang Hyun}",
year = "2015",
month = "11",
day = "1",
doi = "10.1016/j.cmpb.2015.07.001",
language = "English",
volume = "122",
pages = "108--122",
journal = "Computer Methods and Programs in Biomedicine",
issn = "0169-2607",
publisher = "Elsevier Ireland Ltd",
number = "2",

}

A literature-driven method to calculate similarities among diseases. / Kim, Hyunjin; Yoon, Youngmi; Ahn, Jaegyoon; Park, Sang Hyun.

In: Computer Methods and Programs in Biomedicine, Vol. 122, No. 2, 01.11.2015, p. 108-122.


TY - JOUR

T1 - A literature-driven method to calculate similarities among diseases

AU - Kim, Hyunjin

AU - Yoon, Youngmi

AU - Ahn, Jaegyoon

AU - Park, Sang Hyun

PY - 2015/11/1

Y1 - 2015/11/1

UR - http://www.scopus.com/inward/record.url?scp=84944280481&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84944280481&partnerID=8YFLogxK

U2 - 10.1016/j.cmpb.2015.07.001

DO - 10.1016/j.cmpb.2015.07.001

M3 - Article

C2 - 26212477

AN - SCOPUS:84944280481

VL - 122

SP - 108

EP - 122

JO - Computer Methods and Programs in Biomedicine

JF - Computer Methods and Programs in Biomedicine

SN - 0169-2607

IS - 2

ER -