Prioritizing candidate disease genes by network-based boosting of genome-wide association data

Insuk Lee, U. Martin Blom, Peggy I. Wang, Jung Eun Shim, Edward M. Marcotte

Research output: Contribution to journal (Article)

388 Citations (Scopus)

Abstract

Network "guilt by association" (GBA) is a proven approach for identifying novel disease genes, based on the observation that similar mutational phenotypes arise from functionally related genes. In principle, this approach could account even for nonadditive genetic interactions, which underlie the synergistic combinations of mutations often linked to complex diseases. Here, we analyze a large-scale human gene functional interaction network (dubbed HumanNet). We show that candidate disease genes can be effectively identified by GBA in cross-validated tests using label propagation algorithms related to Google's PageRank. However, GBA has been shown to work poorly in genome-wide association studies (GWAS), where many genes are somewhat implicated but few are known with very high certainty. We resolve this by explicitly modeling the uncertainty of the associations and incorporating that uncertainty for the seed set into the GBA framework. Comparing our predictions to results from follow-up meta-analyses, we observe a significant boost in the power to detect validated candidate genes for Crohn's disease and type 2 diabetes, with incorporation of the network serving to highlight the JAK-STAT pathway and associated adaptors GRB2/SHC1 in Crohn's disease and BACH2 in type 2 diabetes. Consideration of the network during GWAS thus conveys some of the benefits of enrolling more participants in the study. More generally, we demonstrate that a functional network of human genes provides a valuable statistical framework for prioritizing candidate disease genes, for both candidate gene-based and GWAS-based studies.
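
As an illustration of the general idea of PageRank-style label propagation with uncertain seeds, the following is a minimal sketch and not the authors' implementation. It runs a random walk with restart over a small weighted network, where the restart distribution is given by soft seed weights standing in for per-gene GWAS evidence. The function name `propagate_scores`, the restart value, the placeholder gene labels `g0` to `g4`, and all edge and seed weights are assumptions made up for this example; they are not taken from HumanNet or from the paper.

```python
import numpy as np


def propagate_scores(adjacency, seed_weights, restart=0.15, tol=1e-10, max_iter=1000):
    """Random walk with restart on a weighted, undirected network.

    adjacency    : square matrix of non-negative edge weights (hypothetical data)
    seed_weights : non-negative prior evidence per gene; need not be 0/1,
                   so uncertain seeds can carry fractional weight
    restart      : probability of jumping back to the seed distribution
    """
    adjacency = np.asarray(adjacency, dtype=float)
    seeds = np.asarray(seed_weights, dtype=float)
    if seeds.sum() <= 0:
        raise ValueError("at least one seed must have positive weight")
    seeds = seeds / seeds.sum()  # restart distribution sums to 1

    # Column-normalize: column j becomes the step distribution out of node j.
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # avoid division by zero for isolated nodes
    transition = adjacency / col_sums

    scores = seeds.copy()
    for _ in range(max_iter):
        updated = (1 - restart) * transition @ scores + restart * seeds
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores


# Toy network with placeholder genes (all weights are invented for illustration).
genes = ["g0", "g1", "g2", "g3", "g4"]
adjacency = np.zeros((5, 5))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0), (2, 3, 0.5), (3, 4, 1.0)]:
    adjacency[i, j] = adjacency[j, i] = w

# Soft seed weights: g0 is strongly implicated, g1 more weakly, the rest not at all.
seed_weights = [0.9, 0.6, 0.0, 0.0, 0.0]

scores = propagate_scores(adjacency, seed_weights)
for gene, score in sorted(zip(genes, scores), key=lambda pair: -pair[1]):
    print(f"{gene}\t{score:.4f}")
```

In this sketch, genes with no direct evidence still receive a score through their neighbors, so an unseeded gene linked to both seeds ranks above one that is several steps away. Using fractional rather than binary seed weights is one simple way to represent the situation the abstract describes, in which many genes are somewhat implicated but few are known with certainty.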

Original language: English
Pages (from-to): 1109-1121
Number of pages: 13
Journal: Genome Research
Volume: 21
Issue number: 7
DOI: 10.1101/gr.118992.110
Publication status: Published - 2011 Jul 1

Fingerprint

Gene Regulatory Networks
Genome
Guilt
Genome-Wide Association Study
Genes
Crohn Disease
Type 2 Diabetes Mellitus
Uncertainty
Meta-Analysis
Seeds
Phenotype
Mutation

All Science Journal Classification (ASJC) codes

  • Genetics
  • Genetics (clinical)

Cite this

Lee, Insuk; Blom, U. Martin; Wang, Peggy I.; Shim, Jung Eun; Marcotte, Edward M. / Prioritizing candidate disease genes by network-based boosting of genome-wide association data. In: Genome Research. 2011; Vol. 21, No. 7. pp. 1109-1121.
@article{ef885eb0da314fc1a692f9f294c20cf8,
title = "Prioritizing candidate disease genes by network-based boosting of genome-wide association data",
author = "Insuk Lee and Blom, {U. Martin} and Wang, {Peggy I.} and Shim, {Jung Eun} and Marcotte, {Edward M.}",
year = "2011",
month = "7",
day = "1",
doi = "10.1101/gr.118992.110",
language = "English",
volume = "21",
pages = "1109--1121",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "7",

}

TY - JOUR

T1 - Prioritizing candidate disease genes by network-based boosting of genome-wide association data

AU - Lee, Insuk

AU - Blom, U. Martin

AU - Wang, Peggy I.

AU - Shim, Jung Eun

AU - Marcotte, Edward M.

PY - 2011/7/1

Y1 - 2011/7/1

UR - http://www.scopus.com/inward/record.url?scp=79959898376&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959898376&partnerID=8YFLogxK

U2 - 10.1101/gr.118992.110

DO - 10.1101/gr.118992.110

M3 - Article

C2 - 21536720

AN - SCOPUS:79959898376

VL - 21

SP - 1109

EP - 1121

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 7

ER -