A challenge facing nearly all biologists is to identify the complete set of genes that are important for a process or disease. This applies not only to scientists investigating fundamental pathways in model organisms, but also to clinicians trying to understand human disease. Many types of experimental data can be used to predict the genes that are important for a process, but these data are normally dispersed across numerous publications and databases, and are of varying and unknown quality. Integrated functional gene networks aim to gather the functional information from all of these data into a single intuitive graph model that can be used to predict gene functions. In this approach, the ability of each data set to predict functional associations between genes is first measured against a standard benchmark, and the scored predictions from each data set are then combined. The resulting integrated probabilistic gene network can be used by all researchers to predict gene function, with much greater coverage and accuracy than any individual data set. In this review, we discuss how such integrated gene networks are constructed, how their predictive power for gene function can be tested, and how experimental biologists can use these networks to guide their research. We pay particular attention to networks constructed for Caenorhabditis elegans because, in this complex multicellular model system, functional predictions for genes can be rapidly tested in vivo using RNAi. The approach is, however, applicable to any system, and might soon be a common method for dissecting the genetics of complex human diseases.
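The benchmark-then-combine step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular published network: it assumes a log-likelihood score as the benchmark measure, a naive sum of scores across data sets (which treats the data sets as independent), and a pseudocount of 1. The function names, gene labels, and toy data are all hypothetical.

```python
import math


def pair(gene_a, gene_b):
    """Represent an undirected gene-gene link as an order-free key."""
    return frozenset((gene_a, gene_b))


def log_likelihood_score(predicted, positives, negatives):
    """Score one data set against a benchmark (assumed scoring scheme).

    Compares the odds that a predicted link is a known functional
    association with the odds expected by chance in the benchmark.
    """
    hits = len(predicted & positives)
    misses = len(predicted & negatives)
    # Pseudocount of 1 (an assumption) avoids division by zero.
    observed_odds = (hits + 1) / (misses + 1)
    prior_odds = len(positives) / len(negatives)
    return math.log(observed_odds / prior_odds)


def integrate(datasets, positives, negatives):
    """Combine data sets into one weighted network.

    Each link receives the sum of the scores of the data sets that
    support it; data sets scoring at or below zero are skipped.
    """
    network = {}
    for predicted in datasets.values():
        score = log_likelihood_score(predicted, positives, negatives)
        if score <= 0:
            continue
        for link in predicted:
            network[link] = network.get(link, 0.0) + score
    return network


# Hypothetical benchmark: links known to share, or not share, a function.
positives = {pair("a", "b"), pair("c", "d")}
negatives = {pair("a", "c"), pair("b", "d")}

# Hypothetical data sets, each predicting a set of links.
datasets = {
    "coexpression": {pair("a", "b"), pair("c", "d"), pair("a", "c")},
    "interaction": {pair("a", "b"), pair("a", "d")},
}

network = integrate(datasets, positives, negatives)
for link, score in sorted(network.items(), key=lambda item: -item[1]):
    print(sorted(link), round(score, 3))
```

In this toy example the link between genes a and b is supported by both data sets and therefore receives the highest combined score, which is the sense in which an integrated network can outperform any single data set.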
All Science Journal Classification (ASJC) codes
- Molecular Biology