Updated: May 18
January 24, 2015. We recently extended our network, cluster, and visualization tools to display thousands of genes in a single plot. To explore the utility of the tools, we generated large-scale network plots of disease-associated genes for five complex human disorders: Rheumatoid Arthritis (173 genes), Schizophrenia (325 genes), Intellectual Disability (351 genes), Autism Spectrum Disorder (338 genes) and Cardiovascular Disease (213 genes). In each case, the mathematically derived networks recapitulated sets of previously reported enriched genes. However, the plots also identified additional gene networks that may point to distinct disease-associated biology. Stay tuned for detailed analysis of selected network clusters in future posts.
The prolific generation of disease/gene association data provides an opportunity to rapidly advance our understanding of complex diseases, subsequently enabling the development of targeted therapeutic interventions.
Of the thousands of studies now completed, we chose five disorders for which multiple studies have been conducted and analyzed, often in aggregate as meta-analyses. Such approaches surface hundreds of genes or loci, where the link to the trait rests on a statistical association (either from a genome-wide association study, GWAS, or from an identified genomic alteration). This approach typically provides little information on the biology of the association, especially in GWAS, where it can be difficult to pinpoint the responsible gene in an identified locus.
Using correlation statistics, a large dataset of diverse gene expression profiles and sequence variants, as well as curated biological networks, we looked for biologically relevant clusters within the reported lists of disease-linked genes.
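To make the idea of correlation-based clustering concrete, here is a minimal sketch in Python. It is not our actual pipeline: it assumes Pearson correlation between expression profiles, an arbitrary threshold of 0.7, and a simple "connected genes form a cluster" rule. The function name correlation_clusters and the synthetic data are hypothetical, chosen only for illustration.

```python
import numpy as np

def correlation_clusters(expr, threshold=0.7):
    """Group genes whose expression profiles are linked by r >= threshold.

    expr: array of shape (n_genes, n_samples).
    Returns a list of clusters, each a sorted list of gene indices.
    The threshold and the connected-component rule are illustrative
    assumptions, not the method used to produce the plots in this post.
    """
    corr = np.corrcoef(expr)        # gene-by-gene Pearson correlation
    linked = corr >= threshold      # adjacency matrix of the correlation graph
    seen = set()
    clusters = []
    for start in range(expr.shape[0]):
        if start in seen:
            continue
        # Depth-first search over the thresholded correlation graph
        stack, members = [start], []
        seen.add(start)
        while stack:
            gene = stack.pop()
            members.append(gene)
            for neighbor in np.flatnonzero(linked[gene]):
                neighbor = int(neighbor)
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(members))
    return clusters

# Synthetic data: two groups of three genes, each group driven by its own signal
rng = np.random.default_rng(0)
n_samples = 200
signal_a = rng.normal(size=n_samples)
signal_b = rng.normal(size=n_samples)
expr = np.vstack(
    [signal_a + 0.3 * rng.normal(size=n_samples) for _ in range(3)]
    + [signal_b + 0.3 * rng.normal(size=n_samples) for _ in range(3)]
)
clusters = correlation_clusters(expr, threshold=0.7)
print(clusters)
```

On real data, the choice of threshold and linkage rule strongly affects which clusters emerge, so a sketch like this is a starting point for exploration and not a substitute for a tuned method.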
Biologically Defined Networks
To demonstrate the network plots using known biology, figure 1 shows a plot representing 32 biologically annotated networks, each represented by 10 genes. The degree of expression correlation between each pair of the 320 genes in the plot is indicated by color. Note that the 10 genes of each network are highly correlated with one another, so each network appears as a square block along the diagonal. Most networks exhibit little cross-network correlation, whereas some (e.g. hypoxia response and endothelial cells) exhibit cross-network linkage consistent with their related biological functions. Of note, the proteasome, SNRPB-linked splicing and proliferation networks exhibit linkage; we are not aware of reports linking these biological processes.
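The block-along-the-diagonal pattern can be reproduced with synthetic data. The sketch below is an illustration under stated assumptions, not the code behind figure 1: it simulates 4 networks of 10 genes each (figure 1 uses 32), gives each network a shared signal plus per-gene noise, and compares the average correlation within networks to the average across networks. The helper names simulate_networks and block_summary are hypothetical.

```python
import numpy as np

def simulate_networks(n_networks, genes_per_network, n_samples, noise=0.5, seed=0):
    """Simulate expression profiles in which genes of a network share one signal."""
    rng = np.random.default_rng(seed)
    rows = []
    for _ in range(n_networks):
        shared = rng.normal(size=n_samples)   # network-level signal
        for _ in range(genes_per_network):
            rows.append(shared + noise * rng.normal(size=n_samples))
    return np.array(rows)

def block_summary(corr, genes_per_network):
    """Return mean correlation within networks and across networks."""
    n = corr.shape[0]
    labels = np.arange(n) // genes_per_network    # network index of each gene
    same = labels[:, None] == labels[None, :]     # True inside diagonal blocks
    off_diagonal = ~np.eye(n, dtype=bool)         # drop each gene's self-correlation
    within = corr[same & off_diagonal].mean()
    across = corr[~same].mean()
    return within, across

# Genes are ordered by network, so high correlations form blocks on the diagonal
expr = simulate_networks(n_networks=4, genes_per_network=10, n_samples=300)
corr = np.corrcoef(expr)
within, across = block_summary(corr, genes_per_network=10)
print(f"within-network mean r = {within:.2f}, cross-network mean r = {across:.2f}")
```

Passing corr to a heatmap function such as matplotlib's imshow would display the blocks directly; cross-network linkage of the kind noted above would show up as off-diagonal blocks with elevated correlation.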
Rheumatoid arthritis is a complex disease with relatively well-characterized biological underpinnings, and many studies have noted strong enrichment of genes associated with immune function and immune cell activation. A plot of 173 genes from a recent analysis is consistent with this previously identified biology (figure 2).
Recent GWAS and genetic analyses have identified strong enrichment for neuronal development and synaptic signaling among a large set of genes linked to schizophrenia. Plotting 325 genes (figure 3) from a recent GWAS meta-analysis revealed several neuronal network clusters among the larger gene set.
Autism spectrum disorder is a complex developmental disorder with challenging diagnostic criteria, and a wealth of genomic and genetic studies have identified strong enrichment for neuronal (and chromatin modifier) genes. A plot of 338 genes curated from SFARI.org shows enrichment for neuronal development and function genes (figure 4).
A leading cause of death in the developed world, cardiovascular disease has both genetic and environmental components. Here, noting common loci among diverse cardiovascular phenotypes, we examined a set of GWAS covering electrocardiographic phenotypes (see blog entry August 2014), sudden cardiac death, blood pressure, coronary artery disease, atrial fibrillation and QRS/QT duration. The resulting 215 genes are listed here.
Plotting the 215 genes (figure 5) from the results of the diverse studies revealed several previously unreported network clusters. Further exploration of the biology associated with the networks may provide new insight into the processes linked to cardiovascular disease.