Title: Constructing gene co-expression networks and predicting functions of unknown genes by random matrix theory

Journal Article · BMC Bioinformatics
 [1];  [2];  [3];  [4];  [5];  [6];  [4]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Environmental Sciences Division; Clemson Univ., SC (United States). School of Computing
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Environmental Sciences Division
  3. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division; Xiangtan Univ., Hunan (China). Dept. of Physics
  4. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Environmental Sciences Division; Univ. of Oklahoma, Norman, OK (United States). Dept. of Botany and Microbiology. Inst. for Environmental Genomics
  5. Univ. of Texas at Dallas, Richardson, TX (United States). Dept. of Computer Science
  6. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Environmental Sciences Division; Purdue Univ., West Lafayette, IN (United States). Dept. of Biological Sciences

Background: Large-scale sequencing of entire genomes has ushered in a new age in biology. One of the next grand challenges is to dissect the cellular networks consisting of many individual functional modules. Defining co-expression networks without ambiguity from genome-wide microarray data is difficult, and current methods are neither robust nor consistent across different data sets. This is particularly problematic for little-understood organisms, since little existing biological knowledge can be exploited to determine the threshold that differentiates true correlation from random noise. Random matrix theory (RMT), which has been widely and successfully used in physics, is a powerful approach for distinguishing system-specific, non-random properties embedded in complex systems from random noise. Here, we have hypothesized that the universal predictions of RMT are also applicable to biological systems and that the correlation threshold can be determined by characterizing the correlation matrix of microarray profiles using random matrix theory.
Results: Application of random matrix theory to microarray data of S. oneidensis, E. coli, yeast, A. thaliana, Drosophila, mouse and human indicates that there is a transition of the nearest neighbour spacing distribution (NNSD) of the correlation matrix after gradually removing certain elements inside the matrix. Testing on an in silico modular model has demonstrated that this transition can be used to determine the correlation threshold for revealing modular co-expression networks. The co-expression network derived from yeast cell cycling microarray data is supported by gene annotation. The topological properties of the resulting co-expression network agree well with the general properties of biological networks. Computational evaluations have shown that the RMT approach is sensitive and robust. Furthermore, evaluation on sampled expression data of an in silico modular gene system has shown that under-sampled expressions do not affect the recovery of the gene co-expression network. Moreover, the cellular roles of 215 functionally unknown genes from yeast, E. coli and S. oneidensis are predicted by the gene co-expression networks using the guilt-by-association principle, many of which are supported by existing information or our experimental verification, further demonstrating the reliability of this approach for gene function prediction.
Conclusion: Our rigorous analysis of gene expression microarray profiles using RMT has shown that the transition of the NNSD of the correlation matrix of microarray profiles provides a profound theoretical criterion to determine the correlation threshold for identifying gene co-expression networks.
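
The threshold scan described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names here are hypothetical, and the toy expression data are invented for illustration. Instead of unfolding the spectrum and testing the NNSD directly, the sketch substitutes a related statistic, the mean ratio of consecutive eigenvalue spacings, which needs no unfolding. In random matrix theory this ratio is expected to be near 0.53 for Gaussian orthogonal ensemble (GOE) spectra and near 0.39 for Poisson spectra, so a drift from the first value toward the second as the cutoff rises would suggest the kind of transition the abstract refers to.

```python
import numpy as np


def threshold_correlation(corr, cutoff):
    """Zero every off-diagonal entry whose absolute value is below cutoff."""
    kept = np.where(np.abs(corr) >= cutoff, corr, 0.0)
    np.fill_diagonal(kept, 1.0)  # a gene is always fully correlated with itself
    return kept


def mean_spacing_ratio(matrix):
    """Mean ratio of consecutive eigenvalue spacings of a symmetric matrix.

    Assumption: this stands in for the NNSD test named in the abstract.
    """
    eigenvalues = np.linalg.eigvalsh(matrix)  # returned in ascending order
    spacings = np.diff(eigenvalues)
    small = np.minimum(spacings[:-1], spacings[1:])
    large = np.maximum(spacings[:-1], spacings[1:])
    valid = large > 1e-12  # skip (numerically) degenerate eigenvalues
    if not np.any(valid):
        return float("nan")
    return float(np.mean(small[valid] / large[valid]))


def scan_thresholds(expression, cutoffs):
    """Return (cutoff, mean spacing ratio) pairs; rows of expression are genes."""
    corr = np.corrcoef(expression)  # Pearson correlation between gene profiles
    return [
        (float(c), mean_spacing_ratio(threshold_correlation(corr, c)))
        for c in cutoffs
    ]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n_samples = 60
    # Hypothetical toy data: 5 modules of 20 genes sharing a profile,
    # plus 200 unrelated noise genes.
    modules = [
        rng.normal(size=n_samples) + 0.5 * rng.normal(size=(20, n_samples))
        for _ in range(5)
    ]
    noise = rng.normal(size=(200, n_samples))
    expression = np.vstack(modules + [noise])
    for cutoff, ratio in scan_thresholds(expression, np.arange(0.3, 0.95, 0.05)):
        print(f"cutoff={cutoff:.2f}  mean spacing ratio={ratio:.3f}")
```

A practical rule built on this sketch might pick the smallest cutoff at which the ratio settles near the Poisson value, but the exact stopping criterion is a design choice that the abstract does not specify.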

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1626342
Journal Information:
BMC Bioinformatics, Vol. 8, Issue 1; ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
