OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A linear programming approach for estimating the structure of a sparse linear genetic network from transcript profiling data

Journal Article · Algorithms for Molecular Biology
 [1];  [2];  [3];  [4]
  1. Indian Inst. of Science, Bangalore (India). Dept. of Computer Science and Automation
  2. Indian Inst. of Science, Bangalore (India). Dept. of Computer Science and Automation; Indian Inst. of Science, Bangalore (India). Bioinformatics Centre
  3. Indian Inst. of Science, Bangalore (India). Bioinformatics Centre
  4. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Life Sciences Division

Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, that is, graphs with many thousands of nodes in which an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data.

Results: The structure learning task is cast as a sparse linear regression problem, which is then posed as a LASSO (ℓ1-constrained fitting) problem and finally solved by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs are assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law, suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification.
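The abstract outlines a pipeline (sparse linear regression, then ℓ1-constrained fitting, then a linear program) without giving the exact formulation, so the following is only a minimal sketch of one standard way such a problem can be written as an LP. It assumes an absolute-error loss plus an ℓ1 penalty, which may differ from the authors' formulation. The function names, the penalty weight `lam` and the threshold `tol` are hypothetical, and SciPy's `linprog` is used as a generic LP solver.

```python
import numpy as np
from scipy.optimize import linprog


def sparse_l1_regression(A, y, lam=1.0):
    """Minimize sum_i |y_i - (A w)_i| + lam * ||w||_1 as a linear program.

    Assumed formulation (not taken from the paper): w = u - v with
    u, v >= 0, and e_i >= |residual_i| enforced by two inequalities.
    """
    n, p = A.shape
    # Decision vector z = [u (p), v (p), e (n)], all non-negative.
    c = np.concatenate([lam * np.ones(2 * p), np.ones(n)])
    I = np.eye(n)
    # A(u - v) - e <= y  and  -A(u - v) - e <= -y
    A_ub = np.block([[A, -A, -I], [-A, A, -I]])
    b_ub = np.concatenate([y, -y])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    u, v = res.x[:p], res.x[p:2 * p]
    return u - v


def estimate_undirected_edges(X, lam=1.0, tol=1e-6):
    """Regress each gene on all others; keep pairs with a non-zero weight.

    X has one row per transcript profile and one column per gene.
    An edge {j, k} is reported if either regression links the two genes.
    """
    n, p = X.shape
    edges = set()
    for j in range(p):
        others = [k for k in range(p) if k != j]
        w = sparse_l1_regression(X[:, others], X[:, j], lam)
        for k, wk in zip(others, w):
            if abs(wk) > tol:
                edges.add((min(j, k), max(j, k)))
    return edges
```

Calling `estimate_undirected_edges(X)` on a profiles-by-genes matrix returns a set of unordered gene-index pairs. Solving one dense LP per gene, as done here, is only practical for small examples; networks with thousands of nodes would need sparse constraint matrices or a specialised solver.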

Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1626634
Journal Information:
Algorithms for Molecular Biology, Vol. 4, Issue 1; ISSN 1748-7188
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
