Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities
- Department of Bioengineering, University of California at San Diego, La Jolla, CA 92093
- Department of Bioengineering, University of California at San Diego, La Jolla, CA 92093; Bioinformatics and Systems Biology Program, University of California at San Diego, La Jolla, CA 92093
- Department of Genetic Engineering, Kyung Hee University, Yongin 17104, South Korea
- Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093
- Department of Bioengineering, University of California at San Diego, La Jolla, CA 92093; Bioinformatics and Systems Biology Program, University of California at San Diego, La Jolla, CA 92093; Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2970 Horsholm, Denmark; Department of Pediatrics, University of California at San Diego, La Jolla, CA 92093
Transcriptional regulatory networks (TRNs) have been studied intensely for >25 y. Yet, even for the Escherichia coli TRN—probably the best characterized TRN—several questions remain. Here, we address three questions: (i) How complete is our knowledge of the E. coli TRN; (ii) how well can we predict gene expression using this TRN; and (iii) how robust is our understanding of the TRN? First, we reconstructed a high-confidence TRN (hiTRN) consisting of 147 transcription factors (TFs) regulating 1,538 transcription units (TUs) encoding 1,764 genes. The 3,797 high-confidence regulatory interactions were collected from published, validated chromatin immunoprecipitation (ChIP) data and RegulonDB. For 21 different TF knockouts, up to 63% of the differentially expressed genes in the hiTRN were traced to the knocked-out TF through regulatory cascades. Second, we trained supervised machine learning algorithms to predict the expression of 1,364 TUs given TF activities using 441 samples. The algorithms accurately predicted condition-specific expression for 86% (1,174 of 1,364) of the TUs, while 193 TUs (14%) were predicted better than random TRNs. Third, we identified 10 regulatory modules whose definitions were robust against changes to the TRN or expression compendium. Using surrogate variable analysis, we also identified three unmodeled factors that systematically influenced gene expression. Our computational workflow comprehensively characterizes the predictive capabilities and systems-level functions of an organism’s TRN from disparate data types.
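The knockout analysis in the abstract, tracing differentially expressed genes back to the knocked-out TF through regulatory cascades, can be read as a reachability question on a directed graph. The following is a minimal sketch under that reading, not the authors' implementation: the network is simplified to gene-level nodes rather than transcription units, and the function names (`reachable_targets`, `fraction_explained`) and the toy network are hypothetical.

```python
from collections import deque


def reachable_targets(trn, tf):
    """Return every node reachable from `tf` by following regulatory edges.

    `trn` maps a regulator name to the set of nodes it directly regulates
    (an assumed, simplified representation of a TRN).
    """
    seen = set()
    queue = deque([tf])
    while queue:
        node = queue.popleft()
        for target in trn.get(node, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def fraction_explained(trn, tf, de_genes):
    """Fraction of differentially expressed genes downstream of `tf`."""
    de_genes = set(de_genes)
    if not de_genes:
        return 0.0
    return len(de_genes & reachable_targets(trn, tf)) / len(de_genes)


# Toy network with hypothetical names: tfA regulates tfB, which in turn
# regulates gene2 and gene3, so gene2 is reached only through a cascade.
toy_trn = {
    "tfA": {"tfB", "gene1"},
    "tfB": {"gene2", "gene3"},
    "tfC": {"gene4"},
}
toy_de = {"gene1", "gene2", "gene4", "gene5"}

explained = fraction_explained(toy_trn, "tfA", toy_de)
print(explained)
```

In this toy case, two of the four differentially expressed genes lie downstream of the knocked-out regulator; the remaining two are either controlled by another regulator or absent from the network, which is the kind of gap the abstract's completeness question is about.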
- Research Organization:
- Univ. of California, San Diego, CA (United States)
- Sponsoring Organization:
- USDOE
- Grant/Contract Number:
- SC0008701
- OSTI ID:
- 1378781
- Alternate ID(s):
- OSTI ID: 1465783
- Journal Information:
- Proceedings of the National Academy of Sciences of the United States of America, Vol. 114, Issue 38; ISSN 0027-8424
- Publisher:
- Proceedings of the National Academy of Sciences
- Country of Publication:
- United States
- Language:
- English
Similar Records
Transcriptional regulatory network refinement and quantification through kinetic modeling, gene expression microarray data and information theory
Transcriptional regulatory network discovery via multiple method integration: application to E. coli K12