skip to main content


Title: An integrated approach to reconstructing genome-scale transcriptional regulatory networks

Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showedmore » that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating comparative genomics of closely related organisms with gene expression data to assemble large-scale TRN models with high-quality predictions.« less
 [1] ;  [1] ;  [1] ;  [2]
  1. Univ. of Wisconsin, Madison, WI (United States)
  2. Memorial Sloan-Kettering Cancer Center, New York, NY (United States)
Publication Date:
Grant/Contract Number:
FC02-07ER64494; FG02-04ER25627
Accepted Manuscript
Journal Name:
PLoS Computational Biology (Online)
Additional Journal Information:
Journal Name: PLoS Computational Biology (Online); Journal Volume: 11; Journal Issue: 2; Journal ID: ISSN 1553-7358
Public Library of Science
Research Org:
Univ. of Wisconsin, Madison, WI (United States)
Sponsoring Org:
USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22); USDOE EFRC
Country of Publication:
United States
OSTI Identifier: