DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

Abstract

Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.

Authors:
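Faria, José P. [1]; Davis, James J. [2]; Edirisinghe, Janaka N. [2]; Taylor, Ronald C. [3]; Weisenhorn, Pamela [4]; Olson, Robert D. [2]; Stevens, Rick L. [5]; Rocha, Miguel [6]; Rocha, Isabel [6]; Best, Aaron A. [7]; DeJongh, Matthew [8]; Tintle, Nathan L. [9]; Parrello, Bruce [10]; Overbeek, Ross [11]; Henry, Christopher S. [12]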
  1. Univ. of Chicago, IL (United States). Computation Inst.; Argonne National Lab. (ANL), Argonne, IL (United States). Computing, Environment and Life Sciences and Mathematics and Computer Science Division; Univ. of Minho, Braga (Portugal). Centre of Biological Engineering
  2. Univ. of Chicago, IL (United States). Computation Inst.; Argonne National Lab. (ANL), Argonne, IL (United States). Computing, Environment and Life Sciences
  3. Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Computational Biology and Bioinformatics Group
  4. Argonne National Lab. (ANL), Argonne, IL (United States). Mathematics and Computer Science Division
  5. Univ. of Chicago, IL (United States). Computation Inst. and Dept. of Computer Science; Argonne National Lab. (ANL), Argonne, IL (United States). Computing, Environment and Life Sciences
  6. Univ. of Minho, Braga (Portugal). Centre of Biological Engineering
  7. Hope College, Holland, MI (United States). Biology Dept.
  8. Hope College, Holland, MI (United States). Computer Science Dept.
  9. Dordt College, Sioux Center, IA (United States). Dept. of Mathematics
  10. Argonne National Lab. (ANL), Argonne, IL (United States). Computing, Environment and Life Sciences; Fellowship for Interpretation of Genomes, Burr Ridge, IL (United States)
  11. Univ. of Chicago, IL (United States). Computation Inst.; Argonne National Lab. (ANL), Argonne, IL (United States). Computing, Environment and Life Sciences; Fellowship for Interpretation of Genomes, Burr Ridge, IL (United States)
  12. Univ. of Chicago, IL (United States). Computation Inst.; Argonne National Lab. (ANL), Argonne, IL (United States). Mathematics and Computer Science Division
Publication Date:
2016-11-24
Research Org.:
Argonne National Laboratory (ANL), Argonne, IL (United States); Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER); National Institutes of Health (NIH). National Institute of Allergy and Infectious Diseases (NIAID); U.S. Department of Health and Human Services; National Science Foundation (NSF); Fundacao para a Ciencia e a Tecnologia of Portugal
OSTI Identifier:
1372299
Alternate Identifier(s):
OSTI ID: 1339825
Report Number(s):
PNNL-SA-115054
Journal ID: ISSN 1664-302X; 131302
Grant/Contract Number:  
AC02-06CH11357; AC05-76RL01830
Resource Type:
Accepted Manuscript
Journal Name:
Frontiers in Microbiology
Additional Journal Information:
Journal Volume: 7; Journal ID: ISSN 1664-302X
Publisher:
Frontiers Research Foundation
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES; 59 BASIC BIOLOGICAL SCIENCES; Atomic Regulon; CLR; Escherichia coli; clustering; gene expression analysis; hierarchical clustering; k-means clustering; transcriptomic data

Citation Formats

Faria, José P., Davis, James J., Edirisinghe, Janaka N., Taylor, Ronald C., Weisenhorn, Pamela, Olson, Robert D., Stevens, Rick L., Rocha, Miguel, Rocha, Isabel, Best, Aaron A., DeJongh, Matthew, Tintle, Nathan L., Parrello, Bruce, Overbeek, Ross, and Henry, Christopher S. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation. United States: N. p., 2016. Web. doi:10.3389/fmicb.2016.01819.
Faria, José P., Davis, James J., Edirisinghe, Janaka N., Taylor, Ronald C., Weisenhorn, Pamela, Olson, Robert D., Stevens, Rick L., Rocha, Miguel, Rocha, Isabel, Best, Aaron A., DeJongh, Matthew, Tintle, Nathan L., Parrello, Bruce, Overbeek, Ross, & Henry, Christopher S. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation. United States. https://doi.org/10.3389/fmicb.2016.01819
Faria, José P., Davis, James J., Edirisinghe, Janaka N., Taylor, Ronald C., Weisenhorn, Pamela, Olson, Robert D., Stevens, Rick L., Rocha, Miguel, Rocha, Isabel, Best, Aaron A., DeJongh, Matthew, Tintle, Nathan L., Parrello, Bruce, Overbeek, Ross, and Henry, Christopher S. 2016. "Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation". United States. https://doi.org/10.3389/fmicb.2016.01819. https://www.osti.gov/servlets/purl/1372299.
@article{osti_1372299,
title = {Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation},
author = {Faria, José P. and Davis, James J. and Edirisinghe, Janaka N. and Taylor, Ronald C. and Weisenhorn, Pamela and Olson, Robert D. and Stevens, Rick L. and Rocha, Miguel and Rocha, Isabel and Best, Aaron A. and DeJongh, Matthew and Tintle, Nathan L. and Parrello, Bruce and Overbeek, Ross and Henry, Christopher S.},
abstractNote = {Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.},
doi = {10.3389/fmicb.2016.01819},
journal = {Frontiers in Microbiology},
volume = 7,
place = {United States},
year = {2016},
month = {11}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 5 works
Citation information provided by
Web of Science



Works referencing / citing this record:

KBase: The United States Department of Energy Systems Biology Knowledgebase
journal, July 2018

  • Arkin, Adam P.; Cottingham, Robert W.; Henry, Christopher S.
  • Nature Biotechnology, Vol. 36, Issue 7
  • DOI: 10.1038/nbt.4163

AGeNNT: annotation of enzyme families by means of refined neighborhood networks
text, January 2017

  • Kandlinger, Florian; Plach, Maximilian G.; Merkl, Rainer
  • Universität Regensburg
  • DOI: 10.5283/epub.36657