DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Efficient α, β-motif finder for identification of phenotype-related functional modules

Abstract

Background: Microbial communities in their natural environments exhibit phenotypes that can directly cause particular diseases, convert biomass or wastewater to energy, or degrade various environmental contaminants. Understanding how these communities realize specific phenotypic traits (e.g., carbon fixation, hydrogen production) is critical for addressing health, bioremediation, or bioenergy problems.

Results: In this paper, we describe a graph-theoretical method for in silico prediction of the cellular subsystems that are related to the expression of a target phenotype. The proposed (α, β)-motif finder approach allows for identification of these phenotype-related subsystems that, in addition to metabolic subsystems, could include their regulators, sensors, transporters, and even uncharacterized proteins. By comparing dozens of genome-scale networks of functionally associated proteins, our method efficiently identifies those statistically significant functional modules that appear in at least α networks of phenotype-expressing organisms but in no more than β networks of organisms that do not exhibit the target phenotype. Experiments with different target phenotypes, such as hydrogen production, motility, aerobic respiration, and acid tolerance, show that the enumerated modules are indeed related to phenotype expression.

Conclusion: Thus, we have proposed a methodology that can identify potential statistically significant phenotype-related functional modules. The functional module is modeled as an (α, β)-clique, where α and β are two criteria introduced in this work. We also propose a novel network model, called the two-typed, divided network. The new network model and the criteria make the problem tractable even while very large networks are being compared. The code can be downloaded from http://www.freescience.org/cs/ABClique/
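
The two thresholds described in the abstract can be illustrated with a small brute-force check. This is only a sketch under stated assumptions, not the authors' efficient enumeration algorithm: each network is assumed to be a set of undirected edges over shared protein labels (for example, ortholog group identifiers), a module is assumed to be "present" in a network when its proteins form a clique there, and the names `is_clique`, `count_networks_with_module`, and `passes_alpha_beta` are hypothetical. Statistical significance testing is not covered.

```python
from itertools import combinations


def is_clique(module, edges):
    """Return True if every pair of proteins in `module` is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(sorted(module), 2))


def count_networks_with_module(module, networks):
    """Count the networks in which `module` forms a clique."""
    return sum(1 for edges in networks if is_clique(module, edges))


def passes_alpha_beta(module, positive, negative, alpha, beta):
    """Check the two thresholds: present in at least `alpha` phenotype-expressing
    networks and in at most `beta` non-expressing networks."""
    return (count_networks_with_module(module, positive) >= alpha
            and count_networks_with_module(module, negative) <= beta)


def edge_set(*pairs):
    """Helper (illustrative): build an undirected edge set from label pairs."""
    return {frozenset(p) for p in pairs}


# Toy data: two phenotype-expressing networks and one non-expressing network.
positive = [
    edge_set(("A", "B"), ("B", "C"), ("A", "C")),
    edge_set(("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")),
]
negative = [edge_set(("A", "B"), ("C", "D"))]

abc_ok = passes_alpha_beta({"A", "B", "C"}, positive, negative, alpha=2, beta=0)
cd_ok = passes_alpha_beta({"C", "D"}, positive, negative, alpha=2, beta=0)
print(abc_ok, cd_ok)
```

In this toy input the module {A, B, C} is a clique in both phenotype-expressing networks and in no non-expressing network, so it satisfies the thresholds with alpha = 2 and beta = 0, while {C, D} does not.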

Authors:
 [1];  [2];  [1];  [1];  [3];  [2];  [1]
  1. North Carolina State Univ., Raleigh, NC (United States). Dept. of Computer Science; Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division
  2. Univ. of South Florida, Tampa, FL (United States). Dept. of Civil and Environmental Engineering
  3. Univ. of South Florida, Tampa, FL (United States). Dept. of Integrative Biology
Publication Date:
2011
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
OSTI Identifier:
1626280
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Volume: 12; Journal Issue: 1; Journal ID: ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY; 97 MATHEMATICS AND COMPUTING; Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Mathematical & Computational Biology

Citation Formats

Schmidt, Matthew C., Rocha, Andrea M., Padmanabhan, Kanchana, Chen, Zhengzhang, Scott, Kathleen, Mihelcic, James R., and Samatova, Nagiza F. Efficient α, β-motif finder for identification of phenotype-related functional modules. United States: N. p., 2011. Web. doi:10.1186/1471-2105-12-440.
Schmidt, Matthew C., Rocha, Andrea M., Padmanabhan, Kanchana, Chen, Zhengzhang, Scott, Kathleen, Mihelcic, James R., & Samatova, Nagiza F. (2011). Efficient α, β-motif finder for identification of phenotype-related functional modules. United States. https://doi.org/10.1186/1471-2105-12-440
Schmidt, Matthew C., Rocha, Andrea M., Padmanabhan, Kanchana, Chen, Zhengzhang, Scott, Kathleen, Mihelcic, James R., and Samatova, Nagiza F. 2011. "Efficient α, β-motif finder for identification of phenotype-related functional modules". United States. https://doi.org/10.1186/1471-2105-12-440. https://www.osti.gov/servlets/purl/1626280.
@article{osti_1626280,
title = {Efficient α, β-motif finder for identification of phenotype-related functional modules},
author = {Schmidt, Matthew C. and Rocha, Andrea M. and Padmanabhan, Kanchana and Chen, Zhengzhang and Scott, Kathleen and Mihelcic, James R. and Samatova, Nagiza F.},
abstractNote = {Background: Microbial communities in their natural environments exhibit phenotypes that can directly cause particular diseases, convert biomass or wastewater to energy, or degrade various environmental contaminants. Understanding how these communities realize specific phenotypic traits (e.g., carbon fixation, hydrogen production) is critical for addressing health, bioremediation, or bioenergy problems. Results: In this paper, we describe a graph-theoretical method for in silico prediction of the cellular subsystems that are related to the expression of a target phenotype. The proposed (α, β)-motif finder approach allows for identification of these phenotype-related subsystems that, in addition to metabolic subsystems, could include their regulators, sensors, transporters, and even uncharacterized proteins. By comparing dozens of genome-scale networks of functionally associated proteins, our method efficiently identifies those statistically significant functional modules that appear in at least α networks of phenotype-expressing organisms but in no more than β networks of organisms that do not exhibit the target phenotype. Experiments with different target phenotypes, such as hydrogen production, motility, aerobic respiration, and acid tolerance, show that the enumerated modules are indeed related to phenotype expression. Conclusion: Thus, we have proposed a methodology that can identify potential statistically significant phenotype-related functional modules. The functional module is modeled as an (α, β)-clique, where α and β are two criteria introduced in this work. We also propose a novel network model, called the two-typed, divided network. The new network model and the criteria make the problem tractable even while very large networks are being compared. The code can be downloaded from http://www.freescience.org/cs/ABClique/},
doi = {10.1186/1471-2105-12-440},
journal = {BMC Bioinformatics},
number = 1,
volume = 12,
place = {United States},
year = {2011},
month = {jan}
}

Works referenced in this record:

Fast and Accurate Method for Identifying High-Quality Protein-Interaction Modules by Clique Merging and Its Application to Yeast
journal, April 2006

  • Zhang, Chi; Liu, Song; Zhou, Yaoqi
  • Journal of Proteome Research, Vol. 5, Issue 4
  • DOI: 10.1021/pr050366g

The Complex Between Hydrogenase-maturation Proteins HypC and HypD is an Intermediate in the Supply of Cyanide to the Active Site Iron of [NiFe]-Hydrogenases
journal, November 2004

  • Blokesch, Melanie; Albracht, Simon P. J.; Matzanke, Berthold F.
  • Journal of Molecular Biology, Vol. 344, Issue 1, p. 155-167
  • DOI: 10.1016/j.jmb.2004.09.040

Redirection of Metabolism for Biological Hydrogen Production
journal, January 2007

  • Rey, F. E.; Heiniger, E. K.; Harwood, C. S.
  • Applied and Environmental Microbiology, Vol. 73, Issue 5
  • DOI: 10.1128/AEM.02565-06

Algorithm 457: finding all cliques of an undirected graph
journal, September 1973


Molecular Evolution of Nitrogen Fixation: The Evolutionary History of the nifD, nifK, nifE, and nifN Genes
journal, July 2000

  • Fani, Renato; Gallo, Romina; Liò, Pietro
  • Journal of Molecular Evolution, Vol. 51, Issue 1
  • DOI: 10.1007/s002390010061

Escherichia coli acid resistance: tales of an amateur acidophile
journal, November 2004


MS2Grouper: Group assessment and synthetic replacement of duplicate proteomic tandem mass spectra
journal, August 2005

  • Tabb, David L.; Thompson, Melissa R.; Khalsa-Moyers, Gurusahai
  • Journal of the American Society for Mass Spectrometry, Vol. 16, Issue 8
  • DOI: 10.1016/j.jasms.2005.04.010

A scalable, parallel algorithm for maximal clique enumeration
journal, April 2009

  • Schmidt, Matthew C.; Samatova, Nagiza F.; Thomas, Kevin
  • Journal of Parallel and Distributed Computing, Vol. 69, Issue 4
  • DOI: 10.1016/j.jpdc.2009.01.003

Regulation of Uptake Hydrogenase and Effects of Hydrogen Utilization on Gene Expression in Rhodopseudomonas palustris
journal, August 2006

  • Rey, F. E.; Oda, Y.; Harwood, C. S.
  • Journal of Bacteriology, Vol. 188, Issue 17
  • DOI: 10.1128/jb.00381-06

From Genotype to Phenotype: Systems Biology Meets Natural Variation
journal, April 2008


Context-Dependent Functions of the PII and GlnK Signal Transduction Proteins in Escherichia coli
journal, October 2002


Ab initio genotype–phenotype association reveals intrinsic modularity in genetic networks
journal, January 2006

  • Slonim, Noam; Elemento, Olivier; Tavazoie, Saeed
  • Molecular Systems Biology, Vol. 2, Issue 1
  • DOI: 10.1038/msb4100047

Assigning protein functions by comparative genome analysis: Protein phylogenetic profiles
journal, April 1999

  • Pellegrini, M.; Marcotte, E. M.; Thompson, M. J.
  • Proceedings of the National Academy of Sciences, Vol. 96, Issue 8
  • DOI: 10.1073/pnas.96.8.4285

Getting connected: analysis and principles of biological networks
journal, April 2007

  • Zhu, X.; Gerstein, M.; Snyder, M.
  • Genes & Development, Vol. 21, Issue 9
  • DOI: 10.1101/gad.1528707

Interaction network containing conserved and essential protein complexes in Escherichia coli
journal, February 2005

  • Butland, Gareth; Peregrín-Alvarez, José Manuel; Li, Joyce
  • Nature, Vol. 433, Issue 7025
  • DOI: 10.1038/nature03239

The integrated microbial genomes (IMG) system
journal, January 2006


Interactions of the Escherichia coli hydrogenase biosynthetic proteins: HybG complex formation
journal, December 2005


l-Lysine Catabolism Is Controlled by l-Arginine and ArgR in Pseudomonas aeruginosa PAO1
journal, September 2010

  • Chou, Han Ting; Hegazy, Mohamed; Lu, Chung-Dar
  • Journal of Bacteriology, Vol. 192, Issue 22
  • DOI: 10.1128/jb.00673-10

Crystal Structures of Hydrogenase Maturation Protein HypE in the Apo and ATP-bound Forms
journal, September 2007

  • Shomura, Yasuhito; Komori, Hirofumi; Miyabe, Natsuko
  • Journal of Molecular Biology, Vol. 372, Issue 4
  • DOI: 10.1016/j.jmb.2007.07.023

Trait-to-Gene
journal, January 2003


Network‐based prediction of protein function
journal, January 2007

  • Sharan, Roded; Ulitsky, Igor; Shamir, Ron
  • Molecular Systems Biology, Vol. 3, Issue 1
  • DOI: 10.1038/msb4100129

The COG database: an updated version includes eukaryotes
journal, January 2003

  • Tatusov, Roman L.; Fedorova, Natalie D.; Jackson, John D.
  • BMC Bioinformatics, Vol. 4, Article No. 41
  • DOI: 10.1186/1471-2105-4-41

An Integrative Genomic Approach to Uncover Molecular Mechanisms of Prokaryotic Traits
journal, January 2006


From pull-down data to protein interaction networks and complexes with biological relevance
journal, February 2008


Comparative in-silico proteomic analysis discerns potential granuloma proteins of Yersinia pseudotuberculosis
journal, February 2020


The lysP gene encodes the lysine-specific permease.
journal, May 1992


STRING 8--a global view on proteins and their functional interactions in 630 organisms
journal, January 2009

  • Jensen, L. J.; Kuhn, M.; Stark, M.
  • Nucleic Acids Research, Vol. 37, Issue Database
  • DOI: 10.1093/nar/gkn760

Classification and phylogeny of hydrogenases
journal, August 2001


On cliques in graphs
journal, March 1965

  • Moon, J. W.; Moser, L.
  • Israel Journal of Mathematics, Vol. 3, Issue 1
  • DOI: 10.1007/BF02760024

Works referencing / citing this record:

Quantitative assessment of gene expression network module-validation methods
journal, October 2015

  • Li, Bing; Zhang, Yingying; Yu, Yanan
  • Scientific Reports, Vol. 5, Issue 1
  • DOI: 10.1038/srep15258
