DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules

Abstract

Background: Identifying cellular subsystems that are involved in the expression of a target phenotype has been a very active research area for the past several years. In this paper, a cellular subsystem refers to a group of genes (or proteins) that interact and carry out a common function in the cell. Most studies identify genes associated with a phenotype on the basis of some statistical bias; others have extended these statistical methods to analyze functional modules and biological pathways for phenotype-relatedness. However, a biologist often has a specific question in mind while performing such an analysis, and most of the subsystems obtained by existing methods may be largely irrelevant to the question at hand. Arguably, it would be valuable to incorporate the biologist’s knowledge about the phenotype into the algorithm. This way, the resulting subsystems are expected not only to be related to the target phenotype but also to contain information that the biologist is likely to be interested in.

Results: In this paper we introduce a fast and theoretically guaranteed method called DENSE (Dense and ENriched Subgraph Enumeration) that takes as input a biologist’s prior knowledge, expressed as a set of query proteins, and identifies all the dense functional modules in a biological network that contain some part of the query vertices. The density (in terms of the number of network edges) and the enrichment (the number of query proteins in the resulting functional module) can be controlled via two parameters, g and μ, respectively.

Conclusion: This algorithm has been applied to the protein functional association network of Clostridium acetobutylicum ATCC 824, a hydrogen-producing, acid-tolerant organism. The algorithm was able to verify relationships known to exist in the literature and also to identify some previously unknown relationships, including those with regulatory and signaling functions. Additionally, we were able to hypothesize that some uncharacterized proteins are likely associated with the target phenotype. The DENSE code can be downloaded from http://www.freescience.org/cs/DENSE/
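To make the roles of the two parameters concrete, the following is a minimal sketch and not the DENSE algorithm itself. The abstract does not give formal definitions, so the sketch assumes that density is the fraction of vertex pairs in a module that are joined by an edge (threshold g) and that enrichment is the fraction of module vertices that are query proteins (threshold μ, written `mu`). All function names and the toy network are hypothetical, and the enumeration is brute force, which is only practical for very small graphs.

```python
from itertools import combinations


def edge_density(nodes, edges):
    """Fraction of vertex pairs in `nodes` joined by an edge (assumed definition)."""
    nodes = sorted(set(nodes))
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(1 for u, v in combinations(nodes, 2)
                  if frozenset((u, v)) in edges)
    return present / (n * (n - 1) / 2)


def enrichment(nodes, query):
    """Fraction of vertices in `nodes` that are query proteins (assumed definition)."""
    nodes = set(nodes)
    return len(nodes & set(query)) / len(nodes) if nodes else 0.0


def dense_enriched_subsets(vertices, edges, query, g, mu, min_size=3):
    """Brute-force listing of vertex subsets meeting both thresholds.

    Exponential in the number of vertices; illustrative only.
    Returned subsets are not filtered for maximality.
    """
    found = []
    ordered = sorted(vertices)
    for k in range(min_size, len(ordered) + 1):
        for subset in combinations(ordered, k):
            if (edge_density(subset, edges) >= g
                    and enrichment(subset, query) >= mu):
                found.append(set(subset))
    return found


# Hypothetical toy network: a triangle A-B-C with a tail C-D-E.
toy_vertices = {"A", "B", "C", "D", "E"}
toy_edges = {frozenset(p) for p in
             [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]}
toy_query = {"A", "B"}

modules = dense_enriched_subsets(toy_vertices, toy_edges, toy_query,
                                 g=1.0, mu=0.5)
print(modules)
```

With g set to 1.0 only fully connected subsets qualify. Lowering g admits sparser modules, and raising `mu` requires a larger share of query proteins in each reported module.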

Authors:
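Hendrix, William [1]; Rocha, Andrea M. [2]; Padmanabhan, Kanchana [1]; Choudhary, Alok [3]; Scott, Kathleen [4]; Mihelcic, James R. [2]; Samatova, Nagiza F. [1]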
  1. North Carolina State Univ., Raleigh, NC (United States). Dept. of Computer Science; Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division
  2. Univ. of South Florida, Tampa, FL (United States). Dept. of Civil and Environmental Engineering
  3. Northwestern Univ., Evanston, IL (United States). Dept. of Electrical Engineering and Computer Science
  4. Univ. of South Florida, Tampa, FL (United States). Dept. of Integrative Biology
Publication Date:
2011-01-24
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States); Northwestern Univ., Evanston, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF)
OSTI Identifier:
1626645
Grant/Contract Number:  
AC05-00OR22725; FC02-07ER25808; FG02-08ER25848; SC0001283; SC0005309; SC0005340; OCI-0724599; CNS-0830927; CCF-0621443; CCF-0833131; CCF-0938000; CCF-1029166; CCF-1043085
Resource Type:
Accepted Manuscript
Journal Name:
BMC Systems Biology
Additional Journal Information:
Journal Volume: 5; Journal Issue: 1; Journal ID: ISSN 1752-0509
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 62 RADIOLOGY AND NUCLEAR MEDICINE; 97 MATHEMATICS AND COMPUTING; Mathematical & computational biology; Hydrogen Production; Dense Subgraph; Pyruvate Formate Lyase; Bitmap Index; Target Phenotype

Citation Formats

Hendrix, William, Rocha, Andrea M., Padmanabhan, Kanchana, Choudhary, Alok, Scott, Kathleen, Mihelcic, James R., and Samatova, Nagiza F. DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules. United States: N. p., 2011. Web. doi:10.1186/1752-0509-5-172.
Hendrix, William, Rocha, Andrea M., Padmanabhan, Kanchana, Choudhary, Alok, Scott, Kathleen, Mihelcic, James R., & Samatova, Nagiza F. DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules. United States. https://doi.org/10.1186/1752-0509-5-172
Hendrix, William, Rocha, Andrea M., Padmanabhan, Kanchana, Choudhary, Alok, Scott, Kathleen, Mihelcic, James R., and Samatova, Nagiza F. 2011. "DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules". United States. https://doi.org/10.1186/1752-0509-5-172. https://www.osti.gov/servlets/purl/1626645.
@article{osti_1626645,
title = {DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules},
author = {Hendrix, William and Rocha, Andrea M. and Padmanabhan, Kanchana and Choudhary, Alok and Scott, Kathleen and Mihelcic, James R. and Samatova, Nagiza F.},
abstractNote = {Background: Identifying cellular subsystems that are involved in the expression of a target phenotype has been a very active research area for the past several years. In this paper, a cellular subsystem refers to a group of genes (or proteins) that interact and carry out a common function in the cell. Most studies identify genes associated with a phenotype on the basis of some statistical bias; others have extended these statistical methods to analyze functional modules and biological pathways for phenotype-relatedness. However, a biologist often has a specific question in mind while performing such an analysis, and most of the subsystems obtained by existing methods may be largely irrelevant to the question at hand. Arguably, it would be valuable to incorporate the biologist’s knowledge about the phenotype into the algorithm. This way, the resulting subsystems are expected not only to be related to the target phenotype but also to contain information that the biologist is likely to be interested in. Results: In this paper we introduce a fast and theoretically guaranteed method called DENSE (Dense and ENriched Subgraph Enumeration) that takes as input a biologist’s prior knowledge, expressed as a set of query proteins, and identifies all the dense functional modules in a biological network that contain some part of the query vertices. The density (in terms of the number of network edges) and the enrichment (the number of query proteins in the resulting functional module) can be controlled via two parameters, g and μ, respectively. Conclusion: This algorithm has been applied to the protein functional association network of Clostridium acetobutylicum ATCC 824, a hydrogen-producing, acid-tolerant organism. The algorithm was able to verify relationships known to exist in the literature and also to identify some previously unknown relationships, including those with regulatory and signaling functions. Additionally, we were able to hypothesize that some uncharacterized proteins are likely associated with the target phenotype. The DENSE code can be downloaded from http://www.freescience.org/cs/DENSE/},
doi = {10.1186/1752-0509-5-172},
journal = {BMC Systems Biology},
number = 1,
volume = 5,
place = {United States},
year = {2011},
month = {1}
}
