DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules
Abstract
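Background: Identifying cellular subsystems that are involved in the expression of a target phenotype has been a very active research area for the past several years. In this paper, cellular subsystem refers to a group of genes (or proteins) that interact and carry out a common function in the cell. Most studies identify genes associated with a phenotype on the basis of some statistical bias; others have extended these statistical methods to analyze functional modules and biological pathways for phenotype-relatedness. However, a biologist often has a specific question in mind while performing such analysis, and most of the subsystems obtained by the existing methods might be largely irrelevant to the question at hand. Arguably, it would be valuable to incorporate the biologist’s knowledge about the phenotype into the algorithm. This way, it is anticipated that the resulting subsystems would not only be related to the target phenotype but also contain information that the biologist is likely to be interested in.
Results: In this paper we introduce a fast and theoretically guaranteed method called DENSE (Dense and ENriched Subgraph Enumeration) that can take as input a biologist’s prior knowledge as a set of query proteins and identify all the dense functional modules in a biological network that contain some part of the query vertices. The density (in terms of the number of network edges) and the enrichment (the number of query proteins in the resulting functional module) can be manipulated via two parameters g and μ, respectively.
Conclusion: This algorithm has been applied to the protein functional association network of Clostridium acetobutylicum ATCC 824, a hydrogen-producing, acid-tolerant organism. The algorithm was able to verify relationships known to exist in the literature and also some previously unknown relationships, including those with regulatory and signaling functions. Additionally, we were able to hypothesize that some uncharacterized proteins are likely associated with the target phenotype. The DENSE code can be downloaded from http://www.freescience.org/cs/DENSE/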
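The roles of the two parameters can be illustrated with a small sketch. This is not the DENSE algorithm itself: it is a brute-force enumeration on a toy graph, and the precise definitions used here are assumptions made for illustration (density as the fraction of possible vertex pairs that are connected, compared against g; enrichment as the fraction of module members that are query proteins, compared against μ). The function names and the `min_size` parameter are hypothetical.

```python
from itertools import combinations


def edge_density(module, edge_set):
    """Fraction of possible vertex pairs in `module` that are connected.

    Assumed definition for illustration; the record only says density is
    measured in terms of the number of network edges.
    """
    pairs = list(combinations(sorted(module), 2))
    if not pairs:
        return 0.0
    present = sum(1 for u, v in pairs if frozenset((u, v)) in edge_set)
    return present / len(pairs)


def enrichment(module, query):
    """Fraction of module members that are query proteins (assumed definition)."""
    return len(set(module) & set(query)) / len(module)


def dense_enriched_modules(vertices, edges, query, g, mu, min_size=3):
    """Brute-force search for vertex sets meeting both thresholds.

    Exponential in the number of vertices, so only suitable for toy inputs;
    it does not reproduce the efficiency guarantees claimed for DENSE.
    """
    edge_set = {frozenset(e) for e in edges}
    found = []
    for k in range(min_size, len(vertices) + 1):
        for module in combinations(sorted(vertices), k):
            if (edge_density(module, edge_set) >= g
                    and enrichment(module, query) >= mu):
                found.append(set(module))
    return found


# Toy network: A, B, C form a triangle; D hangs off C. A is the query protein.
toy_vertices = ["A", "B", "C", "D"]
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
modules = dense_enriched_modules(toy_vertices, toy_edges, query={"A"}, g=1.0, mu=0.3)
print(modules)
```

Under these assumed definitions, raising g demands more tightly connected modules, while raising μ demands that a larger share of each module come from the biologist's query set.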
- Authors:
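- Hendrix, Willam; Rocha, Andrea M.; Padmanabhan, Kanchana; Choudhary, Alok; Scott, Kathleen; Mihelcic, James R.; Samatova, Nagiza F.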
- North Carolina State Univ., Raleigh, NC (United States). Dept. of Computer Science; Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division
- Univ. of South Florida, Tampa, FL (United States). Dept. of Civil and Environmental Engineering
- Northwestern Univ., Evanston, IL (United States). Dept. of Electrical Engineering and Computer Science
- Univ. of South Florida, Tampa, FL (United States). Dept. of Integrative Biology
- Publication Date:
- Research Org.:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States); Northwestern Univ., Evanston, IL (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Biological and Environmental Research (BER); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF)
- OSTI Identifier:
- 1626645
- Grant/Contract Number:
- AC05-00OR22725; FC02-07ER25808; FG02-08ER25848; SC0001283; SC0005309; SC0005340; OCI-0724599; CNS-0830927; CCF-0621443; CCF-0833131; CCF-0938000; CCF-1029166; CCF-1043085
- Resource Type:
- Accepted Manuscript
- Journal Name:
- BMC Systems Biology
- Additional Journal Information:
- Journal Volume: 5; Journal Issue: 1; Journal ID: ISSN 1752-0509
- Publisher:
- BioMed Central
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; 62 RADIOLOGY AND NUCLEAR MEDICINE; 97 MATHEMATICS AND COMPUTING; Mathematical & computational biology; Hydrogen Production; Dense Subgraph; Pyruvate Formate Lyase; Bitmap Index; Target Phenotype
Citation Formats
Hendrix, Willam, Rocha, Andrea M., Padmanabhan, Kanchana, Choudhary, Alok, Scott, Kathleen, Mihelcic, James R., and Samatova, Nagiza F. DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules. United States: N. p., 2011. Web. doi:10.1186/1752-0509-5-172.
Hendrix, Willam, Rocha, Andrea M., Padmanabhan, Kanchana, Choudhary, Alok, Scott, Kathleen, Mihelcic, James R., & Samatova, Nagiza F. DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules. United States. https://doi.org/10.1186/1752-0509-5-172
Hendrix, Willam, Rocha, Andrea M., Padmanabhan, Kanchana, Choudhary, Alok, Scott, Kathleen, Mihelcic, James R., and Samatova, Nagiza F. 2011. "DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules". United States. https://doi.org/10.1186/1752-0509-5-172. https://www.osti.gov/servlets/purl/1626645.
@article{osti_1626645,
title = {DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules},
author = {Hendrix, Willam and Rocha, Andrea M. and Padmanabhan, Kanchana and Choudhary, Alok and Scott, Kathleen and Mihelcic, James R. and Samatova, Nagiza F.},
abstractNote = {Background: Identifying cellular subsystems that are involved in the expression of a target phenotype has been a very active research area for the past several years. In this paper, cellular subsystem refers to a group of genes (or proteins) that interact and carry out a common function in the cell. Most studies identify genes associated with a phenotype on the basis of some statistical bias; others have extended these statistical methods to analyze functional modules and biological pathways for phenotype-relatedness. However, a biologist often has a specific question in mind while performing such analysis, and most of the subsystems obtained by the existing methods might be largely irrelevant to the question at hand. Arguably, it would be valuable to incorporate the biologist’s knowledge about the phenotype into the algorithm. This way, it is anticipated that the resulting subsystems would not only be related to the target phenotype but also contain information that the biologist is likely to be interested in. Results: In this paper we introduce a fast and theoretically guaranteed method called DENSE (Dense and ENriched Subgraph Enumeration) that can take as input a biologist’s prior knowledge as a set of query proteins and identify all the dense functional modules in a biological network that contain some part of the query vertices. The density (in terms of the number of network edges) and the enrichment (the number of query proteins in the resulting functional module) can be manipulated via two parameters g and μ, respectively. Conclusion: This algorithm has been applied to the protein functional association network of Clostridium acetobutylicum ATCC 824, a hydrogen-producing, acid-tolerant organism. The algorithm was able to verify relationships known to exist in the literature and also some previously unknown relationships, including those with regulatory and signaling functions. Additionally, we were able to hypothesize that some uncharacterized proteins are likely associated with the target phenotype. The DENSE code can be downloaded from http://www.freescience.org/cs/DENSE/},
doi = {10.1186/1752-0509-5-172},
journal = {BMC Systems Biology},
number = 1,
volume = 5,
place = {United States},
year = {2011},
month = jan
}
Works referenced in this record:
Redirection of Metabolism for Biological Hydrogen Production
journal, January 2007
- Rey, F. E.; Heiniger, E. K.; Harwood, C. S.
- Applied and Environmental Microbiology, Vol. 73, Issue 5
The Complex Between Hydrogenase-maturation Proteins HypC and HypD is an Intermediate in the Supply of Cyanide to the Active Site Iron of [NiFe]-Hydrogenases
journal, November 2004
- Blokesch, Melanie; Albracht, Simon P. J.; Matzanke, Berthold F.
- Journal of Molecular Biology, Vol. 344, Issue 1, p. 155-167
Engineering of a synthetic hydF–hydE–hydG–hydA operon for biohydrogen production
journal, February 2008
- Akhtar, M. Kalim; Jones, Patrik R.
- Analytical Biochemistry, Vol. 373, Issue 1
A method of matrix analysis of group structure
journal, June 1949
- Luce, R. Duncan; Perry, Albert D.
- Psychometrika, Vol. 14, Issue 2
Topological structure analysis of the protein-protein interaction network in budding yeast
journal, May 2003
- Bu, D.
- Nucleic Acids Research, Vol. 31, Issue 9
A graph‐theoretic generalization of the clique concept
journal, January 1978
- Seidman, Stephen B.; Foster, Brian L.
- The Journal of Mathematical Sociology, Vol. 6, Issue 1
Dense subgraph computation via stochastic search: application to detect transcriptional modules
journal, July 2006
- Everett, L.; Wang, L. -S.; Hannenhalli, S.
- Bioinformatics, Vol. 22, Issue 14
Molecular characterization of the genes encoding pyruvate formate-lyase and its activating enzyme of Clostridium pasteurianum.
journal, January 1996
- Weidner, G.; Sawers, G.
- Journal of bacteriology, Vol. 178, Issue 8
Biological hydrogen production by Clostridium acetobutylicum in an unsaturated flow reactor
journal, February 2006
- Zhang, Husen; Bruns, Mary Ann; Logan, Bruce E.
- Water Research, Vol. 40, Issue 4
Phosphotransbutyrylase from Clostridium acetobutylicum ATCC 824 and its role in acidogenesis.
journal, January 1989
- Wiesenborn, D. P.; Rudolph, F. B.; Papoutsakis, E. T.
- Applied and Environmental Microbiology, Vol. 55, Issue 2
Extended clique initialisation in examination timetabling
journal, May 2001
- Carter, M. W.; Johnson, D. G.
- Journal of the Operational Research Society, Vol. 52, Issue 5
An in silico method for detecting overlapping functional modules from composite biological networks
journal, November 2008
- Maraziotis, Ioannis A.; Dimitrakopoulou, Konstantina; Bezerianos, Anastasios
- BMC Systems Biology, Vol. 2, Issue 1
Intermediary Metabolism in Clostridium acetobutylicum : Levels of Enzymes Involved in the Formation of Acetate and Butyrate
journal, June 1984
- Hartmanis, Maris G. N.; Gatenbeck, Sten
- Applied and Environmental Microbiology, Vol. 47, Issue 6
Low-complexity fuzzy relational clustering algorithms for Web mining
journal, January 2001
- Krishnapuram, R.; Joshi, A.; Nasraoui, O.
- IEEE Transactions on Fuzzy Systems, Vol. 9, Issue 4
Core and periphery structures in protein interaction networks
journal, April 2009
- Luo, Feng; Li, Bo; Wan, Xiu-Feng
- BMC Bioinformatics, Vol. 10, Issue S4
Genome-scale reconstruction and in silico analysis of the Clostridium acetobutylicum ATCC 824 metabolic network
journal, August 2008
- Lee, Joungmin; Yun, Hongseok; Feist, Adam M.
- Applied Microbiology and Biotechnology, Vol. 80, Issue 5
Novel pathways for biosynthesis of nucleotide-activated glycero-manno-heptose precursors of bacterial glycoproteins and cell surface polysaccharides
journal, July 2002
- Messner, Paul; Kosma, Paul; Valvano, Miguel A.
- Microbiology, Vol. 148, Issue 7
Crystal Structures of Hydrogenase Maturation Protein HypE in the Apo and ATP-bound Forms
journal, September 2007
- Shomura, Yasuhito; Komori, Hirofumi; Miyabe, Natsuko
- Journal of Molecular Biology, Vol. 372, Issue 4
Prediction of functional modules based on comparative genome analysis and Gene Ontology application
journal, May 2005
- Wu, H.
- Nucleic Acids Research, Vol. 33, Issue 9
STRING 8--a global view on proteins and their functional interactions in 630 organisms
journal, January 2009
- Jensen, L. J.; Kuhn, M.; Stark, M.
- Nucleic Acids Research, Vol. 37, Issue Database
Out-of-core coherent closed quasi-clique mining from large dense graph databases
journal, June 2007
- Zeng, Zhiping; Wang, Jianyong; Zhou, Lizhu
- ACM Transactions on Database Systems, Vol. 32, Issue 2
Cross-talk Between Iron and Nitrogen Regulatory Networks in Anabaena (Nostoc) sp. PCC 7120: Identification of Overlapping Genes in FurA and NtcA Regulons
journal, November 2007
- López-Gomollón, Sara; Hernández, José A.; Pellicer, Silvia
- Journal of Molecular Biology, Vol. 374, Issue 1
Natural Document Clustering by Clique Percolation in Random Graphs
book, January 2006
- Gao, Wei; Wong, Kam-Fai
- Information Retrieval Technology
Classification and phylogeny of hydrogenases
journal, August 2001
- Vignais, Paulette M.; Billoud, Bernard; Meyer, Jacques
- FEMS Microbiology Reviews, Vol. 25, Issue 4, p. 455-501
Detecting functional modules in the yeast protein–protein interaction network
journal, July 2006
- Chen, Jingchun; Yuan, Bo
- Bioinformatics, Vol. 22, Issue 18
Cross-talk between the L-sorbose and D-sorbitol (D-glucitol) metabolic pathways in Lactobacillus casei
journal, August 2002
- Yebra, María J.; Pérez-Martínez, Gaspar
- Microbiology, Vol. 148, Issue 8
Metabolic pathway engineering for enhanced biohydrogen production
journal, September 2009
- Mathews, Juanita; Wang, Guangyi
- International Journal of Hydrogen Energy, Vol. 34, Issue 17
Nutritional Factors Affecting the Ratio of Solvents Produced by Clostridium acetobutylicum
journal, January 1986
- Bahl, H.; Gottwald, M.; Kuhn, A.
- Applied and Environmental Microbiology, Vol. 52, Issue 1
Metabolite stress and tolerance in the production of biofuels and chemicals: Gene-expression-based systems analysis of butanol, butyrate, and acetate stresses in the anaerobe Clostridium acetobutylicum
journal, January 2010
- Alsaker, Keith V.; Paredes, Carlos; Papoutsakis, Eleftherios T.
- Biotechnology and Bioengineering
Phosphoheptose Isomerase, First Enzyme in the Biosynthesis of Aldoheptose in Salmonella typhimurium
journal, September 1974
- Eidels, Leon; Osborn, M. J.
- Journal of Biological Chemistry, Vol. 249, Issue 17
An Algorithm for the Discovery of Phenotype Related Metabolic Pathways
conference, November 2009
- Schmidt, Matthew C.; Samatova, Nagiza F.
- 2009 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Consensus Clustering for Detection of Overlapping Clusters in Microarray Data
conference, December 2006
- Deodhar, Meghana; Ghosh, Joydeep
- Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06)
Succession of the Bacterial Community and Dynamics of Hydrogen Producers in a Hydrogen-Producing Bioreactor
journal, March 2010
- Huang, Yue; Zong, Wenming; Yan, Xing
- Applied and Environmental Microbiology, Vol. 76, Issue 10
Acid- and Base-Induced Proteins during Aerobic and Anaerobic Growth of Escherichia coli Revealed by Two-Dimensional Gel Electrophoresis
journal, April 1999
- Blankenhorn, Darcy; Phillips, Judith; Slonczewski, Joan L.
- Journal of Bacteriology, Vol. 181, Issue 7
Adaptive Acid Tolerance Response of Streptococcus sobrinus
journal, October 2004
- Nascimento, Marcelle M.; Lemos, José A. C.; Abranches, Jacqueline
- Journal of Bacteriology, Vol. 186, Issue 19
Protein subcellular localization prediction of eukaryotes using a knowledge-based approach
journal, December 2009
- Lin, Hsin-Nan; Chen, Ching-Tai; Sung, Ting-Yi
- BMC Bioinformatics, Vol. 10, Issue S15
The Evolution of Random Graphs
journal, November 1984
- Bollobas, Bela
- Transactions of the American Mathematical Society, Vol. 286, Issue 1
STRING 8--a global view on proteins and their functional interactions in 630 organisms
text, January 2009
- Jensen, L. J.; Kuhn, M.; Stark, M.
- Oxford University Press
Works referencing / citing this record:
Quantitative assessment of gene expression network module-validation methods
journal, October 2015
- Li, Bing; Zhang, Yingying; Yu, Yanan
- Scientific Reports, Vol. 5, Issue 1
Complex biomarker discovery in neuroimaging data: Finding a needle in a haystack
journal, January 2013
- Atluri, Gowtham; Padmanabhan, Kanchana; Fang, Gang
- NeuroImage: Clinical, Vol. 3
In-silico identification of phenotype-biased functional modules
journal, June 2012
- Padmanabhan, Kanchana; Wilson, Kevin; Rocha, Andrea M.
- Proteome Science, Vol. 10, Issue S1
Spice: discovery of phenotype-determining component interplays
journal, May 2012
- Chen, Zhengzhang; Padmanabhan, Kanchana; Rocha, Andrea M.
- BMC Systems Biology, Vol. 6, Issue 1