Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins
Abstract
The availability of complete genome sequences makes it important to develop approaches for classifying large-scale data sets and for making the extraction of biological insights easier. Here, we propose an approach for classifying complete proteomes and protein sets based on the distributions of their proteins over some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein length (L) and protein intrinsic disorder content (ID). The protein distributions based on L and ID are surveyed for the proteomes of representative organisms and for protein sets from the three domains of life. Two-dimensional maps (designated here as fingerprints) are then constructed from the protein distribution densities in the LD space defined by ln(L) and ID. The fingerprints for different organisms and protein sets are found to be distinct from each other, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of the proteomes of organisms, without performing any protein sequence comparisons or alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level.
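The fingerprint idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bin counts, the ln(L) range, the choice of Euclidean distance, and the function names `fingerprint`, `fingerprint_distance`, and `distance_matrix` are all assumptions made for illustration. Each protein is taken to be a (length, ID) pair, with ID given as a fraction between 0 and 1.

```python
import math


def fingerprint(proteins, n_len_bins=20, n_id_bins=10, ln_len_range=(3.0, 9.0)):
    """Build a normalized 2D density grid over (ln(L), ID).

    proteins: iterable of (length, disorder_fraction) pairs, with
    length > 0 and disorder_fraction in [0, 1].
    The grid size and ln(L) range are illustrative assumptions.
    """
    lo, hi = ln_len_range
    grid = [[0.0] * n_id_bins for _ in range(n_len_bins)]
    count = 0
    for length, disorder in proteins:
        # Map ln(L) and ID onto bin indices, clamping values outside the grid.
        x = (math.log(length) - lo) / (hi - lo)
        i = min(max(int(x * n_len_bins), 0), n_len_bins - 1)
        j = min(max(int(disorder * n_id_bins), 0), n_id_bins - 1)
        grid[i][j] += 1.0
        count += 1
    if count == 0:
        raise ValueError("at least one protein is required")
    # Normalize so that the cells sum to 1 (a distribution density).
    return [[cell / count for cell in row] for row in grid]


def fingerprint_distance(a, b):
    """Euclidean distance between two fingerprints built on the same grid.

    The metric is an assumption; the abstract does not specify one.
    """
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def distance_matrix(fingerprints):
    """Pairwise distances between fingerprints, as a symmetric matrix."""
    n = len(fingerprints)
    return [
        [fingerprint_distance(fingerprints[i], fingerprints[j]) for j in range(n)]
        for i in range(n)
    ]
```

A matrix produced this way could, in principle, be passed to any distance-based tree-building method (neighbor joining, for example), which is one way to obtain a tree without sequence alignment.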
- Authors:
- Guo, Hao-Bo; Ma, Yue; Tuskan, Gerald A.; Yang, Xiaohan; Guo, Hong
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Publication Date:
- Research Org.:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Biological and Environmental Research (BER)
- OSTI Identifier:
- 1423699
- Alternate Identifier(s):
- OSTI ID: 1468038
- Grant/Contract Number:
- SC0008834; AC05-00OR22725
- Resource Type:
- Published Article
- Journal Name:
- International Journal of Genomics
- Additional Journal Information:
- Journal Name: International Journal of Genomics; Journal Volume: 2018; Journal ID: ISSN 2314-436X
- Publisher:
- Hindawi Publishing Corporation
- Country of Publication:
- Egypt
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES
Citation Formats
Guo, Hao-Bo, Ma, Yue, Tuskan, Gerald A., Yang, Xiaohan, and Guo, Hong. Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins. Egypt: N. p., 2018. Web. doi:10.1155/2018/9784161.
Guo, Hao-Bo, Ma, Yue, Tuskan, Gerald A., Yang, Xiaohan, & Guo, Hong (2018). Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins. Egypt. https://doi.org/10.1155/2018/9784161
Guo, Hao-Bo, Ma, Yue, Tuskan, Gerald A., Yang, Xiaohan, and Guo, Hong. 2018. "Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins". Egypt. https://doi.org/10.1155/2018/9784161.
@article{osti_1423699,
title = {Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins},
author = {Guo, Hao-Bo and Ma, Yue and Tuskan, Gerald A. and Yang, Xiaohan and Guo, Hong},
abstractNote = {The availability of complete genome sequences makes it important to develop approaches for classifying large-scale data sets and for making the extraction of biological insights easier. Here, we propose an approach for classifying complete proteomes and protein sets based on the distributions of their proteins over some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein length (L) and protein intrinsic disorder content (ID). The protein distributions based on L and ID are surveyed for the proteomes of representative organisms and for protein sets from the three domains of life. Two-dimensional maps (designated here as fingerprints) are then constructed from the protein distribution densities in the LD space defined by ln(L) and ID. The fingerprints for different organisms and protein sets are found to be distinct from each other, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of the proteomes of organisms, without performing any protein sequence comparisons or alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level.},
doi = {10.1155/2018/9784161},
journal = {International Journal of Genomics},
volume = 2018,
place = {Egypt},
year = {2018},
month = jan
}
https://doi.org/10.1155/2018/9784161
Works referencing / citing this record:
A Suggestion of Converting Protein Intrinsic Disorder to Structural Entropy Using Shannon’s Information Theory
journal, June 2019
- Guo, Hao-Bo; Ma, Yue; Tuskan, Gerald
- Entropy, Vol. 21, Issue 6