Covering complete proteomes with X-ray structures: A current snapshot
Abstract
Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.
- Authors:
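- Mizianty, Marcin J.; Fan, Xiao; Yan, Jing; Chalmers, Eric; Woloschuk, Christopher; Joachimiak, Andrzej; Kurgan, Lukasz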
- University of Alberta, Edmonton, Alberta (Canada). Electrical and Computer Engineering.
- Argonne National Lab. (ANL), Argonne, IL (United States). Midwest Center for Structural Genomics.
- Publication Date:
- 2014-10-23
- Research Org.:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Biological and Environmental Research (BER)
- OSTI Identifier:
- 1212766
- Grant/Contract Number:
- AC02-06CH11357
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Acta Crystallographica. Section D: Biological Crystallography (Online)
- Additional Journal Information:
- Journal Name: Acta Crystallographica. Section D: Biological Crystallography (Online); Journal Volume: 70; Journal Issue: 11; Journal ID: ISSN 1399-0047
- Publisher:
- International Union of Crystallography
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; 75 CONDENSED MATTER PHYSICS, SUPERCONDUCTIVITY AND SUPERFLUIDITY; crystallization propensity; proteome coverage; fDETECT
Citation Formats
Mizianty, Marcin J., Fan, Xiao, Yan, Jing, Chalmers, Eric, Woloschuk, Christopher, Joachimiak, Andrzej, and Kurgan, Lukasz. Covering complete proteomes with X-ray structures: A current snapshot. United States: N. p., 2014. Web. doi:10.1107/S1399004714019427.
Mizianty, Marcin J., Fan, Xiao, Yan, Jing, Chalmers, Eric, Woloschuk, Christopher, Joachimiak, Andrzej, & Kurgan, Lukasz. Covering complete proteomes with X-ray structures: A current snapshot. United States. https://doi.org/10.1107/S1399004714019427
Mizianty, Marcin J., Fan, Xiao, Yan, Jing, Chalmers, Eric, Woloschuk, Christopher, Joachimiak, Andrzej, and Kurgan, Lukasz. 2014. "Covering complete proteomes with X-ray structures: A current snapshot". United States. https://doi.org/10.1107/S1399004714019427. https://www.osti.gov/servlets/purl/1212766.
@article{osti_1212766,
title = {Covering complete proteomes with X-ray structures: A current snapshot},
author = {Mizianty, Marcin J. and Fan, Xiao and Yan, Jing and Chalmers, Eric and Woloschuk, Christopher and Joachimiak, Andrzej and Kurgan, Lukasz},
abstractNote = {Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.},
doi = {10.1107/S1399004714019427},
journal = {Acta Crystallographica. Section D: Biological Crystallography (Online)},
number = 11,
volume = 70,
place = {United States},
year = {2014},
month = {10}
}
Works referenced in this record:
Domain-Based and Family-Specific Sequence Identity Thresholds Increase the Levels of Reliable Protein Function Transfer
journal, March 2009
- Addou, Sarah; Rentzsch, Robert; Lee, David
- Journal of Molecular Biology, Vol. 387, Issue 2
Gene Ontology: tool for the unification of biology
journal, May 2000
- Ashburner, Michael; Ball, Catherine A.; Blake, Judith A.
- Nature Genetics, Vol. 25, Issue 1
Predicting protein crystallization propensity from protein sequence
journal, February 2010
- Babnigg, György; Joachimiak, Andrzej
- Journal of Structural and Functional Genomics, Vol. 11, Issue 1
Protein Structure Prediction and Structural Genomics
journal, October 2001
- Baker, D.
- Science, Vol. 294, Issue 5540
A public resource facilitating clinical use of genomes
journal, July 2012
- Ball, Madeleine P.; Thakuria, Joseph V.; Zaranek, Alexander Wait
- Proceedings of the National Academy of Sciences, Vol. 109, Issue 30
The Protein Data Bank at 40: Reflecting on the Past to Prepare for the Future
journal, March 2012
- Berman, Helen M.; Kleywegt, Gerard J.; Nakamura, Haruki
- Structure, Vol. 20, Issue 3
The protein structure initiative structural genomics knowledgebase
journal, January 2009
- Berman, H. M.; Westbrook, J. D.; Gabanyi, M. J.
- Nucleic Acids Research, Vol. 37, Issue Database
Protein Biophysical Properties that Correlate with Crystallization Success in Thermotoga maritima: Maximum Clustering Strategy for Structural Genomics
journal, December 2004
- Canaves, Jaume M.; Page, Rebecca; Wilson, Ian A.
- Journal of Molecular Biology, Vol. 344, Issue 4
Target selection and deselection at the Berkeley Structural Genomics Center
journal, November 2005
- Chandonia, John-Marc; Kim, Sung-Hou; Brenner, Steven E.
- Proteins: Structure, Function, and Bioinformatics, Vol. 62, Issue 2
Structural Systems Biology Evaluation of Metabolic Thermotolerance in Escherichia coli
journal, June 2013
- Chang, R. L.; Andrews, K.; Kim, D.
- Science, Vol. 340, Issue 6137, p. 1220-1223
SCMCRYS: Predicting Protein Crystallization Using an Ensemble Scoring Card Method with Estimating Propensity Scores of P-Collocated Amino Acid Pairs
journal, September 2013
- Charoenkwan, Phasit; Shoombuatong, Watshara; Lee, Hua-Chin
- PLoS ONE, Vol. 8, Issue 9
Prediction of protein crystallization using collocation of amino acid pairs
journal, April 2007
- Chen, Ke; Kurgan, Lukasz; Rahbari, Mandana
- Biochemical and Biophysical Research Communications, Vol. 355, Issue 3
TargetDB: a target registration database for structural genomics projects
journal, May 2004
- Chen, L.; Oughtred, R.; Berman, H. M.
- Bioinformatics, Vol. 20, Issue 16
Structural proteomics of an archaeon
journal, October 2000
- Edwards, Aled M.; Arrowsmith, Cheryl H.; Christendat, Dinesh
- Nature Structural Biology, Vol. 7, Issue 10, p. 903-909
IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content
journal, June 2005
- Dosztanyi, Z.; Csizmok, V.; Tompa, P.
- Bioinformatics, Vol. 21, Issue 16
Search and clustering orders of magnitude faster than BLAST
journal, August 2010
- Edgar, Robert C.
- Bioinformatics, Vol. 26, Issue 19, p. 2460-2461
Accessing complex crop genomes with next-generation sequencing
journal, September 2012
- Edwards, David; Batley, Jacqueline; Snowdon, Rod J.
- Theoretical and Applied Genetics, Vol. 126, Issue 1
The Structural Biology Knowledgebase: a portal to protein structures, sequences, functions, and methods
journal, April 2011
- Gabanyi, Margaret J.; Adams, Paul D.; Arnold, Konstantin
- Journal of Structural and Functional Genomics, Vol. 12, Issue 2
The NCBI BioSystems database
journal, October 2009
- Geer, Lewis Y.; Marchler-Bauer, Aron; Geer, Renata C.
- Nucleic Acids Research, Vol. 38, Issue suppl_1
Comparative modeling for protein structure prediction
journal, April 2006
- Ginalski, Krzysztof
- Current Opinion in Structural Biology, Vol. 16, Issue 2
Mining the Structural Genomics Pipeline: Identification of Protein Properties that Affect High-throughput Experimental Analysis
journal, February 2004
- Goh, Chern-Sing; Lan, Ning; Douglas, Shawn M.
- Journal of Molecular Biology, Vol. 336, Issue 1
Assessing the accuracy of template-based structure prediction metaservers by comparison with structural genomics structures
journal, October 2012
- Gront, Dominik; Grabowski, Marek; Zimmerman, Matthew D.
- Journal of Structural and Functional Genomics, Vol. 13, Issue 4
Whither structural biology?
journal, January 2004
- Harrison, Stephen C.
- Nature Structural & Molecular Biology, Vol. 11, Issue 1
Improving the chances of successful protein structure determination with a random forest classifier
journal, February 2014
- Jahandideh, Samad; Jaroszewski, Lukasz; Godzik, Adam
- Acta Crystallographica Section D Biological Crystallography, Vol. 70, Issue 3
High-throughput crystallography for structural genomics
journal, October 2009
- Joachimiak, Andrzej
- Current Opinion in Structural Biology, Vol. 19, Issue 5
SVMCRYS: An SVM Approach for the Prediction of Protein Crystallization Propensity from Protein Sequence
journal, April 2010
- Kandaswamy, Krishna; Pugalenthi, Ganesan; Suganthan, P. N.
- Protein & Peptide Letters, Vol. 17, Issue 4
Distributions of pI versus pH provide prior information for the design of crystallization screening experiments: response to comment on 'Protein isoelectric point as a predictor for increased crystallization screening efficiency'
journal, August 2004
- Kantardjieff, K. A.; Jamshidian, M.; Rupp, B.
- Bioinformatics, Vol. 20, Issue 14
Protein isoelectric point as a predictor for increased crystallization screening efficiency
journal, February 2004
- Kantardjieff, K. A.; Rupp, B.
- Bioinformatics, Vol. 20, Issue 14
AAindex: amino acid index database, progress report 2008
journal, December 2007
- Kawashima, S.; Pokarowski, P.; Pokarowska, M.
- Nucleic Acids Research, Vol. 36, Issue Database
On the Universe of Protein Folds
journal, May 2013
- Kolodny, Rachel; Pereyaslavets, Leonid; Samson, Abraham O.
- Annual Review of Biophysics, Vol. 42, Issue 1
The structure of the protein universe and genome evolution
journal, November 2002
- Koonin, Eugene V.; Wolf, Yuri I.; Karev, Georgy P.
- Nature, Vol. 420, Issue 6912
The RCSB PDB information portal for structural genomics
journal, January 2006
- Kouranov, A.
- Nucleic Acids Research, Vol. 34, Issue 90001
Structural genomics of the Thermotoga maritima proteome implemented in a high-throughput structure determination pipeline
journal, August 2002
- Lesley, S. A.; Kuhn, P.; Godzik, A.
- Proceedings of the National Academy of Sciences, Vol. 99, Issue 18
Growth of novel protein structural data
journal, February 2007
- Levitt, M.
- Proceedings of the National Academy of Sciences, Vol. 104, Issue 9
Nature of the protein universe
journal, June 2009
- Levitt, M.
- Proceedings of the National Academy of Sciences, Vol. 106, Issue 27
Target space for structural genomics revisited
journal, July 2002
- Liu, J.; Rost, B.
- Bioinformatics, Vol. 18, Issue 7
An Overview on GPCRs and Drug Discovery: Structure-Based Drug Design and Structural Biology on GPCRs
book, January 2009
- Lundstrom, Kenneth
- Methods in Molecular Biology
Meta prediction of protein crystallization propensity
journal, December 2009
- Mizianty, Marcin J.; Kurgan, Lukasz
- Biochemical and Biophysical Research Communications, Vol. 390, Issue 1
Sequence-based prediction of protein crystallization, purification and production propensity
journal, June 2011
- Mizianty, Marcin J.; Kurgan, Lukasz
- Bioinformatics, Vol. 27, Issue 13
CRYSpred: Accurate Sequence-Based Protein Crystallization Propensity Prediction Using Sequence-Derived Structural Characteristics
journal, January 2012
- Mizianty, Marcin J.; Kurgan, Lukasz A.
- Protein & Peptide Letters, Vol. 19, Issue 1
Structural genomics is the largest contributor of novel structural leverage
journal, February 2009
- Nair, Rajesh; Liu, Jinfeng; Soong, Ta-Tsen
- Journal of Structural and Functional Genomics, Vol. 10, Issue 2
Addressing the intrinsic disorder bottleneck in structural proteomics
journal, March 2005
- Oldfield, Christopher J.; Ulrich, Eldon L.; Cheng, Yugong
- Proteins: Structure, Function, and Bioinformatics, Vol. 59, Issue 3
Utilization of protein intrinsic disorder knowledge in structural proteomics
journal, February 2013
- Oldfield, Christopher J.; Xue, Bin; Van, Ya-Yue
- Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics, Vol. 1834, Issue 2
A normalised scale for structural genomics target ranking: The OB-Score
journal, June 2006
- Overton, Ian M.; Barton, Geoffrey J.
- FEBS Letters, Vol. 580, Issue 16
ParCrys: a Parzen window density estimation approach to protein crystallization propensity prediction
journal, February 2008
- Overton, Ian M.; Padovani, Gianandrea; Girolami, Mark A.
- Bioinformatics, Vol. 24, Issue 7
XANNpred: Neural nets that predict the propensity of a protein to yield diffraction-quality crystals
journal, January 2011
- Overton, Ian M.; van Niekerk, C. A. Johannes; Barton, Geoffrey J.
- Proteins: Structure, Function, and Bioinformatics, Vol. 79, Issue 4
Coordinating the impact of structural genomics on the human α-helical transmembrane proteome
journal, February 2013
- Pieper, Ursula; Schlessinger, Avner; Kloppmann, Edda
- Nature Structural & Molecular Biology, Vol. 20, Issue 2
Understanding the physical properties that control protein crystallization by analysis of large-scale experimental data
journal, December 2008
- Price II, W. Nicholson; Chen, Yang; Handelman, Samuel K.
- Nature Biotechnology, Vol. 27, Issue 1
Making decisions for structural genomics
journal, January 2003
- Rodrigues, A.
- Briefings in Bioinformatics, Vol. 4, Issue 2
Receptor systems: Will mining the receptorome yield novel targets for pharmacotherapy?
journal, October 2005
- Roth, Bryan L.
- Pharmacology & Therapeutics, Vol. 108, Issue 1
A moving story of receptors
journal, September 2008
- Schwartz, Thue W.; Hubbell, Wayne L.
- Nature, Vol. 455, Issue 7212
The challenge of protein structure determination-lessons from structural genomics
journal, November 2007
- Slabinski, Lukasz; Jaroszewski, Lukasz; Rodrigues, Ana P. C.
- Protein Science, Vol. 16, Issue 11
XtalPred: a web server for prediction of protein crystallizability
journal, October 2007
- Slabinski, Lukasz; Jaroszewski, Lukasz; Rychlewski, Leszek
- Bioinformatics, Vol. 23, Issue 24
Will my protein crystallize? A sequence-based predictor
journal, November 2005
- Smialowski, Pawel; Schmidt, Thorsten; Cox, Jürgen
- Proteins: Structure, Function, and Bioinformatics, Vol. 62, Issue 2
Completeness in structural genomics
journal, June 2001
- Vitkup, Dennis; Melamud, Eugene; Moult, John
- Nature Structural Biology, Vol. 8, Issue 6, p. 559-566
Prediction and Functional Analysis of Native Disorder in Proteins from the Three Kingdoms of Life
journal, March 2004
- Ward, J. J.; Sodhi, J. S.; McGuffin, L. J.
- Journal of Molecular Biology, Vol. 337, Issue 3
Making the most of affinity tags
journal, June 2005
- Waugh, David S.
- Trends in Biotechnology, Vol. 23, Issue 6
Insights from Genomics into Bacterial Pathogen Populations
journal, September 2012
- Wilson, Daniel J.
- PLoS Pathogens, Vol. 8, Issue 9
Estimating the number of protein folds and families from complete genome data
journal, June 2000
- Wolf, Yuri I.; Grishin, Nick V.; Koonin, Eugene V.
- Journal of Molecular Biology, Vol. 299, Issue 4
Statistics of local complexity in amino acid sequences and sequence databases
journal, June 1993
- Wootton, John C.; Federhen, Scott
- Computers & Chemistry, Vol. 17, Issue 2, p. 149-163
Ab Initio structure prediction for Escherichia coli: towards genome-wide protein structure modeling and fold assignment
journal, May 2013
- Xu, Dong; Zhang, Yang
- Scientific Reports, Vol. 3, Issue 1
Orderly order in protein intrinsic disorder distribution: disorder in 3500 proteomes from viruses and the three domains of life
journal, June 2012
- Xue, Bin; Dunker, A. Keith; Uversky, Vladimir N.
- Journal of Biomolecular Structure and Dynamics, Vol. 30, Issue 2
Viral Disorder or Disordered Viruses: Do Viral Proteins Possess Unique Features?
journal, August 2010
- Xue, Bin; Williams, Robert W.; Oldfield, Christopher J.
- Protein & Peptide Letters, Vol. 17, Issue 8
Structure-based prediction of protein–protein interactions on a genome-wide scale
journal, September 2012
- Zhang, Qiangfeng Cliff; Petrey, Donald; Deng, Lei
- Nature, Vol. 490, Issue 7421
Works referencing / citing this record:
Taxonomic Landscape of the Dark Proteomes: Whole-Proteome Scale Interplay Between Structural Darkness, Intrinsic Disorder, and Crystallization Propensity
journal, October 2018
- Hu, Gang; Wang, Kui; Song, Jiangning
- PROTEOMICS, Vol. 18, Issue 21-22
DeepFunc: A Deep Learning Framework for Accurate Prediction of Protein Functions from Protein Sequences and Interactions
journal, May 2019
- Zhang, Fuhao; Song, Hong; Zeng, Min
- PROTEOMICS, Vol. 19, Issue 12
Structural and functional analysis of “non-smelly” proteins
journal, September 2019
- Yan, Jing; Cheng, Jianlin; Kurgan, Lukasz
- Cellular and Molecular Life Sciences, Vol. 77, Issue 12
Review and comparative assessment of similarity-based methods for prediction of drug–protein interactions in the druggable human proteome
journal, August 2018
- Wang, Chen; Kurgan, Lukasz
- Briefings in Bioinformatics, Vol. 20, Issue 6