DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Covering complete proteomes with X-ray structures: A current snapshot

Abstract

Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.

Authors:
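 Mizianty, Marcin J. [1]; Fan, Xiao [1]; Yan, Jing [1]; Chalmers, Eric [1]; Woloschuk, Christopher [1]; Joachimiak, Andrzej [2]; Kurgan, Lukasz [1]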
  1. University of Alberta, Edmonton, Alberta (Canada). Electrical and Computer Engineering.
  2. Argonne National Lab. (ANL), Argonne, IL (United States). Midwest Center for Structural Genomics.
Publication Date:
2014-10-23
Research Org.:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
OSTI Identifier:
1212766
Grant/Contract Number:  
AC02-06CH11357
Resource Type:
Accepted Manuscript
Journal Name:
Acta Crystallographica. Section D: Biological Crystallography (Online)
Additional Journal Information:
Journal Name: Acta Crystallographica. Section D: Biological Crystallography (Online); Journal Volume: 70; Journal Issue: 11; Journal ID: ISSN 1399-0047
Publisher:
International Union of Crystallography
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 75 CONDENSED MATTER PHYSICS, SUPERCONDUCTIVITY AND SUPERFLUIDITY; crystallization propensity; proteome coverage; fDETECT

Citation Formats

Mizianty, Marcin J., Fan, Xiao, Yan, Jing, Chalmers, Eric, Woloschuk, Christopher, Joachimiak, Andrzej, and Kurgan, Lukasz. Covering complete proteomes with X-ray structures: A current snapshot. United States: N. p., 2014. Web. doi:10.1107/S1399004714019427.
Mizianty, Marcin J., Fan, Xiao, Yan, Jing, Chalmers, Eric, Woloschuk, Christopher, Joachimiak, Andrzej, & Kurgan, Lukasz. Covering complete proteomes with X-ray structures: A current snapshot. United States. https://doi.org/10.1107/S1399004714019427
Mizianty, Marcin J., Fan, Xiao, Yan, Jing, Chalmers, Eric, Woloschuk, Christopher, Joachimiak, Andrzej, and Kurgan, Lukasz. 2014. "Covering complete proteomes with X-ray structures: A current snapshot". United States. https://doi.org/10.1107/S1399004714019427. https://www.osti.gov/servlets/purl/1212766.
@article{osti_1212766,
title = {Covering complete proteomes with X-ray structures: A current snapshot},
author = {Mizianty, Marcin J. and Fan, Xiao and Yan, Jing and Chalmers, Eric and Woloschuk, Christopher and Joachimiak, Andrzej and Kurgan, Lukasz},
abstractNote = {Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.},
doi = {10.1107/S1399004714019427},
journal = {Acta Crystallographica. Section D: Biological Crystallography (Online)},
number = 11,
volume = 70,
place = {United States},
year = {2014},
month = oct
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 27 works
Citation information provided by
Web of Science
