Title: Covering complete proteomes with X-ray structures: A current snapshot

Journal Article · Acta Crystallographica. Section D: Biological Crystallography (Online)
 [1];  [1];  [1];  [1];  [1];  [2];  [1]
  1. University of Alberta, Edmonton, Alberta (Canada). Electrical and Computer Engineering.
  2. Argonne National Lab. (ANL), Argonne, IL (United States). Midwest Center for Structural Genomics.

Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic know-how combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with the coverage of viruses following that of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were identified.
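
As a rough illustration of the coverage figures quoted in the abstract, the sketch below counts a modeling family as covered when at least one of its member proteins has a predicted crystallization propensity at or above a cutoff. This is an assumption made for illustration only: the record does not state how the authors compute coverage, and the function name `family_coverage`, the 0.5 cutoff, and the toy scores are all hypothetical.

```python
from typing import Dict, List


def family_coverage(families: Dict[str, List[float]], threshold: float = 0.5) -> float:
    """Return the fraction of families with at least one member at or above threshold.

    `families` maps a (hypothetical) modeling-family identifier to the predicted
    crystallization propensity scores of its member proteins.
    """
    if not families:
        return 0.0
    covered = sum(
        1 for scores in families.values()
        if any(score >= threshold for score in scores)
    )
    return covered / len(families)


# Toy data, not taken from the paper: four families with made-up scores.
toy_families = {
    "family_a": [0.9, 0.2],
    "family_b": [0.1, 0.3],
    "family_c": [0.6],
    "family_d": [0.4],
}

coverage = family_coverage(toy_families)
print(f"coverage: {coverage:.0%}")
```

Under this reading, a single crystallizable member is enough to cover a whole family, because homology modeling would extend that one structure to the rest of the cluster.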

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1212766
Journal Information:
Acta Crystallographica. Section D: Biological Crystallography (Online), Vol. 70, Issue 11; ISSN 1399-0047
Publisher:
International Union of Crystallography
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 27 works
Citation information provided by
Web of Science
