OSTI.GOV: U.S. Department of Energy
Office of Scientific and Technical Information

Title: The Protein Data Bank archive as an open data resource

Abstract

The Protein Data Bank archive was established in 1971, and recently celebrated its 40th anniversary (Berman et al. in Structure 20:391, 2012). Here, an analysis of interrelationships of the science, technology and community leads to further insights into how this resource evolved into one of the oldest and most widely used open-access data resources in biology.

Authors:
Berman, Helen M. [1]; Kleywegt, Gerard J. [2]; Nakamura, Haruki [3]; Markley, John L. [4]
  1. Rutgers Univ., Piscataway, NJ (United States). Center for Integrative Proteomics Research, Dept. of Chemistry and Chemical Biology
  2. European Bioinformatics Inst., Cambridge (United Kingdom). European Molecular Biology Lab.
  3. Osaka Univ. (Japan). Inst. for Protein Research
  4. Univ. of Wisconsin, Madison, WI (United States). Dept. of Biochemistry
Publication Date:
2014-07-26
Research Org.:
Rutgers Univ., Piscataway, NJ (United States); National Science Foundation (NSF), Arlington, VA (United States)
Sponsoring Org.:
USDOE; National Science Foundation (NSF)
OSTI Identifier:
1354849
Grant/Contract Number:
SC0008434; DBI-1338415
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Journal of Computer-Aided Molecular Design
Additional Journal Information:
Journal Volume: 28; Journal Issue: 10; Journal ID: ISSN 0920-654X
Publisher:
Springer
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Protein Data Bank; Protein structure; Biomacromolecules; Data archive

Citation Formats

Berman, Helen M., Kleywegt, Gerard J., Nakamura, Haruki, and Markley, John L. The Protein Data Bank archive as an open data resource. United States: N. p., 2014. Web. doi:10.1007/s10822-014-9770-y.
Berman, Helen M., Kleywegt, Gerard J., Nakamura, Haruki, & Markley, John L. (2014). The Protein Data Bank archive as an open data resource. United States. doi:10.1007/s10822-014-9770-y.
Berman, Helen M., Kleywegt, Gerard J., Nakamura, Haruki, and Markley, John L. 2014. "The Protein Data Bank archive as an open data resource". United States. doi:10.1007/s10822-014-9770-y. https://www.osti.gov/servlets/purl/1354849.
@article{osti_1354849,
title = {The Protein Data Bank archive as an open data resource},
author = {Berman, Helen M. and Kleywegt, Gerard J. and Nakamura, Haruki and Markley, John L.},
abstractNote = {The Protein Data Bank archive was established in 1971, and recently celebrated its 40th anniversary (Berman et al. in Structure 20:391, 2012). Here, an analysis of interrelationships of the science, technology and community leads to further insights into how this resource evolved into one of the oldest and most widely used open-access data resources in biology.},
doi = {10.1007/s10822-014-9770-y},
journal = {Journal of Computer-Aided Molecular Design},
number = 10,
volume = 28,
place = {United States},
year = {2014},
month = jul
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 30 works
Citation information provided by
Web of Science

  • Protein-protein interactions are ubiquitous and essential for cellular processes. High-resolution X-ray crystallographic structures of protein complexes can elucidate the details of their function and provide a basis for many computational and experimental approaches. Here we demonstrate that existing annotations of protein complexes, including those provided by the Protein Data Bank (PDB) itself, contain a significant fraction of incorrect annotations. Results: We have developed a method for identifying protein complexes in the PDB X-ray structures by a four-step procedure: (1) comprehensively collecting all protein-protein interfaces; (2) clustering similar protein-protein interfaces together; (3) estimating the probability that each cluster is relevant based on a diverse set of properties; and (4) finally combining these scores for each entry in order to predict the complex structure. Unlike previous annotation methods, consistent prediction of complexes with identical or almost identical protein content is ensured. The resulting clusters of biologically relevant interfaces provide a reliable catalog of evolutionarily conserved protein-protein interactions.
  • In a large-scale study using data from the Protein Data Bank, some of the many reported findings regarding the crystallization of proteins were investigated. The Protein Data Bank (PDB) is the largest available repository of solved protein structures and contains a wealth of information on successful crystallization. Many centres have used their own experimental data to draw conclusions about proteins and the conditions in which they crystallize. Here, data from the PDB were used to reanalyse some of these results. The most successful crystallization reagents were identified, the link between solution pH and the isoelectric point of the protein was investigated and the possibility of predicting whether a protein will crystallize was explored.
  • The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) resource provides tools for query, analysis and visualization of the 3D structures in the PDB archive. As the mobile Web is starting to surpass desktop and laptop usage, scientists and educators are beginning to integrate mobile devices into their research and teaching. In response, we have developed the RCSB PDB Mobile app for the iOS and Android mobile platforms to enable fast and convenient access to RCSB PDB data and services. Using the app, users from the general public to expert researchers can quickly search and visualize biomolecules, and add personal annotations via the RCSB PDB's integrated MyPDB service.
  • The Protein Data Bank (PDB) is an archive of experimentally determined three-dimensional structures of proteins, nucleic acids, and other biological macromolecules with a 25-year history of service to a global community. PDB is being replaced by 3DB, the Three-Dimensional Database of Biomolecular Structures that will continue to operate from Brookhaven National Laboratory. 3DB will be a highly sophisticated knowledge-based system for archiving and accessing structural information that combines the advantages of object-oriented and relational database systems. 3DB will operate as a direct-deposition archive that will also accept third-party supplied annotations. Conversion of PDB to 3DB will be evolutionary, providing a high degree of compatibility with existing software.