DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: RCSB Protein Data Bank: Celebrating 50 years of the PDB with new tools for understanding and visualizing biological macromolecules in 3D

Abstract

The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the US National Science Foundation, National Institutes of Health, and Department of Energy, has served structural biologists and Protein Data Bank (PDB) data consumers worldwide since 1999. RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, is the US data center for the global PDB archive housing biomolecular structure data. RCSB PDB is also responsible for the security of PDB data, as the wwPDB‐designated Archive Keeper. Annually, RCSB PDB serves tens of thousands of three‐dimensional (3D) macromolecular structure data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro‐electron diffraction) from all inhabited continents. RCSB PDB makes PDB data available from its research‐focused RCSB.org web portal at no charge and without usage restrictions to millions of PDB data consumers working in every nation and territory worldwide. In addition, RCSB PDB operates an outreach and education PDB101.RCSB.org web portal that was used by more than 800,000 educators, students, and members of the public during calendar year 2020. This invited Tools Issue contribution describes (i) how the archive is growing and evolving as new experimental methods generate ever larger and more complex biomolecular structures; (ii) the importance of data standards and data remediation in effective management of the archive and facile integration with more than 50 external data resources; and (iii) new tools and features for 3D structure analysis and visualization made available during the past year via the RCSB.org web portal.

Authors:
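Burley, Stephen K. [1]; Bhikadiya, Charmi [2]; Bi, Chunxiao [3]; Bittrich, Sebastian [3]; Chen, Li [2]; Crichlow, Gregg V. [2]; Duarte, Jose M. [3]; Dutta, Shuchismita [2]; Fayazi, Maryam [2]; Feng, Zukang [2]; Flatt, Justin W. [2]; Ganesan, Sai J. [4]; Goodsell, David S. [5]; Ghosh, Sutapa [2]; Kramer Green, Rachel [2]; Guranovic, Vladimir [2]; Henry, Jeremy [3]; Hudson, Brian P. [2]; Lawson, Catherine L. [2]; Liang, Yuhe [2]; Lowe, Robert [2]; Peisach, Ezra [2]; Persikova, Irina [2]; Piehl, Dennis W. [2]; Rose, Yana [3]; Sali, Andrej [4]; Segura, Joan [3]; Sekharan, Monica [2]; Shao, Chenghua [2]; Vallat, Brinda [2]; Voigt, Maria [2]; Westbrook, John D. [2]; Whetstone, Shamara [2]; Young, Jasmine Y. [2]; Zardecki, Christine [2]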
  1. Rutgers University, Piscataway, NJ (United States); University of California San Diego, La Jolla, CA (United States)
  2. Rutgers University, Piscataway, NJ (United States)
  3. University of California San Diego, La Jolla, CA (United States)
  4. University of California, San Francisco, CA (United States)
  5. Rutgers University, Piscataway, NJ (United States); The Scripps Research Institute, La Jolla, CA (United States)
Publication Date:
October 22, 2021
Research Org.:
Rutgers Univ., Piscataway, NJ (United States)
Sponsoring Org.:
USDOE Office of Science (SC); National Science Foundation (NSF); National Cancer Institute (NCI); National Institute of Allergy and Infectious Diseases (NIAID); National Institutes of Health (NIH)
OSTI Identifier:
1976373
Alternate Identifier(s):
OSTI ID: 1829368
Grant/Contract Number:  
SC0019749; DBI-1832184; R01GM133198; DE‐SC0019749
Resource Type:
Accepted Manuscript
Journal Name:
Protein Science
Additional Journal Information:
Journal Volume: 31; Journal Issue: 1; Journal ID: ISSN 0961-8368
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; protein data bank; PDB; open access; RCSB protein data bank; worldwide protein data bank; Mol*; web-native molecular graphics; macromolecular crystallography; electron microscopy; micro-electron diffraction

Citation Formats

Burley, Stephen K., Bhikadiya, Charmi, Bi, Chunxiao, Bittrich, Sebastian, Chen, Li, Crichlow, Gregg V., Duarte, Jose M., Dutta, Shuchismita, Fayazi, Maryam, Feng, Zukang, Flatt, Justin W., Ganesan, Sai J., Goodsell, David S., Ghosh, Sutapa, Kramer Green, Rachel, Guranovic, Vladimir, Henry, Jeremy, Hudson, Brian P., Lawson, Catherine L., Liang, Yuhe, Lowe, Robert, Peisach, Ezra, Persikova, Irina, Piehl, Dennis W., Rose, Yana, Sali, Andrej, Segura, Joan, Sekharan, Monica, Shao, Chenghua, Vallat, Brinda, Voigt, Maria, Westbrook, John D., Whetstone, Shamara, Young, Jasmine Y., and Zardecki, Christine. RCSB Protein Data Bank: Celebrating 50 years of the PDB with new tools for understanding and visualizing biological macromolecules in 3D. United States: N. p., 2021. Web. doi:10.1002/pro.4213.
Burley, Stephen K., Bhikadiya, Charmi, Bi, Chunxiao, Bittrich, Sebastian, Chen, Li, Crichlow, Gregg V., Duarte, Jose M., Dutta, Shuchismita, Fayazi, Maryam, Feng, Zukang, Flatt, Justin W., Ganesan, Sai J., Goodsell, David S., Ghosh, Sutapa, Kramer Green, Rachel, Guranovic, Vladimir, Henry, Jeremy, Hudson, Brian P., Lawson, Catherine L., Liang, Yuhe, Lowe, Robert, Peisach, Ezra, Persikova, Irina, Piehl, Dennis W., Rose, Yana, Sali, Andrej, Segura, Joan, Sekharan, Monica, Shao, Chenghua, Vallat, Brinda, Voigt, Maria, Westbrook, John D., Whetstone, Shamara, Young, Jasmine Y., & Zardecki, Christine. RCSB Protein Data Bank: Celebrating 50 years of the PDB with new tools for understanding and visualizing biological macromolecules in 3D. United States. https://doi.org/10.1002/pro.4213
Burley, Stephen K., Bhikadiya, Charmi, Bi, Chunxiao, Bittrich, Sebastian, Chen, Li, Crichlow, Gregg V., Duarte, Jose M., Dutta, Shuchismita, Fayazi, Maryam, Feng, Zukang, Flatt, Justin W., Ganesan, Sai J., Goodsell, David S., Ghosh, Sutapa, Kramer Green, Rachel, Guranovic, Vladimir, Henry, Jeremy, Hudson, Brian P., Lawson, Catherine L., Liang, Yuhe, Lowe, Robert, Peisach, Ezra, Persikova, Irina, Piehl, Dennis W., Rose, Yana, Sali, Andrej, Segura, Joan, Sekharan, Monica, Shao, Chenghua, Vallat, Brinda, Voigt, Maria, Westbrook, John D., Whetstone, Shamara, Young, Jasmine Y., and Zardecki, Christine. 2021. "RCSB Protein Data Bank: Celebrating 50 years of the PDB with new tools for understanding and visualizing biological macromolecules in 3D". United States. https://doi.org/10.1002/pro.4213. https://www.osti.gov/servlets/purl/1976373.
@article{osti_1976373,
title = {RCSB Protein Data Bank: Celebrating 50 years of the PDB with new tools for understanding and visualizing biological macromolecules in 3D},
author = {Burley, Stephen K. and Bhikadiya, Charmi and Bi, Chunxiao and Bittrich, Sebastian and Chen, Li and Crichlow, Gregg V. and Duarte, Jose M. and Dutta, Shuchismita and Fayazi, Maryam and Feng, Zukang and Flatt, Justin W. and Ganesan, Sai J. and Goodsell, David S. and Ghosh, Sutapa and Kramer Green, Rachel and Guranovic, Vladimir and Henry, Jeremy and Hudson, Brian P. and Lawson, Catherine L. and Liang, Yuhe and Lowe, Robert and Peisach, Ezra and Persikova, Irina and Piehl, Dennis W. and Rose, Yana and Sali, Andrej and Segura, Joan and Sekharan, Monica and Shao, Chenghua and Vallat, Brinda and Voigt, Maria and Westbrook, John D. and Whetstone, Shamara and Young, Jasmine Y. and Zardecki, Christine},
abstractNote = {The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the US National Science Foundation, National Institutes of Health, and Department of Energy, has served structural biologists and Protein Data Bank (PDB) data consumers worldwide since 1999. RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, is the US data center for the global PDB archive housing biomolecular structure data. RCSB PDB is also responsible for the security of PDB data, as the wwPDB‐designated Archive Keeper. Annually, RCSB PDB serves tens of thousands of three‐dimensional (3D) macromolecular structure data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro‐electron diffraction) from all inhabited continents. RCSB PDB makes PDB data available from its research‐focused RCSB.org web portal at no charge and without usage restrictions to millions of PDB data consumers working in every nation and territory worldwide. In addition, RCSB PDB operates an outreach and education PDB101.RCSB.org web portal that was used by more than 800,000 educators, students, and members of the public during calendar year 2020. This invited Tools Issue contribution describes (i) how the archive is growing and evolving as new experimental methods generate ever larger and more complex biomolecular structures; (ii) the importance of data standards and data remediation in effective management of the archive and facile integration with more than 50 external data resources; and (iii) new tools and features for 3D structure analysis and visualization made available during the past year via the RCSB.org web portal.},
doi = {10.1002/pro.4213},
journal = {Protein Science},
number = 1,
volume = 31,
place = {United States},
year = {2021},
month = oct
}
