DOE PAGES

Title: SIMBAD: a sequence-independent molecular-replacement pipeline

The conventional approach to finding structurally similar search models for use in molecular replacement (MR) is to use the sequence of the target to search against those of a set of known structures. Sequence similarity often correlates with structure similarity. Given sufficient similarity, a known structure correctly positioned in the target cell by the MR process can provide an approximation to the unknown phases of the target. An alternative approach to identifying homologous structures suitable for MR is to exploit the measured data directly, comparing the lattice parameters or the experimentally derived structure-factor amplitudes with those of known structures. Here, SIMBAD, a new sequence-independent MR pipeline which implements these approaches, is presented. SIMBAD can identify cases of contaminant crystallization and other mishaps such as mistaken identity (swapped crystallization trays), as well as solving unsequenced targets and providing a brute-force approach where sequence-dependent search-model identification may be nontrivial, for example because of conformational diversity among identifiable homologues. The program implements a three-step pipeline to efficiently identify a suitable search model in a database of known structures. The first step performs a lattice-parameter search against the entire Protein Data Bank (PDB), rapidly determining whether or not a homologue exists in the same crystal form. The second step is designed to screen the target data for the presence of a crystallized contaminant, a not uncommon occurrence in macromolecular crystallography. Solving structures with MR in such cases can remain problematic for many years, since the search models, which are assumed to be similar to the structure of interest, are not necessarily related to the structures that have actually crystallized. To cater for this eventuality, SIMBAD rapidly screens the data against a database of known contaminant structures. Where the first two steps fail to yield a solution, a final step in SIMBAD can be invoked to perform a brute-force search of a nonredundant PDB database provided by the MoRDa MR software. Through early-access usage of SIMBAD, this approach has solved novel cases that have otherwise proved difficult to solve.
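The three-step fall-through described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `UnitCell` class, the Euclidean `cell_distance` metric, the `tolerance` value, and the `try_mr` callback (standing in for an actual MR run) are all hypothetical names and assumptions introduced here for illustration only.

```python
import math
from dataclasses import dataclass, astuple
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class UnitCell:
    """Lattice parameters: edge lengths (angstroms) and angles (degrees)."""
    a: float
    b: float
    c: float
    alpha: float
    beta: float
    gamma: float


def cell_distance(p: UnitCell, q: UnitCell) -> float:
    # Assumption: a plain Euclidean distance over the six parameters.
    # A real lattice search would likely compare reduced cells instead.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(astuple(p), astuple(q))))


def lattice_search(target: UnitCell, database: Dict[str, UnitCell],
                   tolerance: float = 1.0) -> list:
    """Return IDs of entries whose cell lies within `tolerance`, closest first."""
    scored = sorted((cell_distance(target, cell), entry_id)
                    for entry_id, cell in database.items())
    return [entry_id for dist, entry_id in scored if dist <= tolerance]


def run_pipeline(target: UnitCell,
                 lattice_db: Dict[str, UnitCell],
                 contaminant_models: Sequence[str],
                 nonredundant_models: Sequence[str],
                 try_mr: Callable[[str], bool]) -> Optional[Tuple[str, str]]:
    """Try each stage in order; stop at the first model for which try_mr succeeds."""
    stages = [
        ("lattice", lattice_search(target, lattice_db)),
        ("contaminant", contaminant_models),
        ("brute-force", nonredundant_models),
    ]
    for stage_name, candidates in stages:
        for model in candidates:
            if try_mr(model):  # hypothetical stand-in for an MR attempt
                return stage_name, model
    return None
```

The ordering reflects the rationale given in the abstract: the cheap lattice comparison runs first, and the more expensive searches are attempted only when the earlier stages yield no solution.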
Authors:
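 Simpkin, Adam J. [1]; Simkovic, Felix [2]; Thomas, Jens M. H. [2]; Savko, Martin [3]; Lebedev, Andrey [4]; Uski, Ville [4]; Ballard, Charles [4]; Wojdyr, Marcin [5]; Wu, Rui [6]; Sanishvili, Ruslan [7]; Xu, Yibin [8]; Lisa, María-Natalia [9]; Buschiazzo, Alejandro [10]; Shepard, William [3]; Rigden, Daniel J. [2]; Keegan, Ronan M. [11]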
  1. Univ. of Liverpool, Liverpool (England); Synchrotron SOLEIL, Gif-sur-Yvette (France)
  2. Univ. of Liverpool, Liverpool (England)
  3. Synchrotron SOLEIL, Gif-sur-Yvette (France)
  4. STFC, Rutherford Appleton Lab., Didcot (England)
  5. STFC, Rutherford Appleton Lab., Didcot (England); Global Phasing Ltd, Cambridge (England)
  6. Weill Cornell Medicine, New York, NY (United States)
  7. Argonne National Lab. (ANL), Lemont, IL (United States)
  8. Walter and Eliza Hall Institute of Medical Research, Parkville, VIC (Australia); Univ. of Melbourne, VIC (Australia)
  9. Institut Pasteur de Montevideo, Montevideo (Uruguay); Instituto de Biologia Molecular y Celular de Rosario (IBR, CONICET-UNR), Rosario (Argentina)
  10. Institut Pasteur de Montevideo, Montevideo (Uruguay)
  11. Univ. of Liverpool, Liverpool (England); STFC, Rutherford Appleton Lab., Didcot (England)
Publication Date:
Grant/Contract Number:
AC02-06CH11357
Type:
Published Article
Journal Name:
Acta Crystallographica. Section D. Structural Biology
Additional Journal Information:
Journal Volume: 74; Journal Issue: 7; Journal ID: ISSN 2059-7983
Publisher:
IUCr
Research Org:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org:
National Institutes of Health (NIH), National Cancer Institute; National Institutes of Health (NIH), National Institute of General Medical Sciences; USDOE
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; SIMBAD; contaminant; lattice search; molecular replacement pipeline; structure solution
OSTI Identifier:
1441079
Alternate Identifier(s):
OSTI ID: 1481750

Simpkin, Adam J., Simkovic, Felix, Thomas, Jens M. H., Savko, Martin, Lebedev, Andrey, Uski, Ville, Ballard, Charles, Wojdyr, Marcin, Wu, Rui, Sanishvili, Ruslan, Xu, Yibin, Lisa, María-Natalia, Buschiazzo, Alejandro, Shepard, William, Rigden, Daniel J., and Keegan, Ronan M. SIMBAD: a sequence-independent molecular-replacement pipeline. United States: N. p., Web. doi:10.1107/S2059798318005752.
Simpkin, Adam J., Simkovic, Felix, Thomas, Jens M. H., Savko, Martin, Lebedev, Andrey, Uski, Ville, Ballard, Charles, Wojdyr, Marcin, Wu, Rui, Sanishvili, Ruslan, Xu, Yibin, Lisa, María-Natalia, Buschiazzo, Alejandro, Shepard, William, Rigden, Daniel J., & Keegan, Ronan M. SIMBAD: a sequence-independent molecular-replacement pipeline. United States. doi:10.1107/S2059798318005752.
Simpkin, Adam J., Simkovic, Felix, Thomas, Jens M. H., Savko, Martin, Lebedev, Andrey, Uski, Ville, Ballard, Charles, Wojdyr, Marcin, Wu, Rui, Sanishvili, Ruslan, Xu, Yibin, Lisa, María-Natalia, Buschiazzo, Alejandro, Shepard, William, Rigden, Daniel J., and Keegan, Ronan M. 2018. "SIMBAD: a sequence-independent molecular-replacement pipeline". United States. doi:10.1107/S2059798318005752.
@article{osti_1441079,
title = {SIMBAD: a sequence-independent molecular-replacement pipeline},
author = {Simpkin, Adam J. and Simkovic, Felix and Thomas, Jens M. H. and Savko, Martin and Lebedev, Andrey and Uski, Ville and Ballard, Charles and Wojdyr, Marcin and Wu, Rui and Sanishvili, Ruslan and Xu, Yibin and Lisa, María-Natalia and Buschiazzo, Alejandro and Shepard, William and Rigden, Daniel J. and Keegan, Ronan M.},
abstractNote = {The conventional approach to finding structurally similar search models for use in molecular replacement (MR) is to use the sequence of the target to search against those of a set of known structures. Sequence similarity often correlates with structure similarity. Given sufficient similarity, a known structure correctly positioned in the target cell by the MR process can provide an approximation to the unknown phases of the target. An alternative approach to identifying homologous structures suitable for MR is to exploit the measured data directly, comparing the lattice parameters or the experimentally derived structure-factor amplitudes with those of known structures. Here, SIMBAD, a new sequence-independent MR pipeline which implements these approaches, is presented. SIMBAD can identify cases of contaminant crystallization and other mishaps such as mistaken identity (swapped crystallization trays), as well as solving unsequenced targets and providing a brute-force approach where sequence-dependent search-model identification may be nontrivial, for example because of conformational diversity among identifiable homologues. The program implements a three-step pipeline to efficiently identify a suitable search model in a database of known structures. The first step performs a lattice-parameter search against the entire Protein Data Bank (PDB), rapidly determining whether or not a homologue exists in the same crystal form. The second step is designed to screen the target data for the presence of a crystallized contaminant, a not uncommon occurrence in macromolecular crystallography. Solving structures with MR in such cases can remain problematic for many years, since the search models, which are assumed to be similar to the structure of interest, are not necessarily related to the structures that have actually crystallized. To cater for this eventuality, SIMBAD rapidly screens the data against a database of known contaminant structures. Where the first two steps fail to yield a solution, a final step in SIMBAD can be invoked to perform a brute-force search of a nonredundant PDB database provided by the MoRDa MR software. Through early-access usage of SIMBAD, this approach has solved novel cases that have otherwise proved difficult to solve.},
doi = {10.1107/S2059798318005752},
journal = {Acta Crystallographica. Section D. Structural Biology},
number = 7,
volume = 74,
place = {United States},
year = {2018},
month = {6}
}

Works referenced in this record:

PHENIX: a comprehensive Python-based system for macromolecular structure solution
journal, January 2010
  • Adams, Paul D.; Afonine, Pavel V.; Bunkóczi, Gábor
  • Acta Crystallographica Section D Biological Crystallography, Vol. 66, Issue 2, p. 213-221
  • DOI: 10.1107/S0907444909052925