Title: Using support vector machines to improve elemental ion identification in macromolecular crystal structures

Abstract

In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.

Authors:
Morshed, Nader [1]; Echols, Nathaniel [2]; Adams, Paul D. [3]
  1. Univ. of California, Berkeley, CA (United States). College of Letters and Science; Lawrence Berkeley National Lab., Berkeley, CA (United States). Physical Biosciences Div.
  2. Lawrence Berkeley National Lab., Berkeley, CA (United States). Physical Biosciences Div.
  3. Lawrence Berkeley National Lab., Berkeley, CA (United States). Physical Biosciences Div.; Univ. of California, Berkeley, CA (United States). Dept. of Bioengineering.
Publication Date:
2015-04-25
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1213439
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Acta Crystallographica. Section D: Biological Crystallography (Online)
Additional Journal Information:
Journal Volume: 71; Journal Issue: 5; Journal ID: ISSN 1399-0047
Publisher:
International Union of Crystallography
Country of Publication:
United States
Language:
English
Subject:
36 MATERIALS SCIENCE; elemental ion identification; support vector machines; model building

Citation Formats

Morshed, Nader, Echols, Nathaniel, and Adams, Paul D. Using support vector machines to improve elemental ion identification in macromolecular crystal structures. United States: N. p., 2015. Web. doi:10.1107/S1399004715004241.
Morshed, Nader, Echols, Nathaniel, & Adams, Paul D. Using support vector machines to improve elemental ion identification in macromolecular crystal structures. United States. https://doi.org/10.1107/S1399004715004241
Morshed, Nader, Echols, Nathaniel, and Adams, Paul D. 2015. "Using support vector machines to improve elemental ion identification in macromolecular crystal structures". United States. https://doi.org/10.1107/S1399004715004241. https://www.osti.gov/servlets/purl/1213439.
@article{osti_1213439,
title = {Using support vector machines to improve elemental ion identification in macromolecular crystal structures},
author = {Morshed, Nader and Echols, Nathaniel and Adams, Paul D.},
abstractNote = {In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.},
doi = {10.1107/S1399004715004241},
url = {https://www.osti.gov/biblio/1213439},
journal = {Acta Crystallographica. Section D: Biological Crystallography (Online)},
issn = {1399-0047},
number = 5,
volume = 71,
place = {United States},
year = {2015},
month = {4}
}
