Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
- Oak Ridge Inst. for Science and Education (ORISE), Durham, NC (United States). Environmental Protection Agency; US Environmental Protection Agency (EPA), Research Triangle Park, NC (United States). Office of Research and Development. National Center for Computational Toxicology
- CSRA Inc., Research Triangle Park, Durham, NC (United States)
- GDIT, Research Triangle Park, Durham, NC (United States)
- Oak Ridge Associated Univ., Durham, NC (United States)
- US Environmental Protection Agency (EPA), Research Triangle Park, NC (United States). Office of Research and Development. National Center for Computational Toxicology
- US Environmental Protection Agency (EPA), Research Triangle Park, NC (United States). Office of Research and Development. National Exposure Research Lab.
Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence in compound identifications, the use of structural fragmentation data collected via tandem mass spectrometry (MS/MS or MS2) is vital. However, the availability of empirically collected MS/MS data for identification of unknowns is limited. Researchers have therefore turned to in silico generation of MS/MS data for use in HRMS-based screening studies. This paper describes the generation en masse of predicted MS/MS spectra for the entirety of the US EPA’s DSSTox database using competitive fragmentation modelling and a freely available open source tool, CFM-ID. The generated dataset comprises predicted MS/MS spectra for ~700,000 structures, and mappings between predicted spectra, structures, associated substances, and chemical metadata. Together, these resources facilitate improved compound identifications in HRMS screening studies. These data are accessible via an SQL database, a comma-separated export file (.csv), and EPA’s CompTox Chemicals Dashboard.
- Research Organization:
- Oak Ridge Institute for Science and Education (ORISE), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- Grant/Contract Number:
- SC0014664
- OSTI ID:
- 1624275
- Journal Information:
- Scientific Data, Vol. 6, Issue 1; ISSN 2052-4463
- Publisher:
- Nature Publishing Group
- Country of Publication:
- United States
- Language:
- English
In silico MS/MS spectra for identifying unknowns: a critical examination using CFM-ID algorithms and ENTACT mixture samples (journal, January 2020)
CFM-ID 3.0: Significantly Improved ESI-MS/MS Prediction and Compound Identification (journal, April 2019)
Similar Records
Revisiting Five Years of CASMI Contests with EPA Identification Tools
The CompTox Chemistry Dashboard: a community data resource for environmental chemistry