U.S. Department of Energy
Office of Scientific and Technical Information

SOMAS: a platform for data-driven material discovery in redox flow battery development

Journal Article · Scientific Data
Abstract

Aqueous organic redox flow batteries offer an environmentally benign, tunable, and safe route to large-scale energy storage. Energy density, one of the key performance parameters of organic redox flow batteries, depends critically on the solubility of the redox-active molecule in water. Predicting aqueous solubility remains a challenge in chemistry. Recently, machine learning models have been developed for molecular property prediction in chemistry and materials science. The fidelity of a machine learning model depends critically on the diversity, accuracy, and abundance of its training data. We build a comprehensive open-access database of organic molecules, “Solubility of Organic Molecules in Aqueous Solution” (SOMAS), containing about 12,000 molecules and covering wider chemical and solubility regimes suitable for aqueous organic redox flow battery development. In addition to experimental solubility, we provide for each molecule eight distinct quantum descriptors, including the optimized geometry, derived from high-throughput density functional theory calculations, along with six molecular descriptors. SOMAS builds a critical foundation for future artificial intelligence-based solubility prediction models.

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE; USDOE Laboratory Directed Research and Development (LDRD) Program
Grant/Contract Number:
AC05-76RL01830
OSTI ID:
1901485
Alternate ID(s):
OSTI ID: 1901767
Report Number(s):
PNNL-SA-161978; 740; PII: 1814
Journal Information:
Journal Name: Scientific Data; Journal Volume: 9; Journal Issue: 1; ISSN 2052-4463
Publisher:
Nature Publishing Group
Country of Publication:
United Kingdom
Language:
English
