The MolSSI QCArchive project: An open-source platform to compute, organize, and share quantum chemistry data
Abstract
The Molecular Sciences Software Institute's (MolSSI) Quantum Chemistry Archive (QCArchive) project is an umbrella name that covers both a central server hosted by MolSSI for community data and the Python‐based software infrastructure that powers automated computation and storage of quantum chemistry (QC) results. The MolSSI‐hosted central server provides the computational molecular sciences community a location to freely access tens of millions of QC computations for machine learning, methodology assessment, force‐field fitting, and more through a Python interface. Facile, user‐friendly mining of the centrally archived quantum chemical data also can be achieved through web applications found at https://qcarchive.molssi.org. The software infrastructure can be used as a standalone platform to compute, structure, and distribute hundreds of millions of QC computations for individuals or groups of researchers at any scale. The QCArchive Infrastructure is open‐source (BSD‐3C), code repositories can be found at https://github.com/MolSSI, and releases can be downloaded via PyPI and Conda. This article is categorized under: Electronic Structure Theory > Ab Initio Electronic Structure Methods; Software > Quantum Chemistry; Data Science > Computer Algorithms and Programming
- Authors:
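- Smith, Daniel A.; Altarawy, Doaa; Burns, Lori A.; Welborn, Matthew; Naden, Levi N.; Ward, Logan; Ellis, Sam; Pritchard, Benjamin P.; Crawford, T. Daniel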
- Molecular Sciences Software Inst., Blacksburg, VA (United States)
- Molecular Sciences Software Inst., Blacksburg, VA (United States); Alexandria Univ. (Egypt)
- Georgia Institute of Technology, Atlanta, GA (United States). Center for Computational Molecular Science and Technology
- Argonne National Lab. (ANL), Lemont, IL (United States)
- Molecular Sciences Software Inst., Blacksburg, VA (United States); Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States)
- Publication Date:
- 2020-07-31
- Research Org.:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- National Science Foundation (NSF); USDOE Exascale Computing Project; USDOE Office of Science (SC); USDOE National Nuclear Security Administration (NNSA)
- OSTI Identifier:
- 1774122
- Alternate Identifier(s):
- OSTI ID: 1644230
- Grant/Contract Number:
- AC02-06CH11357; 1449723; 1547580; 17‐SC‐20‐SC
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Wiley Interdisciplinary Reviews: Computational Molecular Science
- Additional Journal Information:
- Journal Volume: 11; Journal Issue: 2; Journal ID: ISSN 1759-0876
- Publisher:
- Wiley
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY; databases; density functional theory; high-throughput computing; machine learning; quantum chemistry
Citation Formats
Smith, Daniel A., Altarawy, Doaa, Burns, Lori A., Welborn, Matthew, Naden, Levi N., Ward, Logan, Ellis, Sam, Pritchard, Benjamin P., and Crawford, T. Daniel. The MolSSI QCArchive project: An open-source platform to compute, organize, and share quantum chemistry data. United States: N. p., 2020. Web. doi:10.1002/wcms.1491.
Smith, Daniel A., Altarawy, Doaa, Burns, Lori A., Welborn, Matthew, Naden, Levi N., Ward, Logan, Ellis, Sam, Pritchard, Benjamin P., & Crawford, T. Daniel. The MolSSI QCArchive project: An open-source platform to compute, organize, and share quantum chemistry data. United States. https://doi.org/10.1002/wcms.1491
Smith, Daniel A., Altarawy, Doaa, Burns, Lori A., Welborn, Matthew, Naden, Levi N., Ward, Logan, Ellis, Sam, Pritchard, Benjamin P., and Crawford, T. Daniel. 2020. "The MolSSI QCArchive project: An open-source platform to compute, organize, and share quantum chemistry data". United States. https://doi.org/10.1002/wcms.1491. https://www.osti.gov/servlets/purl/1774122.
@article{osti_1774122,
title = {The MolSSI QCArchive project: An open-source platform to compute, organize, and share quantum chemistry data},
author = {Smith, Daniel A. and Altarawy, Doaa and Burns, Lori A. and Welborn, Matthew and Naden, Levi N. and Ward, Logan and Ellis, Sam and Pritchard, Benjamin P. and Crawford, T. Daniel},
abstractNote = {The Molecular Sciences Software Institute's (MolSSI) Quantum Chemistry Archive (QCArchive) project is an umbrella name that covers both a central server hosted by MolSSI for community data and the Python‐based software infrastructure that powers automated computation and storage of quantum chemistry (QC) results. The MolSSI‐hosted central server provides the computational molecular sciences community a location to freely access tens of millions of QC computations for machine learning, methodology assessment, force‐field fitting, and more through a Python interface. Facile, user‐friendly mining of the centrally archived quantum chemical data also can be achieved through web applications found at https://qcarchive.molssi.org. The software infrastructure can be used as a standalone platform to compute, structure, and distribute hundreds of millions of QC computations for individuals or groups of researchers at any scale. The QCArchive Infrastructure is open‐source (BSD‐3C), code repositories can be found at https://github.com/MolSSI, and releases can be downloaded via PyPI and Conda. This article is categorized under: Electronic Structure Theory > Ab Initio Electronic Structure Methods; Software > Quantum Chemistry; Data Science > Computer Algorithms and Programming},
doi = {10.1002/wcms.1491},
journal = {Wiley Interdisciplinary Reviews: Computational Molecular Science},
number = 2,
volume = 11,
place = {United States},
year = {2020},
month = {7}
}