OSTI.GOV U.S. Department of Energy
Office of Scientific and Technical Information

Title: Active Learning Framework

Software · DOI: https://doi.org/10.11578/dc.20231103.4 · OSTI ID: 2204974 · Code ID: 115537

Machine-learned (ML) interatomic potentials show great promise for accelerating scientific simulation, e.g., by emulating expensive computations at high accuracy but much reduced computational cost. Training datasets are calculated with computationally expensive ab initio quantum mechanics methods such as density functional theory (DFT). Trained on this data, an ML model can be very successful in predicting energies and forces for new atomic configurations. A critical factor is the quality and diversity of the training dataset. Thus, a highly automated approach to dataset construction, based on an active learning framework, is designed to suit materials physics. The active learning scheme begins with fully randomized atomic configurations. Then, many molecular dynamics (MD) trajectories are simulated using the current ML potentials, with each MD trajectory initialized to a random disordered configuration. The temperature is varied during these simulations in order to diversify the sampled configurations. The variance of the predictions of the eight neural networks within an ensemble is analyzed to determine whether the model is operating as expected. Checking whether the ensemble variance exceeds a threshold indicates whether collecting more data would help the model. If it does, the MD trajectory is terminated and the final atomic configuration is placed on a queue (SQL database) for a DFT calculation and then added to the training dataset. Periodically, the ML model is retrained on the updated training dataset. This active learning loop is iterated until the cost of the MD simulations becomes prohibitive. After many active learning iterations, the MD simulations will hopefully be sufficiently robust to support nucleation. In this sense, the active learning scheme must automatically discover the important low-energy and nonequilibrium physics.
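The sampling step described above can be illustrated with a minimal Python sketch. It is not the framework's actual code. The eight "models" are toy harmonic energy functions standing in for the neural networks, the MD trajectory is a random walk with a rising temperature, and the threshold value, the function names, and the `dft_queue` table layout are all assumptions made for illustration. No DFT calculation or retraining is performed; the sketch only shows the ensemble-variance check and the hand-off to an SQL queue.

```python
import json
import random
import sqlite3
import statistics

N_MODELS = 8                # ensemble size stated in the abstract
VARIANCE_THRESHOLD = 0.05   # hypothetical threshold, chosen for this toy example


def make_toy_ensemble(seed, n_models=N_MODELS):
    """Build toy stand-ins for the ensemble: harmonic energies with
    slightly different stiffness per model (an assumption, not real NNs)."""
    rng = random.Random(seed)
    stiffness = [1.0 + 0.2 * rng.uniform(-1.0, 1.0) for _ in range(n_models)]

    def make_model(k):
        return lambda config: sum(k * x * x for x in config)

    return [make_model(k) for k in stiffness]


def ensemble_variance(models, config):
    """Variance of the energy predicted by each ensemble member."""
    return statistics.pvariance([model(config) for model in models])


def run_trajectory(models, rng, n_atoms=4, max_steps=200):
    """Toy 'MD': a random walk whose step size grows with temperature.
    Stops as soon as the ensemble disagrees by more than the threshold."""
    config = [rng.uniform(-0.1, 0.1) for _ in range(n_atoms)]
    for step in range(max_steps):
        temperature = 0.01 + 0.001 * step  # hypothetical temperature ramp
        config = [x + rng.gauss(0.0, temperature ** 0.5) for x in config]
        if ensemble_variance(models, config) > VARIANCE_THRESHOLD:
            return config, True   # uncertain: terminate the trajectory
    return config, False          # model stayed confident throughout


def open_queue():
    """Create an in-memory SQL queue (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE dft_queue ("
        "id INTEGER PRIMARY KEY, config TEXT, status TEXT)"
    )
    return db


def enqueue(db, config):
    """Place a configuration on the queue for a later DFT calculation."""
    db.execute(
        "INSERT INTO dft_queue (config, status) VALUES (?, 'pending')",
        (json.dumps(config),),
    )
    db.commit()


def sample_configurations(db, models, n_trajectories=5, seed=0):
    """Run several trajectories and queue every uncertain final configuration."""
    rng = random.Random(seed)
    n_queued = 0
    for _ in range(n_trajectories):
        config, uncertain = run_trajectory(models, rng)
        if uncertain:
            enqueue(db, config)
            n_queued += 1
    return n_queued
```

In a full loop, a separate worker would presumably pull the pending rows, run DFT on them, append the results to the training dataset, and trigger retraining. Those stages are omitted here because the abstract does not specify how they are implemented.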

Project Type:
Open Source, Publicly Available Repository
Site Accession Number:
C22072
Software Type:
Scientific
License(s):
BSD 3-clause "New" or "Revised" License
Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE Laboratory Directed Research and Development (LDRD) Program

Primary Award/Contract Number:
AC52-06NA25396
DOE Contract Number:
AC52-06NA25396
Code ID:
115537
OSTI ID:
2204974
Country of Origin:
United States

Similar Records

Active Learning A Neural Network Model For Gold Clusters & Bulk From Sparse First Principles Training Data
Journal Article · June 10, 2020 · ChemCatChem · OSTI ID:2204974

Automated discovery of a robust interatomic potential for aluminum
Journal Article · February 23, 2021 · Nature Communications · OSTI ID:2204974

Robust training of machine learning interatomic potentials with dimensionality reduction and stratified sampling
Journal Article · February 26, 2024 · npj Computational Materials · OSTI ID:2204974

Related Subjects