Active Learning Framework

Nebgen, Benjamin; Smith, Justin; Lubbers, Nicholas; Barros, Kipton; Li, Yin Wai; Li, Wei

doi:10.11578/dc.20231103.4


Software · Tue Jan 03 19:00:00 EST 2023

DOI: https://doi.org/10.11578/dc.20231103.4 · OSTI ID: code-115537 · Code ID: 115537


Machine learning (ML) of interatomic potentials shows great promise for accelerating scientific simulation, e.g., by emulating expensive computations at high accuracy but much reduced computational cost. Training datasets are calculated with computationally expensive ab initio quantum mechanical methods, namely density functional theory (DFT). Trained on this data, an ML model can be very successful in predicting energies and forces for new atomic configurations. A critical factor is the quality and diversity of the training dataset. Thus, a highly automated approach to dataset construction, based on an active learning framework, has been designed for materials physics. The active learning scheme begins with fully randomized atomic configurations. Then, many molecular dynamics (MD) trajectories are simulated using the current ML potentials, with each MD trajectory initialized to a random disordered configuration. The temperature is varied during these simulations to diversify the sampled configurations. The variance of the predictions of the eight neural networks in an ensemble is analyzed to determine whether the model is operating as expected: an ensemble variance above a threshold indicates that collecting more data would help the model. In that case, the MD trajectory is terminated, and the final atomic configuration is placed on a queue (an SQL database) for DFT calculation and then added to the training dataset. Periodically, the ML model is retrained on the updated training dataset. This active learning loop is iterated until the cost of the MD simulations becomes prohibitive. After many active learning iterations, the MD simulations should ideally be sufficiently robust to support nucleation. In this sense, the active learning scheme must automatically discover the important low-energy and nonequilibrium physics.
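The termination rule described above (stop a trajectory once the ensemble variance exceeds a threshold, then queue the final configuration for DFT) can be illustrated with a minimal Python sketch. It is not the framework's actual code: the toy energy models, the `toy_step` stand-in for an MD step, the `dft_queue` table layout, and the threshold value are all hypothetical, chosen only so the example runs with the standard library.

```python
import json
import sqlite3
import statistics
from typing import Callable, List, Optional, Sequence

Config = List[float]               # hypothetical: flat list of coordinates
Model = Callable[[Config], float]  # hypothetical: returns a predicted energy


def make_toy_model(k: int) -> Model:
    """Toy ensemble member; members diverge as coordinates grow (assumption)."""
    def predict(config: Config) -> float:
        base = sum(x * x for x in config)
        return base + 0.01 * k * sum(abs(x) ** 3 for x in config)
    return predict


def ensemble_variance(models: Sequence[Model], config: Config) -> float:
    """Population variance of the ensemble's energy predictions."""
    return statistics.pvariance([m(config) for m in models])


def run_trajectory(models: Sequence[Model], config: Config,
                   step: Callable[[Config], Config],
                   threshold: float, max_steps: int) -> Optional[Config]:
    """Advance until the ensemble disagrees; return that configuration.

    Returns None if the variance never exceeds the threshold.
    """
    for _ in range(max_steps):
        if ensemble_variance(models, config) > threshold:
            return config          # terminate: the model is uncertain here
        config = step(config)
    return None


def enqueue_for_dft(conn: sqlite3.Connection, config: Config) -> None:
    """Store a flagged configuration in a hypothetical SQL queue table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dft_queue ("
        "id INTEGER PRIMARY KEY, coords TEXT NOT NULL, status TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO dft_queue (coords, status) VALUES (?, ?)",
        (json.dumps(config), "pending"),
    )
    conn.commit()


def toy_step(config: Config) -> Config:
    """Stand-in for one MD step: drift away from the starting region."""
    return [1.5 * x for x in config]


models = [make_toy_model(k) for k in range(8)]   # eight-member ensemble
conn = sqlite3.connect(":memory:")               # in-memory queue for the demo
flagged = run_trajectory(models, [0.1, 0.1], toy_step,
                         threshold=2.5e-3, max_steps=20)
if flagged is not None:
    enqueue_for_dft(conn, flagged)
```

In a real workflow the step function would be an MD integrator driven by the ML potential, and a separate worker would read pending rows from the queue, run DFT, and append the labeled results to the training dataset before the next retraining pass.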

Site Accession Number: C22072

Software Type: Scientific

License(s): BSD 3-clause "New" or "Revised" License

Research Organization: Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)

Sponsoring Organization: USDOE Laboratory Directed Research and Development (LDRD) Program

Primary Award/Contract Number: AC52-06NA25396

DOE Contract Number: AC52-06NA25396

Code ID: 115537

OSTI ID: code-115537

Country of Origin: United States

Similar Records

Automated discovery of a robust interatomic potential for aluminum

Journal Article · Mon Feb 22 23:00:00 EST 2021 · Nature Communications · OSTI ID:1807862

Active learning for SNAP interatomic potentials via Bayesian predictive uncertainty

Journal Article · Tue May 14 00:00:00 EDT 2024 · Computational Materials Science · OSTI ID:2372952

Quasi-Classical Trajectory Calculation of Rate Constants Using an Ab Initio Trained Machine Learning Model (aML-MD) with Multifidelity Data

Journal Article · Sat Apr 20 00:00:00 EDT 2024 · Journal of Physical Chemistry. A, Molecules, Spectroscopy, Kinetics, Environment, and General Theory · OSTI ID:2502016
