HydraGNN v4.0
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Georgia Institute of Technology, Atlanta, GA (United States)
- University of California, Los Angeles, CA (United States)
HydraGNN v4.0 provides additional core capabilities, such as:
- Inclusion of multi-body atomistic cluster expansion (MACE), polarizable atom interaction neural network (PAINN), and equivariant principal neighborhood aggregation (PNAEq) among the supported message passing layers
- Inclusion of graph transformers to directly model long-range interactions between nodes that are distant in the graph topology
- Integration of graph transformers with message passing layers by combining the graph embeddings generated by the two mechanisms, which improves the expressivity of the HydraGNN architecture
- Improved re-implementation of multi-task learning (MTL) to allow stabilized training across imbalanced, multi-source, multi-fidelity data
- Introduction of multi-task parallelism, a newly proposed type of model parallelism specifically for MTL architectures, which allows different output decoding heads to be dispatched to different GPU devices
- Integration of multi-task parallelism with the pre-existing distributed data parallelism to enable 2D parallelization for distributed training
- Improved portability of distributed training to Intel GPUs, which has been tested on the ALCF exascale supercomputer Aurora
- Inclusion of 2-level fine-grained energy profilers, portable across NVIDIA, AMD, and Intel GPUs, to monitor the power and energy consumption associated with different functions executed by the HydraGNN code during data pre-load and training
- Restructuring of previous examples and inclusion of new sets of examples to illustrate the download, preprocessing, and training of HydraGNN models on new large-scale open-source datasets for atomistic materials modeling (e.g., Alexandria, Transition1x, OMat24, OMol25)
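The 2D parallelization mentioned above (multi-task parallelism combined with distributed data parallelism) can be pictured as a grid of ranks. The following is only an illustrative sketch, not HydraGNN's actual implementation: the function names `layout_2d` and `build_groups`, the one-head-per-rank assumption, and the round-robin rank ordering are all hypothetical choices made for this example.

```python
from collections import defaultdict


def layout_2d(world_size: int, num_heads: int) -> dict:
    """Map each global rank to (head_index, replica_index).

    Assumption for this sketch: each rank holds exactly one decoding head,
    and world_size is a positive multiple of num_heads.
    """
    if num_heads <= 0 or world_size <= 0 or world_size % num_heads != 0:
        raise ValueError("world_size must be a positive multiple of num_heads")
    return {rank: (rank % num_heads, rank // num_heads)
            for rank in range(world_size)}


def build_groups(layout: dict):
    """Group ranks along the two axes of the grid.

    head_groups[h]    : ranks holding head h (data-parallel axis; these ranks
                        would synchronize that head's gradients).
    replica_groups[r] : ranks that together hold every head once
                        (multi-task-parallel axis).
    """
    head_groups = defaultdict(list)
    replica_groups = defaultdict(list)
    for rank in sorted(layout):
        head, replica = layout[rank]
        head_groups[head].append(rank)
        replica_groups[replica].append(rank)
    return dict(head_groups), dict(replica_groups)


if __name__ == "__main__":
    layout = layout_2d(world_size=8, num_heads=4)
    head_groups, replica_groups = build_groups(layout)
    print(head_groups)
    print(replica_groups)
```

In a real distributed run, rank lists like these would typically be turned into communicator groups by the distributed backend; how HydraGNN does this is not specified in this record.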
- Software Type:
- Scientific
- License(s):
- BSD 3-clause "New" or "Revised" License
- Programming Language(s):
- Python
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- Primary Award/Contract Number:
- AC05-00OR22725
- DOE Contract Number:
- AC05-00OR22725
- Code ID:
- 164024
- OSTI ID:
- code-164024
- Country of Origin:
- United States
Similar Records
User Manual - HydraGNN: Distributed PyTorch Implementation of Multi-Headed Graph Convolutional Neural Networks
Scalable training of trustworthy and energy-efficient predictive graph foundation models for atomistic materials modeling: a case study with HydraGNN
Technical Report · Wed Nov 01 00:00:00 EDT 2023 · OSTI ID:2224153
Scalable training of trustworthy and energy-efficient predictive graph foundation models for atomistic materials modeling: a case study with HydraGNN
Journal Article · Thu Mar 13 20:00:00 EDT 2025 · Journal of Supercomputing · OSTI ID:2538215