Understanding the Machine Learning Needs of ECP Applications
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
To support the codesign needs of ECP applications on current and future hardware in the area of machine learning, the ExaLearn team at Sandia studied the machine learning use cases in three ECP applications. This report summarizes the needs of those three applications. The Sandia ExaLearn team will develop a proxy application representative of ECP application needs, specifically those of the ExaSky and EXAALT ECP projects. The proxy application will allow us to demonstrate performance-portable kernels within machine learning codes. Furthermore, the training scalability of machine learning networks in these applications is currently limited by large batch sizes: training throughput increases as batch size increases, but network accuracy and generalization worsen. The proxy application will contain hybrid model and data parallelism to improve training efficiency while maintaining network accuracy. The proxy application will also target optimization of 3D convolutional layers, which are specific to scientific machine learning and have not been explored as thoroughly by industry.
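The link between data parallelism and batch size mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the report or from the proxy application. It assumes a toy one-parameter least-squares model, and the function names `grad` and `data_parallel_grad` are hypothetical. It shows that averaging per-worker gradients over equal shards reproduces the gradient of the full global batch, so adding data-parallel workers at a fixed per-worker batch size enlarges the effective batch.

```python
# Illustrative sketch only: a toy 1-parameter least-squares model,
# not code from the ExaLearn proxy application.

def grad(w, batch):
    """Mean gradient of 0.5 * (w*x - y)**2 with respect to w over (x, y) pairs."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_grad(w, batch, num_workers):
    """Split the batch into equal shards, compute one gradient per simulated
    worker, then average them (the role an all-reduce plays in real training)."""
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    shard = len(batch) // num_workers
    local = [grad(w, batch[i * shard:(i + 1) * shard]) for i in range(num_workers)]
    return sum(local) / num_workers

# Hypothetical data: 8 samples drawn from y = 2x.
data = [(float(x), 2.0 * x) for x in range(1, 9)]
w = 0.5

full = grad(w, data)                                   # one worker, batch of 8
sharded = data_parallel_grad(w, data, num_workers=4)   # four workers, 2 samples each
print(full, sharded)
```

Because the two gradients agree, four workers with two samples each behave like a single worker with a batch of eight. Under this assumption, scaling out with data parallelism alone pushes the global batch size up, which is the regime the abstract associates with worse accuracy and generalization; splitting the model across workers as well is one way to use more hardware without growing the batch.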
- Research Organization: Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE National Nuclear Security Administration (NNSA)
- DOE Contract Number: AC04-94AL85000; NA0003525
- OSTI ID: 1491603
- Report Number(s): SAND-2019-0498R; 671613
- Country of Publication: United States
- Language: English