OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: AI-Science for Performance Optimization and Diagnosis of Science Instrument Federations

Conference
OSTI ID:1649004

Next-generation science workflows are expected to be executed over complex federations composed of supercomputers, science instruments, storage systems, and networks, with new additions of edge and cloud systems and services. The sheer complexity of these multi-domain federations makes them hard to manage and optimize, since small impedance mismatches that develop dynamically between systems can drastically degrade the performance of the entire federation. The recent proliferation of Software Defined Everything (SDX) technologies, combined with containerization frameworks, provides custom instruments that can monitor and collect critical measurements at various levels to support diagnosis and performance optimization, but the resulting data are too voluminous for human operators and analysts to process and turn into decisions. Machine Learning (ML) methods that extract critical parameters, relationships, and trends from the data offer general solutions. Artificial Intelligence (AI) and ML methods must be custom-developed for these problems on solid, rigorous foundations, since black-box approaches are often ineffective and unsound. We propose to develop a comprehensive AI-Science for the performance of science federations to (i) monitor and control storage, networks, experiments, and computing systems across multiple domains via softwarization layers, at speeds and scales orders of magnitude beyond current practice, (ii) optimally realize and orchestrate complex workflows with high performance by using dynamic state and performance estimation methods, and (iii) aggregate measurements across sites and time to develop infrastructure-level profiles, optimizations, and diagnoses using AI-Science based on foundational principles from ML, game theory, and information fusion.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1649004
Resource Relation:
Conference: DOE ASCR Workshop on Future Scientific Methodologies - Washington, District of Columbia, United States of America - 8/4/2020 to 8/6/2020
Country of Publication:
United States
Language:
English
