An Integrated Approach to Scaling Task-Based Runtime Systems for Next Generation Engineering Problems
Journal Article · International Conference for High Performance Computing, Networking, Storage and Analysis
OSTI ID:1755953
- Univ. of Utah, Salt Lake City, UT (United States); University of Utah
- Univ. of Utah, Salt Lake City, UT (United States)
Scaling next-generation industrial engineering problems to the largest computational platforms presents unique challenges. Such problems may have complex coupled multi-physics models, as well as computationally intensive and communication-intensive algorithms not often seen in standard academic problems, while also needing high-demand I/O for analysis purposes. In such cases, even codes with good existing scaling properties may need significant changes, addressed in a cross-cutting way, to solve such problems at extreme scales. These challenges relate to system components that were not previously problematic on less complex problems and/or at smaller computational scales. This paper illustrates these challenges for Uintah, a highly scalable asynchronous many-task runtime system being applied to the modeling of a 1000 MWe ultra-supercritical coal boiler. Modeling this formidable production engineering problem required an integrated approach that built upon existing scalable components in areas such as complex stencil calculations and linear algebra, and that also addressed the significant challenges of high-demand I/O, complex task graphs, and infrastructure inefficiencies. This integrated approach allowed this problem to run on 256K CPU cores on Mira, with good weak scaling to 512K CPU cores and similar strong scaling, except for radiation. The need for strong scaling of a new ray tracing-based radiation model required substantial Uintah infrastructure improvements before 119K CPU cores and 7.5K GPUs on Titan could be used, with scaling out to 256K CPU cores and 16K GPUs. The resulting code demonstrates not only excellent overall scalability but also a significant improvement in overall performance for the full-scale production boiler problem.
- Research Organization:
- Univ. of Utah, Salt Lake City, UT (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities Division
- Grant/Contract Number:
- NA0002375; AC02-06CH11357; AC05-00OR22725
- OSTI ID:
- 1755953
- Journal Information:
- Journal Name: International Conference for High Performance Computing, Networking, Storage and Analysis; Vol. 2017; ISSN 2167-4329
- Publisher:
- IEEE
- Country of Publication:
- United States
- Language:
- English
Similar Records
Addressing Global Data Dependencies in Heterogeneous Asynchronous Runtime Systems on GPUs
Conference · Wed Nov 01 00:00:00 EDT 2017 · Proceedings of the 3rd International IEEE Workshop on Extreme Scale Programming Models and Middleware · OSTI ID:1582428
The uintah framework: a unified heterogeneous task scheduling and runtime system
Conference · Thu Nov 01 00:00:00 EDT 2012 · 2012 SC Companion: High Performance Computing, Networking Storage and Analysis; 10-16 Nov. 2012; Salt Lake City, UT, USA · OSTI ID:1567606
Investigating applications portability with the Uintah DAG-based runtime system on PetaScale supercomputers
Conference · Mon Dec 31 23:00:00 EST 2012 · OSTI ID:1567631