OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Damaris: Addressing performance variability in data management for post-petascale simulations

Journal Article · ACM Transactions on Parallel Computing
DOI: https://doi.org/10.1145/2987371 · OSTI ID: 1346736
 [1];  [2];  [1];  [1];  [3];  [2];  [2];  [1];  [4]
  1. Argonne National Lab. (ANL), Argonne, IL (United States)
  2. Inria, Rennes - Bretagne Atlantique Research Centre (France)
  3. Univ. of Illinois at Urbana-Champaign, Urbana, IL (United States)
  4. Univ. of Wisconsin, Madison, WI (United States)

With exascale computing on the horizon, reducing performance variability in data management tasks (storage, visualization, analysis, etc.) is becoming a key challenge in sustaining high performance. This variability significantly impacts overall application performance at scale and its predictability over time. In this article, we present Damaris, a system that leverages dedicated cores in multicore nodes to offload data management tasks, including I/O, data compression, scheduling of data movements, in situ analysis, and visualization. We evaluate Damaris with the CM1 atmospheric simulation and the Nek5000 computational fluid dynamics simulation on four platforms, including NICS’s Kraken and NCSA’s Blue Waters. Our results show that (1) Damaris fully hides the I/O variability as well as all I/O-related costs, thus making simulation performance predictable; (2) it increases the sustained write throughput by a factor of up to 15 compared with standard I/O approaches; (3) it allows almost perfect scalability of the simulation up to over 9,000 cores, as opposed to state-of-the-art approaches that fail to scale; and (4) it enables a seamless connection to the VisIt visualization software to perform in situ analysis and visualization in a way that impacts neither the performance of the simulation nor its variability. In addition, we extended our implementation of Damaris to support the use of dedicated nodes and conducted a thorough comparison of the two approaches (dedicated cores and dedicated nodes) for I/O tasks with the aforementioned applications.

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES); Central Michigan University; National Center for Atmospheric Research
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1346736
Journal Information:
ACM Transactions on Parallel Computing, Vol. 3, Issue 3; ISSN 2329-4949
Publisher:
Association for Computing Machinery
Country of Publication:
United States
Language:
English

