Damaris: Addressing performance variability in data management for post-petascale simulations

Dorier, Matthieu; Antoniu, Gabriel; Cappello, Franck; Snir, Marc; Sisneros, Robert; Yildiz, Orcun; Ibrahim, Shadi; Peterka, Tom; Orf, Leigh

doi:10.1145/2987371

Journal Article · October 1, 2016 · ACM Transactions on Parallel Computing

DOI: https://doi.org/10.1145/2987371 · OSTI ID: 1346736

Dorier, Matthieu [1]; Antoniu, Gabriel [2]; Cappello, Franck [1]; Snir, Marc [1]; Sisneros, Robert [3]; Yildiz, Orcun [2]; Ibrahim, Shadi [2]; Peterka, Tom [1]; Orf, Leigh [4]

[1] Argonne National Lab. (ANL), Argonne, IL (United States)
[2] Inria, Rennes - Bretagne Atlantique Research Centre (France)
[3] Univ. of Illinois at Urbana-Champaign, Urbana, IL (United States)
[4] Univ. of Wisconsin, Madison, WI (United States)

With exascale computing on the horizon, reducing performance variability in data management tasks (storage, visualization, analysis, etc.) is becoming a key challenge in sustaining high performance. This variability significantly impacts overall application performance at scale and its predictability over time. In this article, we present Damaris, a system that leverages dedicated cores in multicore nodes to offload data management tasks, including I/O, data compression, scheduling of data movements, in situ analysis, and visualization. We evaluate Damaris with the CM1 atmospheric simulation and the Nek5000 computational fluid dynamics simulation on four platforms, including NICS’s Kraken and NCSA’s Blue Waters. Our results show that (1) Damaris fully hides the I/O variability as well as all I/O-related costs, thus making simulation performance predictable; (2) it increases the sustained write throughput by a factor of up to 15 compared with standard I/O approaches; (3) it allows almost perfect scalability of the simulation up to over 9,000 cores, as opposed to state-of-the-art approaches that fail to scale; and (4) it enables a seamless connection to the VisIt visualization software to perform in situ analysis and visualization in a way that impacts neither the performance of the simulation nor its variability. In addition, we extended our implementation of Damaris to support the use of dedicated nodes and conducted a thorough comparison of the two approaches, dedicated cores and dedicated nodes, for I/O tasks with the aforementioned applications.

Research Organization: Argonne National Laboratory (ANL)

Sponsoring Organization: USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22); Central Michigan University; National Center for Atmospheric Research

Grant/Contract Number: AC02-06CH11357

OSTI ID: 1346736

Journal Information: ACM Transactions on Parallel Computing, Vol. 3, Issue 3; ISSN 2329-4949

Publisher: Association for Computing Machinery

Country of Publication: United States

Language: English

Cited By (1)

CoSS: proposing a contract-based storage system for HPC. Dorier, Matthieu; Dreher, Matthieu; Peterka, Tom. Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems - PDSW-DISCS '17. https://doi.org/10.1145/3149393.3149396 (conference, January 2017)

Related Subjects

97 MATHEMATICS AND COMPUTING
Damaris
Dedicated Cores
Dedicated Nodes
Design
Exascale Computing
Experimentation
I/O
In Situ Visualization
Performance
