Optimizing I/O Performance of HPC Applications with Autotuning

Behzad, Babak; Byna, Surendra; Prabhat; Snir, Marc

doi:10.1145/3309205

Journal Article · March 8, 2019 · ACM Transactions on Parallel Computing

DOI: https://doi.org/10.1145/3309205 · OSTI ID: 1825486

Behzad, Babak ^[1]; Byna, Surendra ^[2]; Prabhat ^[2]; Snir, Marc ^[3]

[1] Univ. of Illinois at Urbana-Champaign, IL (United States)
[2] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
[3] Univ. of Illinois at Urbana-Champaign, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States)

Parallel input/output (I/O) is an essential component of modern high-performance computing (HPC). Obtaining good I/O performance for a broad range of applications on diverse HPC platforms is a major challenge, in part because of complex interdependencies between I/O middleware and hardware. The parallel file system and I/O middleware layers all offer optimization parameters that can, in theory, result in better I/O performance. Unfortunately, the right combination of parameters is highly dependent on the application, HPC platform, problem size, and concurrency. Scientific application developers do not have the time or expertise to take on the substantial burden of identifying good parameters for each problem configuration. They resort to using system defaults, a choice that frequently results in poor I/O performance. We expect this problem to be compounded on exascale-class machines, which will likely have a deeper software stack with hierarchically arranged hardware resources. As a solution to this problem, we present an autotuning system for optimizing I/O performance, I/O performance modeling, I/O tuning, and I/O patterns. We demonstrate the value of this framework across several HPC platforms and applications at scale.

Research Organization: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)

DOE Contract Number: AC02-05CH11231

OSTI ID: 1825486

Journal Information: ACM Transactions on Parallel Computing, Vol. 5, Issue 4; ISSN 2329-4949

Publisher: Association for Computing Machinery

Country of Publication: United States

Language: English
