OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Genetic algorithm based task reordering to improve the performance of batch scheduled massively parallel scientific applications

Journal Article · Concurrency and Computation. Practice and Experience
DOI: https://doi.org/10.1002/cpe.3457 · OSTI ID: 1224742

Summary: The growth in size of networked high performance computers along with novel accelerator‐based node architectures has further emphasized the importance of communication efficiency in high performance computing. The world's largest high performance computers are usually operated as shared user facilities due to the costs of acquisition and operation. Applications are scheduled for execution in a shared environment and are placed on nodes that are not necessarily contiguous on the interconnect. Furthermore, the placement of tasks on the nodes allocated by the scheduler is sub‐optimal, leading to performance loss and variability. Here, we investigate the impact of task placement on the performance of two massively parallel application codes on the Titan supercomputer: a turbulent combustion flow solver (S3D) and a molecular dynamics code (LAMMPS). Benchmark studies show a significant deviation from ideal weak scaling and variability in performance. The inter‐task communication distance was determined to be one of the significant contributors to the performance degradation and variability. A genetic algorithm‐based parallel optimization technique was used to optimize the task ordering. This technique provides an improved placement of the tasks on the nodes, taking into account the application's communication topology and the system interconnect topology. Application benchmarks after task reordering with the genetic algorithm show a significant improvement in performance and reduction in variability, thereby enabling the applications to achieve better time to solution and scalability on Titan during production. Copyright © 2015 John Wiley & Sons, Ltd.
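The summary does not give the details of the optimization, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes a 3D torus interconnect, a cost defined as traffic-weighted hop distance, and a simple serial genetic algorithm with elitist selection, order crossover, and swap mutation. All function names, parameters, and the toy ring communication pattern are hypothetical.

```python
import random

def torus_hops(a, b, dims):
    """Hop count between two coordinates on a torus with wraparound links."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

def placement_cost(order, comm, nodes, dims):
    """Traffic-weighted hop distance; order[i] is the task placed on nodes[i]."""
    where = {task: nodes[slot] for slot, task in enumerate(order)}
    return sum(w * torus_hops(where[s], where[t], dims) for (s, t), w in comm.items())

def order_crossover(p1, p2, rng):
    """Keep a slice of p1 in place and fill the other slots in p2's order."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    kept = set(p1[i:j])
    fill = iter(t for t in p2 if t not in kept)
    return [p1[k] if i <= k < j else next(fill) for k in range(len(p1))]

def reorder_tasks(comm, nodes, dims, pop_size=40, generations=200, seed=0):
    """Search for a task ordering with low cost (assumed GA settings)."""
    rng = random.Random(seed)
    n = len(nodes)
    identity = list(range(n))  # default ordering: task i on the i-th allocated node
    pop = [identity] + [rng.sample(identity, n) for _ in range(pop_size - 1)]

    def cost(order):
        return placement_cost(order, comm, nodes, dims)

    for _ in range(generations):
        pop.sort(key=cost)
        elite = pop[: pop_size // 4]  # best quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = order_crossover(p1, p2, rng)
            a, b = rng.sample(range(n), 2)  # swap mutation
            child[a], child[b] = child[b], child[a]
            children.append(child)
        pop = elite + children
    return min(pop, key=cost)

# Toy example (hypothetical data): 16 tasks in a ring communication pattern,
# placed on 16 scattered nodes of a 4 x 4 x 4 torus.
dims = (4, 4, 4)
all_nodes = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
nodes = random.Random(1).sample(all_nodes, 16)
comm = {(t, (t + 1) % 16): 1.0 for t in range(16)}

best = reorder_tasks(comm, nodes, dims)
print("default cost:  ", placement_cost(list(range(16)), comm, nodes, dims))
print("reordered cost:", placement_cost(best, comm, nodes, dims))
```

Because the default ordering is seeded into the initial population and the elite is carried over unchanged, the returned ordering in this sketch is never worse than the default under the assumed cost. A production version would need the real node coordinates from the scheduler and a measured communication graph.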

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1224742
Alternate ID(s):
OSTI ID: 1400703
Journal Information:
Concurrency and Computation. Practice and Experience, Vol. 27, Issue 17; ISSN 1532-0626
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 1 work
Citation information provided by
Web of Science

Cited By (1)

Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers
  • Sreepathi, Sarat; D'Azevedo, Ed; Philip, Bobby
  • ICPE'16: ACM/SPEC International Conference on Performance Engineering, Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering https://doi.org/10.1145/2851553.2851575
conference March 2016