DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study

Abstract

This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.
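
The abstract contrasts a sequential collective summation with a parallel, binary tree one. The following is a minimal serial sketch of that contrast, not the authors' coarray code: it is written in Python rather than Fortran, each list element stands in for the partial value held by one image, and the function names `sequential_sum` and `tree_sum` are hypothetical. It counts only dependent steps, on the assumption that the additions within one tree round could run concurrently on different images.

```python
def sequential_sum(values):
    """One image accumulates every other image's value in turn.

    Returns (total, dependent_steps). Assumes a non-empty list.
    """
    total = values[0]
    steps = 0
    for v in values[1:]:
        total += v
        steps += 1  # each addition must wait for the previous one
    return total, steps


def tree_sum(values):
    """Pairwise (binary tree) reduction over the same values.

    In each round, slot i absorbs the partial sum held in slot i + stride.
    Additions within a round touch disjoint slots, so a parallel runtime
    could perform them at the same time. Returns (total, rounds).
    Assumes a non-empty list.
    """
    partial = list(values)  # copy so the caller's data is untouched
    n = len(partial)
    stride = 1
    rounds = 0
    while stride < n:
        for i in range(0, n - stride, 2 * stride):
            partial[i] += partial[i + stride]
        stride *= 2
        rounds += 1
    return partial[0], rounds


if __name__ == "__main__":
    # 32 simulated images, matching the 32-core server mentioned in the abstract.
    data = list(range(1, 33))
    print(sequential_sum(data))
    print(tree_sum(data))
```

With P simulated images, the sequential version needs P - 1 dependent additions, whereas the tree version needs about log2(P) rounds. That difference in critical-path length is one plausible reading of why the tree algorithm scaled better in the study; the sketch does not model communication cost or synchronization.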

Authors:
 Radhakrishnan, Hari [1]; Rouson, Damian W. I. [2]; Morris, Karla [3]; Shende, Sameer [4]; Kassinos, Stavros C. [5]
  1. EXA High Performance Computing, 1087 Nicosia, Cyprus
  2. Stanford University, Stanford, CA 94305, USA
  3. Sandia National Laboratories, Livermore, CA 94550, USA
  4. University of Oregon, Eugene, OR 97403, USA
  5. Computational Sciences Laboratory (UCY-CompSci), University of Cyprus, 1678 Nicosia, Cyprus
Publication Date:
Sponsoring Org.:
USDOE
OSTI Identifier:
1197693
Grant/Contract Number:  
AC02-05CH11231; AC04-94-AL85000
Resource Type:
Published Article
Journal Name:
Scientific Programming
Additional Journal Information:
Journal Name: Scientific Programming; Journal Volume: 2015; Journal ID: ISSN 1058-9244
Publisher:
Hindawi Publishing Corporation
Country of Publication:
Egypt
Language:
English

Citation Formats

Radhakrishnan, Hari, Rouson, Damian W. I., Morris, Karla, Shende, Sameer, and Kassinos, Stavros C. Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study. Egypt: N. p., 2015. Web. doi:10.1155/2015/904983.
Radhakrishnan, Hari, Rouson, Damian W. I., Morris, Karla, Shende, Sameer, & Kassinos, Stavros C. Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study. Egypt. https://doi.org/10.1155/2015/904983
Radhakrishnan, Hari, Rouson, Damian W. I., Morris, Karla, Shende, Sameer, and Kassinos, Stavros C. 2015. "Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study". Egypt. https://doi.org/10.1155/2015/904983.
@article{osti_1197693,
title = {Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study},
author = {Radhakrishnan, Hari and Rouson, Damian W. I. and Morris, Karla and Shende, Sameer and Kassinos, Stavros C.},
abstractNote = {This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.},
doi = {10.1155/2015/904983},
journal = {Scientific Programming},
volume = {2015},
place = {Egypt},
year = {2015}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1155/2015/904983

Citation Metrics:
Cited by: 1 work
Citation information provided by
Web of Science
