U.S. Department of Energy
Office of Scientific and Technical Information

A Fine-grained Asynchronous Bulk Synchronous parallelism model for PGAS applications

Journal Article · Journal of Computational Science
 [1];  [2];  [3];  [2];  [2]
  1. Intel Corporation, Austin, TX (United States)
  2. Georgia Institute of Technology, Atlanta, GA (United States)
  3. Meta, Menlo Park, CA (United States)
The Partitioned Global Address Space (PGAS) model is well suited for executing irregular applications on cluster-based systems, due to its efficient support for short, one-sided messages. Separately, the actor model has been gaining popularity as a productive asynchronous message-passing approach for distributed objects in enterprise and cloud computing platforms, typically implemented in languages such as Erlang, Scala, or Rust. To the best of our knowledge, there has been no past work on using the actor model to deliver both productivity and scalability to irregular PGAS applications with a large number of small messages. In this paper, we introduce a new programming system for PGAS applications in which point-to-point remote operations can be expressed as fine-grained asynchronous actor messages. In our approach, the programmer does not need to worry about programming complexities related to message aggregation and termination detection. Our approach can be viewed as extending the classical Bulk Synchronous Parallelism model with fine-grained asynchronous communications within a phase or superstep. We believe that our approach offers a desirable point in the productivity-performance space for PGAS applications, with more scalable performance and higher productivity relative to past approaches. Specifically, for seven irregular mini-applications from the Bale Kernels and three graph kernels executed on 2048 cores of the NERSC Cori system, our approach shows a geometric mean performance improvement of ≥ 20X relative to standard PGAS versions (UPC and OpenSHMEM) while maintaining comparable productivity to those versions.
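To illustrate the pattern the abstract describes, the following is a minimal shared-memory sketch in C++ of fine-grained asynchronous BSP: each simulated PGAS rank posts many small actor-style messages to other ranks' mailboxes during a superstep, and the join acts as the superstep barrier where, in the real system, the runtime would flush aggregated message buffers and detect termination. The rank count, mailbox layout, and message format here are invented for illustration; this is not the paper's actual API or runtime.

// Minimal shared-memory sketch (not the paper's API): simulated PGAS ranks
// exchange many tiny actor-style messages within a superstep; the join below
// plays the role of the superstep barrier, where the real runtime would flush
// aggregated messages and perform termination detection before the next phase.
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    constexpr int kRanks = 4;                       // simulated PEs/ranks
    std::vector<std::vector<int>> mailbox(kRanks);  // one inbox per rank
    std::vector<std::mutex> locks(kRanks);

    // Superstep: every rank sends fine-grained asynchronous messages.
    std::vector<std::thread> workers;
    for (int me = 0; me < kRanks; ++me) {
        workers.emplace_back([&, me] {
            for (int i = 0; i < 8; ++i) {
                int dest = (me + i) % kRanks;           // irregular destinations
                std::lock_guard<std::mutex> g(locks[dest]);
                mailbox[dest].push_back(me * 100 + i);  // tiny "actor send"
            }
        });
    }
    for (auto& w : workers) w.join();               // superstep boundary

    // Next phase: each rank processes whatever arrived during the superstep.
    for (int r = 0; r < kRanks; ++r)
        std::printf("rank %d received %zu messages\n", r, mailbox[r].size());
    return 0;
}

On a real distributed machine, the per-destination queues, aggregation, and the barrier would be handled by the PGAS runtime and the actor layer rather than by threads and joins as in this sketch.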
Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
2422859
Alternate ID(s):
OSTI ID: 1970079
Journal Information:
Journal of Computational Science, Vol. 69, Issue C; ISSN 1877-7503
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

Similar Records

On the Suitability of MPI as a PGAS Runtime
Conference · December 2014 · OSTI ID: 1194324

Graph Algorithms in PGAS: Chapel and UPC++
Conference · September 2019 · OSTI ID: 1580595

OpenSHMEM-UCX: Evaluation of UCX for Implementing OpenSHMEM Programming Model. In: OpenSHMEM 2016: OpenSHMEM and Related Technologies. Enhancing OpenSHMEM for Hybrid Environments
Conference · December 2015 · OSTI ID: 1567413