OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Adaptive Runtime Features For Distributed Graph Algorithms

Abstract

The performance of distributed graph algorithms can benefit greatly from a close rapport between the algorithmic abstraction and the underlying runtime system, which is responsible for scheduling work and exchanging messages. However, because their computation is dynamic and irregular, distributed graph algorithms written in different programming models impose varying degrees of workload pressure on the runtime. To cope with such vastly different workload characteristics, a runtime has to make several trade-offs. One such trade-off arises when the runtime scheduler has to choose whether to execute algorithmic work, progress the network by probing network buffers, or throttle the sending of messages (termed flow control). This trade-off decides between optimizing the throughput of the runtime scheduler, by increasing the rate at which algorithmic work is executed, and reducing the latency of network messages. Another trade-off arises in deciding when to send messages that have been aggregated in buffers (message coalescing). This decision trades latency for network bandwidth, or vice versa. At any instant, such trade-offs emphasize improving either the quantity of work being executed (by maximizing scheduler throughput) or the quality of work (by prioritizing better work). However, encoding static policies for different runtime features (such as flow control and coalescing) can prevent graph algorithms from achieving their full potential and can thus undermine the actual performance of a distributed graph algorithm. In this paper, we investigate runtime support for distributed graph algorithms in the context of two paradigms: variants of the well-known Bulk-Synchronous Parallel model and the asynchronous programming model.
We explore generic runtime features such as message coalescing (aggregation) and flow control, and show that the execution policies of these features need to be adjusted over time to have a positive impact on the execution time of a distributed graph algorithm. Since synchronous and asynchronous graph algorithms have different workload characteristics, not all such runtime features may be good candidates for adaptation. Each of these algorithmic paradigms may require a different set of features to be adapted over time. We demonstrate which set of features can be useful in each case to achieve the right balance of work in the runtime layer. Existing implementations of different graph algorithms can benefit from adopting dynamic policies in the underlying runtime.

Authors:
 Firoz, Jesun S. [1]; Zalewski, Marcin J. [1]; Suetterlein, Joshua D. [1]; Lumsdaine, Andrew [1]
  1. BATTELLE (PACIFIC NW LAB)
Publication Date:
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1515042
Report Number(s):
PNNL-SA-138864
DOE Contract Number:  
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Conference: IEEE 25th International Conference on High Performance Computing (HiPC 2018), December 17-20, 2018, Bengaluru, India
Country of Publication:
United States
Language:
English

Citation Formats

Firoz, Jesun S., Zalewski, Marcin J., Suetterlein, Joshua D., and Lumsdaine, Andrew. Adaptive Runtime Features For Distributed Graph Algorithms. United States: N. p., 2018. Web. doi:10.1109/HiPC.2018.00018.
Firoz, Jesun S., Zalewski, Marcin J., Suetterlein, Joshua D., & Lumsdaine, Andrew. Adaptive Runtime Features For Distributed Graph Algorithms. United States. doi:10.1109/HiPC.2018.00018.
Firoz, Jesun S., Zalewski, Marcin J., Suetterlein, Joshua D., and Lumsdaine, Andrew. "Adaptive Runtime Features For Distributed Graph Algorithms". United States. doi:10.1109/HiPC.2018.00018.
@article{osti_1515042,
title = {Adaptive Runtime Features For Distributed Graph Algorithms},
author = {Firoz, Jesun S. and Zalewski, Marcin J. and Suetterlein, Joshua D. and Lumsdaine, Andrew},
doi = {10.1109/HiPC.2018.00018},
place = {United States},
year = {2018},
month = {12}
}

Conference:
Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.
