OSTI.GOV
U.S. Department of Energy
Office of Scientific and Technical Information

Title: Block-Parallel Data Analysis with DIY2

Abstract

DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial, parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on benchmark test cases to establish baseline performance for several common patterns and on larger complete analysis codes running on large-scale HPC machines.
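The block-parallel model described in the abstract can be illustrated with a small, self-contained Python sketch. This is not the DIY2 API (DIY2 is a C++ library); every name here (`ToyMaster`, `decompose`, `foreach`, `enqueue`, `exchange`, `dequeue`, `memory_limit`) is hypothetical and chosen only for illustration. The sketch assumes a serial run, models "slow memory" as pickled bytes held in a dictionary, and uses a simple sum-to-root step as a stand-in for a communication pattern. Threads and multiple processes are not modeled.

```python
import pickle


def decompose(data, nblocks):
    """Split `data` into `nblocks` contiguous blocks of near-equal size."""
    n = len(data)
    bounds = [n * i // nblocks for i in range(nblocks + 1)]
    return [{"gid": g, "values": data[bounds[g]:bounds[g + 1]]}
            for g in range(nblocks)]


class ToyMaster:
    """Owns all blocks; keeps at most `memory_limit` of them deserialized."""

    def __init__(self, blocks, memory_limit):
        assert memory_limit >= 1
        self.memory_limit = memory_limit
        self.in_memory = {}  # gid -> block dict (stands in for DRAM)
        self.storage = {b["gid"]: pickle.dumps(b) for b in blocks}  # "disk"
        self.outgoing = []   # (dest_gid, message) pairs awaiting exchange()
        self.incoming = {b["gid"]: [] for b in blocks}
        self.loads = 0       # number of times a block was read from storage

    def load(self, gid):
        """Return block `gid`, evicting a resident block first if memory is full."""
        if gid not in self.in_memory:
            if len(self.in_memory) >= self.memory_limit:
                victim_gid, victim = self.in_memory.popitem()
                self.storage[victim_gid] = pickle.dumps(victim)
            self.in_memory[gid] = pickle.loads(self.storage.pop(gid))
            self.loads += 1
        return self.in_memory[gid]

    def foreach(self, callback):
        """Run `callback(block, master)` once on every block, in gid order."""
        for gid in sorted(set(self.storage) | set(self.in_memory)):
            callback(self.load(gid), self)

    def enqueue(self, dest_gid, message):
        self.outgoing.append((dest_gid, message))

    def exchange(self):
        """Deliver all queued messages to their destination blocks' inboxes."""
        for dest_gid, message in self.outgoing:
            self.incoming[dest_gid].append(message)
        self.outgoing = []

    def dequeue(self, gid):
        messages, self.incoming[gid] = self.incoming[gid], []
        return messages


def local_sum(block, master):
    # Per-block computation: each block sends its partial sum to block 0.
    master.enqueue(0, sum(block["values"]))


def reduce_at_root(block, master):
    # Block 0 combines the partial sums it received; other blocks do nothing.
    if block["gid"] == 0:
        block["total"] = sum(master.dequeue(0))


def run(data, nblocks, memory_limit):
    """Return (global sum, number of block loads) for the given configuration."""
    master = ToyMaster(decompose(data, nblocks), memory_limit)
    master.foreach(local_sum)
    master.exchange()
    master.foreach(reduce_at_root)
    return master.load(0)["total"], master.loads
```

Calling `run(list(range(10)), nblocks=4, memory_limit=4)` keeps every block resident, while `memory_limit=1` forces blocks to be swapped to and from the simulated storage. The user callbacks are identical in both cases and produce the same sum; only the number of loads changes, which is the property the abstract attributes to expressing computation in block-parallel form.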

Authors:
 Morozov, Dmitriy [1]; Peterka, Tom [2]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  2. Argonne National Lab. (ANL), Argonne, IL (United States)
Publication Date:
2017-08-30
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1377403
Report Number(s):
LBNL-1005149
ir:1005149
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Morozov, Dmitriy, and Peterka, Tom. Block-Parallel Data Analysis with DIY2. United States: N. p., 2017. Web. doi:10.2172/1377403.
Morozov, Dmitriy, & Peterka, Tom. (2017). Block-Parallel Data Analysis with DIY2. United States. doi:10.2172/1377403.
Morozov, Dmitriy, and Peterka, Tom. 2017. "Block-Parallel Data Analysis with DIY2". United States. doi:10.2172/1377403. https://www.osti.gov/servlets/purl/1377403.
@article{osti_1377403,
title = {Block-Parallel Data Analysis with DIY2},
author = {Morozov, Dmitriy and Peterka, Tom},
abstractNote = {DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial, parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on benchmark test cases to establish baseline performance for several common patterns and on larger complete analysis codes running on large-scale HPC machines.},
doi = {10.2172/1377403},
place = {United States},
year = {2017},
month = {8}
}
