OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Parallel Graph Environment for Real-World Data Analytics Workflows

Abstract

Economic competitiveness and national security depend increasingly on the insightful analysis of large data sets. The diversity of real-world data sources and analytic workflows imposes challenging hardware and software requirements for parallel graph platforms. The irregular nature of graph methods is not supported well by the deep memory hierarchies of conventional distributed systems, requiring new processor and runtime system designs to tolerate memory and synchronization latencies. Moreover, the efficiency of relational table operations and matrix computations is not attainable when data is stored in common graph data structures. In this paper, we present HAGGLE, a high-performance, scalable data analytics platform. The platform’s hybrid data model supports a variety of distributed, thread-safe data structures, parallel programming constructs, and persistent and streaming data. An abstract runtime layer enables us to map the stack to conventional, distributed computer systems with accelerators. The runtime uses multithreading, active messages, and data aggregation to hide memory and synchronization latencies on large-scale systems.

Authors:
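Castellana, Vito G. [1]; Drocco, Maurizio [1]; Feo, John T. [1]; Firoz, Jesun [2]; Kanewala, Thejaka A. [2]; Lumsdaine, Andrew [1]; Manzano Franco, Joseph B. [1]; Marquez, Andres [1]; Minutoli, Marco [1]; Suetterlein, Joshua D. [1]; Tumeo, Antonino [1]; Zalewski, Marcin J. [1]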
  1. BATTELLE (PACIFIC NW LAB)
  2. Indiana University-Bloomington
Publication Date:
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1591771
Report Number(s):
PNNL-SA-140268
DOE Contract Number:  
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Conference: Design, Automation & Test in Europe Conference & Exhibition (DATE 2019), March 25-29, 2019, Florence, Italy
Country of Publication:
United States
Language:
English
Subject:
Graph Analytics

Citation Formats

Castellana, Vito G., Drocco, Maurizio, Feo, John T., Firoz, Jesun, Kanewala, Thejaka A., Lumsdaine, Andrew, Manzano Franco, Joseph B., Marquez, Andres, Minutoli, Marco, Suetterlein, Joshua D., Tumeo, Antonino, and Zalewski, Marcin J. A Parallel Graph Environment for Real-World Data Analytics Workflows. United States: N. p., 2019. Web. doi:10.23919/DATE.2019.8715196.
Castellana, Vito G., Drocco, Maurizio, Feo, John T., Firoz, Jesun, Kanewala, Thejaka A., Lumsdaine, Andrew, Manzano Franco, Joseph B., Marquez, Andres, Minutoli, Marco, Suetterlein, Joshua D., Tumeo, Antonino, & Zalewski, Marcin J. A Parallel Graph Environment for Real-World Data Analytics Workflows. United States. doi:10.23919/DATE.2019.8715196.
Castellana, Vito G., Drocco, Maurizio, Feo, John T., Firoz, Jesun, Kanewala, Thejaka A., Lumsdaine, Andrew, Manzano Franco, Joseph B., Marquez, Andres, Minutoli, Marco, Suetterlein, Joshua D., Tumeo, Antonino, and Zalewski, Marcin J. "A Parallel Graph Environment for Real-World Data Analytics Workflows". United States. doi:10.23919/DATE.2019.8715196.
@inproceedings{osti_1591771,
title = {A Parallel Graph Environment for Real-World Data Analytics Workflows},
author = {Castellana, Vito G. and Drocco, Maurizio and Feo, John T. and Firoz, Jesun and Kanewala, Thejaka A. and Lumsdaine, Andrew and Manzano Franco, Joseph B. and Marquez, Andres and Minutoli, Marco and Suetterlein, Joshua D. and Tumeo, Antonino and Zalewski, Marcin J.},
abstractNote = {Economic competitiveness and national security depend increasingly on the insightful analysis of large data sets. The diversity of real-world data sources and analytic workflows imposes challenging hardware and software requirements for parallel graph platforms. The irregular nature of graph methods is not supported well by the deep memory hierarchies of conventional distributed systems, requiring new processor and runtime system designs to tolerate memory and synchronization latencies. Moreover, the efficiency of relational table operations and matrix computations is not attainable when data is stored in common graph data structures. In this paper, we present HAGGLE, a high-performance, scalable data analytics platform. The platform’s hybrid data model supports a variety of distributed, thread-safe data structures, parallel programming constructs, and persistent and streaming data. An abstract runtime layer enables us to map the stack to conventional, distributed computer systems with accelerators. The runtime uses multithreading, active messages, and data aggregation to hide memory and synchronization latencies on large-scale systems.},
doi = {10.23919/DATE.2019.8715196},
booktitle = {Design, Automation \& Test in Europe Conference \& Exhibition (DATE 2019), March 25-29, 2019, Florence, Italy},
place = {United States},
year = {2019},
month = {3}
}

