Title: SharP Data Constructs: Data Constructs to Enable Data-Centric Computing

Abstract

Extreme-scale applications (i.e., Big-Compute) are becoming increasingly data-intensive, i.e., producing and consuming increasingly large amounts of data. The HPC systems traditionally used for these applications are now used for Big-Data applications such as data analytics, social network analysis, machine learning, and genomics. As a consequence of these trends, the system architecture should be flexible and data-centric. This can already be witnessed in the pre-exascale systems with TBs of on-node hierarchical and heterogeneous memories, PBs of system memory, low-latency, high-throughput networks, and many threaded cores. As such, the pre-exascale systems suit the needs of both Big-Compute and Big-Data applications. Though the system architecture is flexible enough to support both Big-Compute and Big-Data, we argue there is a software gap. Particularly, we need data-centric abstractions to leverage the full potential of the system, i.e., there is a need for native support for data resilience, the ability to express data locality and affinity, mechanisms to reduce data movement, the ability to share data, and abstractions to express User's data usage and data access patterns. In this paper, we (i) show the need for taking a holistic approach towards data-centric abstractions, (ii) show how these approaches were realized in the SHARed data-structure centric Programming abstraction (SharP) library, a data-structure centric programming abstraction, and (iii) apply these approaches to a variety of applications that demonstrate its usefulness. Particularly, we apply these approaches to QMCPack and the Graph500 benchmark and demonstrate the advantages of this approach on extreme-scale systems.

Authors:
Aderholdt, William Ferrol [1]; Gorentla Venkata, Manjunath [1]; Parchman, Zachary [2]
  1. ORNL
  2. Tennessee Technological University (TTU)
Publication Date:
Research Org.:
Oak Ridge National Laboratory, Oak Ridge Leadership Computing Facility (OLCF); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1462834
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: The 26th Euromicro International Conference on Parallel, Distributed, and Network-based Processing (PDP) 2018 - Cambridge, United Kingdom - 3/21/2018-3/23/2018
Country of Publication:
United States
Language:
English

Citation Formats

Aderholdt, William Ferrol, Gorentla Venkata, Manjunath, and Parchman, Zachary. SharP Data Constructs: Data Constructs to Enable Data-Centric Computing. United States: N. p., 2018. Web. doi:10.1109/PDP2018.2018.00031.
Aderholdt, William Ferrol, Gorentla Venkata, Manjunath, & Parchman, Zachary. SharP Data Constructs: Data Constructs to Enable Data-Centric Computing. United States. doi:10.1109/PDP2018.2018.00031.
Aderholdt, William Ferrol, Gorentla Venkata, Manjunath, and Parchman, Zachary. 2018. "SharP Data Constructs: Data Constructs to Enable Data-Centric Computing". United States. doi:10.1109/PDP2018.2018.00031. https://www.osti.gov/servlets/purl/1462834.
@article{osti_1462834,
title = {SharP Data Constructs: Data Constructs to Enable Data-Centric Computing},
author = {Aderholdt, William Ferrol and Gorentla Venkata, Manjunath and Parchman, Zachary},
abstractNote = {Extreme-scale applications (i.e., Big-Compute) are becoming increasingly data-intensive, i.e., producing and consuming increasingly large amounts of data. The HPC systems traditionally used for these applications are now used for Big-Data applications such as data analytics, social network analysis, machine learning, and genomics. As a consequence of these trends, the system architecture should be flexible and data-centric. This can already be witnessed in the pre-exascale systems with TBs of on-node hierarchical and heterogeneous memories, PBs of system memory, low-latency, high-throughput networks, and many threaded cores. As such, the pre-exascale systems suit the needs of both Big-Compute and Big-Data applications. Though the system architecture is flexible enough to support both Big-Compute and Big-Data, we argue there is a software gap. Particularly, we need data-centric abstractions to leverage the full potential of the system, i.e., there is a need for native support for data resilience, the ability to express data locality and affinity, mechanisms to reduce data movement, the ability to share data, and abstractions to express User's data usage and data access patterns. In this paper, we (i) show the need for taking a holistic approach towards data-centric abstractions, (ii) show how these approaches were realized in the SHARed data-structure centric Programming abstraction (SharP) library, a data-structure centric programming abstraction, and (iii) apply these approaches to a variety of applications that demonstrate its usefulness. Particularly, we apply these approaches to QMCPack and the Graph500 benchmark and demonstrate the advantages of this approach on extreme-scale systems.},
doi = {10.1109/PDP2018.2018.00031},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2018},
month = {3}
}
