Composing Data Parallel Code for a SPARQL Graph Engine

Castellana, Vito G.; Tumeo, Antonino; Villa, Oreste; Haglin, David J.; Feo, John

doi:10.1109/SocialCom.2013.104


Abstract

Big data analytics processes large amounts of data to extract knowledge from it. Semantic databases are big data applications that adopt the Resource Description Framework (RDF) to structure metadata through a graph-based representation. This representation provides several benefits, such as the ability to perform in-memory processing with large amounts of parallelism. SPARQL is a language used to perform queries on RDF-structured data through graph matching. In this paper we present a tool that automatically translates SPARQL queries to parallel graph crawling and graph matching operations. The tool also supports complex SPARQL constructs, which require more than basic graph matching for their implementation. The tool generates parallel code annotated with OpenMP pragmas for x86 shared-memory multiprocessors (SMPs). Compared with commercial database systems such as Virtuoso, our approach reduces the memory occupied by join operations and provides higher performance. We show the scaling of the automatically generated graph-matching code on a 48-core SMP.
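To make the idea in the abstract concrete, here is a minimal sketch of what OpenMP-annotated graph-matching code for a two-pattern SPARQL query might look like. It is not the output of the authors' tool: the triple layout, the integer term encoding, the toy graph, and the function name `count_knows_works_at` are all assumptions made for illustration. The sketch matches the second triple pattern inside the loop over the first, so no intermediate join table is materialized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dictionary-encoded RDF terms (illustrative, not from the paper). */
enum { ALICE, BOB, CAROL, PNNL, ACME, KNOWS, WORKS_AT };

typedef struct { int s, p, o; } Triple;   /* subject, predicate, object */

/* Toy in-memory graph stored as a flat triple array. */
static const Triple GRAPH[] = {
    { ALICE, KNOWS,    BOB   },
    { ALICE, KNOWS,    CAROL },
    { BOB,   KNOWS,    CAROL },
    { BOB,   WORKS_AT, PNNL  },
    { CAROL, WORKS_AT, ACME  },
};
static const size_t GRAPH_LEN = sizeof GRAPH / sizeof GRAPH[0];

/*
 * Counts the solutions of the basic graph pattern
 *   SELECT ?x ?z WHERE { ?x :knows ?y . ?y :worksAt ?z }
 * The outer loop binds ?x and ?y from the first pattern; the inner loop
 * extends each binding through the second pattern. Outer iterations are
 * independent, so they are shared among threads and the per-thread
 * counts are combined with a sum reduction.
 */
long count_knows_works_at(const Triple *g, size_t n)
{
    assert(g != NULL || n == 0);
    long matches = 0;

    #pragma omp parallel for reduction(+:matches) schedule(dynamic)
    for (long i = 0; i < (long)n; ++i) {
        if (g[i].p != KNOWS)
            continue;                       /* first pattern does not match */
        int y = g[i].o;                     /* binding for ?y */
        for (size_t j = 0; j < n; ++j) {
            if (g[j].p == WORKS_AT && g[j].s == y)
                ++matches;                  /* second pattern joins on ?y */
        }
    }
    return matches;
}
```

Calling `count_knows_works_at(GRAPH, GRAPH_LEN)` on the toy graph counts three solutions. Built with an OpenMP-enabled compiler (for example `gcc -fopenmp`), the outer loop runs in parallel; without OpenMP support the pragma is ignored and the same code runs serially. A real engine would replace the linear inner scan with an index lookup on the subject.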

Authors: Castellana, Vito G.; Tumeo, Antonino; Villa, Oreste; Haglin, David J.; Feo, John

Publication Date: 2013-09-08

Research Org.: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

Sponsoring Org.: USDOE

OSTI Identifier: 1123246

Report Number(s): PNNL-SA-96193; 400470000

DOE Contract Number: AC05-76RL01830

Resource Type: Conference

Resource Relation: Conference: IEEE International Conference on Social Computing (SocialCom 2013), September 8-14, 2013, Alexandria, Virginia, 691-699

Country of Publication: United States

Language: English

Subject: SPARQL; Big data; SPARQL-to-C

Citation Formats

Castellana, Vito G., Tumeo, Antonino, Villa, Oreste, Haglin, David J., and Feo, John. Composing Data Parallel Code for a SPARQL Graph Engine. United States: N. p., 2013. Web. doi:10.1109/SocialCom.2013.104.

Castellana, Vito G., Tumeo, Antonino, Villa, Oreste, Haglin, David J., & Feo, John. Composing Data Parallel Code for a SPARQL Graph Engine. United States. https://doi.org/10.1109/SocialCom.2013.104

Castellana, Vito G., Tumeo, Antonino, Villa, Oreste, Haglin, David J., and Feo, John. 2013. "Composing Data Parallel Code for a SPARQL Graph Engine". United States. https://doi.org/10.1109/SocialCom.2013.104.


                    
@inproceedings{osti_1123246,
  title        = {Composing Data Parallel Code for a SPARQL Graph Engine},
  author       = {Castellana, Vito G. and Tumeo, Antonino and Villa, Oreste and Haglin, David J. and Feo, John},
  abstractNote = {Big data analytics processes large amounts of data to extract knowledge from it. Semantic databases are big data applications that adopt the Resource Description Framework (RDF) to structure metadata through a graph-based representation. This representation provides several benefits, such as the ability to perform in-memory processing with large amounts of parallelism. SPARQL is a language used to perform queries on RDF-structured data through graph matching. In this paper we present a tool that automatically translates SPARQL queries to parallel graph crawling and graph matching operations. The tool also supports complex SPARQL constructs, which require more than basic graph matching for their implementation. The tool generates parallel code annotated with OpenMP pragmas for x86 shared-memory multiprocessors (SMPs). Compared with commercial database systems such as Virtuoso, our approach reduces the memory occupied by join operations and provides higher performance. We show the scaling of the automatically generated graph-matching code on a 48-core SMP.},
  booktitle    = {IEEE International Conference on Social Computing (SocialCom 2013)},
  pages        = {691--699},
  doi          = {10.1109/SocialCom.2013.104},
  url          = {https://www.osti.gov/biblio/1123246},
  place        = {United States},
  year         = {2013},
  month        = sep
}


Conference:

https://doi.org/10.1109/SocialCom.2013.104


Similar records in OSTI.GOV collections:

Efficient Synthesis of Graph Methods: a Dynamically Scheduled Architecture

Conference Minutoli, Marco; Castellana, Vito; Tumeo, Antonino; ...

RDF databases naturally map to a graph representation and employ languages, such as SPARQL, that implement queries as graph pattern matching routines. Graph methods exhibit irregular behavior: they present unpredictable, fine-grained data accesses and are synchronization intensive. Graph data structures expose large amounts of dynamic parallelism, but are difficult to partition without generating load imbalance. In this paper, we present a novel architecture to improve the synthesis of graph methods. Our design addresses the issues of these algorithms with two components: a Dynamic Task Scheduler (DTS), which reduces load imbalance and maximizes resource utilization, ...
https://doi.org/10.1145/2966986.2967030

High Level Synthesis of RDF Queries for Graph Analytics

Conference Castellana, Vito; Minutoli, Marco; Morari, Alessandro; ...

In this paper we present a set of techniques that enable the synthesis of efficient custom accelerators for memory-intensive, irregular applications. To address the challenges of irregular applications (large memory footprints, unpredictable fine-grained data accesses, and high synchronization intensity) and exploit their opportunities (thread-level parallelism, memory-level parallelism), we propose a novel accelerator design that takes advantage of an adaptive and Distributed Controller (DC) architecture and a Memory Interface (MI) that supports parallel memory subsystems. Among the multitude of algorithms that may benefit from our solution, we focus on the acceleration of graph analytics applications, and in particular, on ...
https://doi.org/10.1109/ICCAD.2015.7372587

An Analysis of Multi-type Relational Interactions in FMA Using Graph Motifs with Disjointness Constraints

Conference Zhang, Guo; Luo, Lingyun; Ogbuji, Chime; ...

The interaction of multiple types of relationships among anatomical classes in the Foundational Model of Anatomy (FMA) can provide inferred information valuable for quality assurance. This paper introduces a method called Motif Checking (MOCH) to study the effects of such multi-relation type interactions. MOCH represents patterns of multitype interaction as small labeled sub-graph motifs, whose nodes represent class variables, and labeled edges represent relational types. By representing FMA as an RDF graph and motifs as SPARQL queries, fragments of FMA are automatically obtained as auditing candidates. Leveraging the scalability and reconfigurability of Semantic Web Technology (OWL, RDF and SPARQL) and ...

Enabling Graph Mining in RDF Triplestores using SPARQL for Holistic In-situ Graph Analysis

Journal Article Lee, Sangkeun; Sukumar, Sreenivas; Hong, Seokyong; ... - Expert Systems with Applications

Graph analysis is now considered a promising technique for discovering useful knowledge in data from a new perspective. We envision two dimensions of graph analysis: OnLine Graph Analytic Processing (OLGAP) and Graph Mining (GM), which focus respectively on subgraph pattern matching and automatic knowledge discovery in graphs. Moreover, as these two dimensions aim to solve complex problems in a complementary way, holistic in-situ graph analysis that covers both OLGAP and GM in a single system is critical for minimizing the burdens of operating multiple graph systems and transferring intermediate result sets between those systems. Nevertheless, most existing ...
Cited by 8
https://doi.org/10.1016/j.eswa.2015.11.010

EAGLE: 'EAGLE' Is an Algorithmic Graph Library for Exploration

Software

The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. Today there are no tools to conduct "graph mining" on RDF standard data sets. We address that need through implementation of popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, degree distribution, ...