Title: Query optimization for graph analytics on linked data using SPARQL

Triplestores that support query languages such as SPARQL are emerging as the preferred, scalable solution for representing data and metadata as massive heterogeneous graphs using Semantic Web standards. With increasing adoption, the desire to conduct graph-theoretic mining and exploratory analysis has also increased. To address that desire, this paper presents a solution that combines graph theory and the Semantic Web. We present software that can analyze Linked Data using graph operations such as counting triangles, finding eccentricity, testing connectedness, and computing PageRank directly on triplestores via the SPARQL interface. We describe how we optimized the performance of SPARQL-based implementations of these popular graph algorithms by reducing space overhead, simplifying iterative complexity, and removing redundant computations through an understanding of query plans. Our optimized approach shows significant performance gains on triplestores hosted on stand-alone workstations as well as on hardware-optimized scalable supercomputers such as the Cray XMT.
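
To make the idea of running a graph operation through the SPARQL interface concrete, here is a minimal sketch of triangle counting. It is not the implementation from the report. The predicate IRI `http://example.org/linksTo`, the helper names, and the assumption that every undirected edge is stored as two triples (one per direction) are all hypothetical choices made for illustration. The Python function evaluates the same pattern in memory on a toy graph so the query logic can be checked without a triplestore.

```python
# Hypothetical predicate IRI; the record does not name the vocabulary used.
EDGE = "http://example.org/linksTo"

# SPARQL sketch: match three mutually linked vertices and keep only the
# ordering a < b < c so that each triangle is counted exactly once.
# Assumes each undirected edge is stored in both directions.
TRIANGLE_QUERY = f"""
SELECT (COUNT(*) AS ?triangles)
WHERE {{
  ?a <{EDGE}> ?b .
  ?b <{EDGE}> ?c .
  ?a <{EDGE}> ?c .
  FILTER (STR(?a) < STR(?b) && STR(?b) < STR(?c))
}}
"""


def undirected(pairs):
    """Build (subject, predicate, object) triples, one per edge direction."""
    triples = []
    for u, v in pairs:
        triples.append((u, EDGE, v))
        triples.append((v, EDGE, u))
    return triples


def count_triangles(triples):
    """Evaluate the same graph pattern as TRIANGLE_QUERY in memory."""
    edges = {(s, o) for s, p, o in triples if p == EDGE}
    successors = {}
    for s, o in edges:
        successors.setdefault(s, set()).add(o)
    count = 0
    for a, b in edges:
        if not a < b:
            continue  # mirrors STR(?a) < STR(?b)
        for c in successors.get(b, ()):
            # mirrors STR(?b) < STR(?c) and the closing edge ?a -> ?c
            if b < c and (a, c) in edges:
                count += 1
    return count


if __name__ == "__main__":
    toy = undirected([("n1", "n2"), ("n2", "n3"), ("n1", "n3"), ("n3", "n4")])
    print(count_triangles(toy))
```

The ordering filter is one simple way to avoid counting the same triangle several times. Without it, a graph stored with both edge directions would match each triangle once per permutation of its vertices.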
Author affiliation (all authors): Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Resource Type: Technical Report
Research Org: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org: USDOE Laboratory Directed Research and Development (LDRD) Program
Country of Publication: United States
Subject: urika; SPARQL; big data; semantic web