U.S. Department of Energy
Office of Scientific and Technical Information

Scalable in-memory RDFS closure on billions of triples.

Conference · OSTI ID: 1021116

We present an RDFS closure algorithm, designed and implemented specifically for the Cray XMT supercomputer, that achieves inference rates of 13 million inferences per second on the largest system configuration we used. The Cray XMT, with its large global memory (4 TB for our experiments), permits a conceptually straightforward algorithm that is fundamentally a series of operations on a shared hash table. Each thread is given a partition of the triple data to process, a dedicated copy of the ontology to apply to that data, and a reference to the hash table into which it inserts inferred triples. Because the hash table is global, the algorithm avoids a common obstacle on distributed-memory machines: the creation of duplicate triples. On LUBM data sets ranging from 1.3 billion to 5.3 billion triples, we obtain nearly linear speedup except in two portions: file I/O, which can be ameliorated with additional service nodes, and data structure initialization, which requires nearly constant time for runs involving 32 or more processors.
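The thread, ontology, and hash table arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Cray XMT implementation: a lock-protected `set` stands in for the shared hash table, only a subset of the RDFS rules (rdfs2, rdfs3, rdfs7, rdfs9, with subclass and subproperty hierarchies closed up front) is applied, and every name in it (`SharedTripleTable`, `transitive_closure`, `infer`, `rdfs_closure`, and the ontology dictionary layout) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from copy import deepcopy
from threading import Lock

TYPE = "rdf:type"


class SharedTripleTable:
    """Stand-in for the global hash table: a lock-protected set (assumption)."""

    def __init__(self, triples=()):
        self._lock = Lock()
        self._triples = set(triples)

    def insert(self, triple):
        """Insert a triple; return True only if it was not already present."""
        with self._lock:
            if triple in self._triples:
                return False
            self._triples.add(triple)
            return True

    def snapshot(self):
        with self._lock:
            return set(self._triples)


def transitive_closure(edges):
    """Map each node in `edges` to every ancestor reachable from it."""
    closure = {}
    for node in edges:
        seen, stack = set(), list(edges[node])
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.add(parent)
                stack.extend(edges.get(parent, ()))
        closure[node] = seen
    return closure


def infer(partition, ontology, table):
    """Apply the rule subset to one partition; return how many new triples were inserted."""
    inserted = 0
    for s, p, o in partition:
        # rdfs7: the statement also holds for every super-property of p.
        props = {p} | ontology["super_props"].get(p, set())
        derived = [(s, q, o) for q in props if q != p]
        typed = []
        for q in props:
            # rdfs2 (domain) types the subject, rdfs3 (range) types the object.
            typed += [(s, c) for c in ontology["domain"].get(q, ())]
            typed += [(o, c) for c in ontology["range"].get(q, ())]
        if p == TYPE:
            typed.append((s, o))
        for x, c in typed:
            derived.append((x, TYPE, c))
            # rdfs9: membership propagates to every superclass of c.
            derived += [(x, TYPE, d) for d in ontology["super_classes"].get(c, ())]
        # The shared table rejects duplicates, so each triple is counted once.
        inserted += sum(table.insert(t) for t in derived)
    return inserted


def rdfs_closure(triples, ontology, workers=4):
    """Partition the data, give each worker its own ontology copy and the shared table."""
    triples = list(triples)
    table = SharedTripleTable(triples)
    partitions = [triples[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda part: infer(part, deepcopy(ontology), table), partitions))
    return table.snapshot(), sum(counts)
```

Because `insert` reports success only for the first copy of a triple, the inference count is the same regardless of how the data is partitioned or which worker derives a triple first. In CPython the interpreter lock means this sketch shows the structure of the approach rather than any real parallel speedup.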

Research Organization: Sandia National Laboratories
Sponsoring Organization: USDOE
DOE Contract Number: AC04-94AL85000
OSTI ID: 1021116
Report Number(s): SAND2010-4195C
Country of Publication: United States
Language: English
