Scalable in-memory RDFS closure on billions of triples.
Conference
OSTI ID:1021116
- Cray, Inc, Seattle, WA
We present an RDFS closure algorithm, designed and implemented specifically for the Cray XMT supercomputer, that achieves 13 million inferences per second on the largest system configuration we used. The Cray XMT's large global memory (4 TB in our experiments) permits a conceptually straightforward algorithm that is fundamentally a series of operations on a shared hash table. Each thread is given a partition of the triple data to process, a dedicated copy of the ontology to apply to that data, and a reference to the hash table into which it inserts inferred triples. Because the hash table is global, the algorithm avoids a common obstacle on distributed-memory machines: the creation of duplicate triples. On LUBM data sets ranging from 1.3 billion to 5.3 billion triples, we obtain nearly linear speedup except in two portions: file I/O, which can be ameliorated with additional service nodes, and data structure initialization, which requires nearly constant time for runs involving 32 or more processors.
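The shared-table scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Cray XMT implementation. It assumes a small subset of the RDFS rules (rdfs2, rdfs3, rdfs7, rdfs9), an ontology whose subclass and subproperty hierarchies have been transitively closed beforehand, and a lock-protected set standing in for the global hash table. All names (SharedTripleSet, transitive_closure, process_partition, rdfs_closure) are hypothetical, and Python threads will not reproduce the parallel speedup reported above.

```python
import copy
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

TYPE = "rdf:type"


class SharedTripleSet:
    """Stand-in for the global hash table: inserting a duplicate is a no-op."""

    def __init__(self):
        self._triples = set()
        self._lock = Lock()

    def insert(self, triple):
        with self._lock:
            self._triples.add(triple)

    def snapshot(self):
        with self._lock:
            return set(self._triples)


def transitive_closure(edges):
    """Map each term to all of its ancestors, given direct parent edges."""
    def ancestors(term, seen):
        for parent in edges.get(term, ()):
            if parent not in seen:
                seen.add(parent)
                ancestors(parent, seen)
        return seen

    return {term: ancestors(term, set()) for term in edges}


def process_partition(partition, ontology, table):
    """Apply the assumed rule subset to one partition of triples."""
    def emit_types(resource, cls):
        # rdfs9: a member of a class is a member of every superclass.
        table.insert((resource, TYPE, cls))
        for parent in ontology["sub_class"].get(cls, ()):
            table.insert((resource, TYPE, parent))

    for s, p, o in partition:
        # rdfs7: the triple also holds for every superproperty of p.
        for q in {p} | ontology["sub_property"].get(p, set()):
            table.insert((s, q, o))
            if q == TYPE:
                emit_types(s, o)
            for cls in ontology["domain"].get(q, ()):  # rdfs2
                emit_types(s, cls)
            for cls in ontology["range"].get(q, ()):  # rdfs3
                emit_types(o, cls)


def rdfs_closure(triples, ontology, workers=4):
    """Partition the data, give each worker its own ontology copy, share one table."""
    table = SharedTripleSet()
    partitions = [triples[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(process_partition, part, copy.deepcopy(ontology), table)
            for part in partitions
        ]
        for future in futures:
            future.result()  # re-raise any worker exception
    return table.snapshot()
```

In this sketch, two workers that derive the same triple both call insert, and the set keeps only one copy, which mirrors how a single global table sidesteps the duplicate-elimination step that distributed-memory approaches need.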
- Research Organization: Sandia National Laboratories
- Sponsoring Organization: USDOE
- DOE Contract Number: AC04-94AL85000
- OSTI ID: 1021116
- Report Number(s): SAND2010-4195C
- Country of Publication: United States
- Language: English