Massive-scale RDF Processing Using Compressed Bitmap Indexes
Conference
·
OSTI ID:1056553
The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL queries generate complex intermediate join queries; to compute these joins efficiently, we propose a new strategy based on bitmap indexes. We store the RDF data in column-oriented structures as compressed bitmaps along with two dictionaries. This paper makes three new contributions. (i) We present an efficient parallel strategy for parsing the raw RDF data, building dictionaries of unique entities, and creating compressed bitmap indexes of the data. (ii) We utilize the constructed bitmap indexes to efficiently answer SPARQL queries, simplifying the join evaluations. (iii) To quantify the performance impact of using bitmap indexes, we compare our approach to the state-of-the-art triple-store RDF-3X. We find that our bitmap index-based approach to answering queries is up to an order of magnitude faster for a variety of SPARQL queries, on gigascale RDF data sets.
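To make the general idea in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a tiny in-memory triple list, uses a single dictionary rather than the two the abstract mentions, and stands in plain Python integers for the compressed bitmaps. All names (`build_index`, `match`, `rows`, `TRIPLES`) and the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (subject, predicate, object) triples, one per row.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "lbl"),
    ("carol", "worksAt", "lbl"),
]

def build_index(triples):
    """Dictionary-encode terms and build one bitmap per (column, term id)."""
    ids = {}  # term -> integer id (one shared dictionary, an assumption)
    bitmaps = {col: defaultdict(int) for col in "spo"}
    for row, triple in enumerate(triples):
        for col, term in zip("spo", triple):
            term_id = ids.setdefault(term, len(ids))
            bitmaps[col][term_id] |= 1 << row  # set bit number `row`
    return ids, bitmaps

def match(ids, bitmaps, n_rows, s=None, p=None, o=None):
    """Return a bitmap of rows matching the bound positions of a pattern."""
    result = (1 << n_rows) - 1  # start with every row selected
    for col, term in (("s", s), ("p", p), ("o", o)):
        if term is None:
            continue  # unbound position: no constraint
        if term not in ids:
            return 0  # unknown term: nothing can match
        result &= bitmaps[col].get(ids[term], 0)  # bitwise AND of constraints
    return result

def rows(bitmap):
    """List the row numbers whose bit is set in `bitmap`."""
    out, row = [], 0
    while bitmap:
        if bitmap & 1:
            out.append(row)
        bitmap >>= 1
        row += 1
    return out

ids, bitmaps = build_index(TRIPLES)
knows_rows = rows(match(ids, bitmaps, len(TRIPLES), p="knows"))
alice_work = rows(match(ids, bitmaps, len(TRIPLES), s="alice", p="worksAt"))
```

In this sketch, a triple pattern with several bound positions reduces to bitwise AND operations over per-column bitmaps, which is the kind of simplification the abstract attributes to bitmap indexes. A real system at gigascale would need compressed bitmaps and parallel index construction, neither of which is modeled here.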
- Research Organization:
- Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
- Sponsoring Organization:
- Computational Research Division
- DOE Contract Number:
- AC02-05CH11231
- OSTI ID:
- 1056553
- Report Number(s):
- LBNL-5316E
- Country of Publication:
- United States
- Language:
- English
Similar Records
Efficient Joins with Compressed Bitmap Indexes
Conference
·
Wed Aug 19 00:00:00 EDT 2009
·
OSTI ID:982896
An efficient compression scheme for bitmap indices
Technical Report
·
Tue Apr 13 00:00:00 EDT 2004
·
OSTI ID:841308
Breaking the Curse of Cardinality on Bitmap Indexes
Conference
·
Fri Apr 04 00:00:00 EDT 2008
·
OSTI ID:927150