U.S. Department of Energy
Office of Scientific and Technical Information

An efficient compression scheme for bitmap indices

Technical Report · DOI: https://doi.org/10.2172/841308 · OSTI ID: 841308

When using an out-of-core indexing method to answer a query, it is generally assumed that the I/O cost dominates the overall query response time. Because of this, most research on indexing methods concentrates on reducing the sizes of indices. For bitmap indices, compression has been used for this purpose. However, in most cases, operations on these compressed bitmaps, mostly bitwise logical operations such as AND, OR, and NOT, spend more time in CPU than in I/O. To speed up these operations, a number of specialized bitmap compression schemes have been developed, the best known of which is the byte-aligned bitmap code (BBC). They are usually faster at performing logical operations than general-purpose compression schemes, but the time spent in CPU still dominates the total query response time. To reduce the query response time, we designed a CPU-friendly scheme named the word-aligned hybrid (WAH) code.

In this paper, we prove that the sizes of WAH compressed bitmap indices are about two words per row for a large range of attributes. This size is smaller than typical sizes of commonly used indices, such as a B-tree. Therefore, WAH compressed indices are appropriate not only for low cardinality attributes but also for high cardinality attributes. In the worst case, the time to operate on compressed bitmaps is proportional to the total size of the bitmaps involved. The total size of the bitmaps required to answer a query on one attribute is proportional to the number of hits. Together, these results indicate that WAH compressed bitmap indices are optimal.

To verify their effectiveness, we generated bitmap indices for four different datasets and measured the response time of many range queries. Tests confirm that the sizes of compressed bitmap indices are indeed smaller than those of B-tree indices, and that query processing with WAH compressed indices is much faster than with BBC compressed indices, projection indices, and B-tree indices. In addition, we verified that the average query response time is proportional to the index size. This indicates that the compressed bitmap indices are efficient for very large datasets.
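
The abstract does not spell out the WAH word layout, so the following is only a minimal sketch based on the commonly described 32-bit form of the scheme: each word either holds 31 literal bits (most significant bit 0) or is a fill word (most significant bit 1) whose second bit gives the fill value and whose low 30 bits count how many consecutive 31-bit groups share that value. The function names `wah_encode` and `wah_decode`, and the zero-padding of the final partial group, are assumptions made for illustration and are not taken from the report.

```python
GROUP = 31                   # payload bits carried by one 32-bit word
FILL_FLAG = 1 << 31          # most significant bit set: word is a fill
FILL_BIT = 1 << 30           # second bit: value (0 or 1) of the fill
MAX_RUN = (1 << 30) - 1      # largest group count a fill word can hold
ALL_ONES = (1 << GROUP) - 1  # a 31-bit group consisting only of 1s


def wah_encode(bits):
    """Encode a sequence of 0/1 values; returns (words, number_of_bits)."""
    bits = list(bits)
    n = len(bits)
    # Simplification (assumption): zero-pad the tail to a whole group.
    padded = bits + [0] * (-n % GROUP)
    words = []
    for i in range(0, len(padded), GROUP):
        value = 0
        for b in padded[i:i + GROUP]:
            value = (value << 1) | b
        if value in (0, ALL_ONES):
            fill = FILL_FLAG | (FILL_BIT if value else 0)
            last = words[-1] if words else 0
            # Extend the previous word if it is a fill of the same value.
            if (last & ~MAX_RUN) == fill and (last & MAX_RUN) < MAX_RUN:
                words[-1] = last + 1
            else:
                words.append(fill | 1)
        else:
            words.append(value)  # literal word, most significant bit is 0
    return words, n


def wah_decode(words, n):
    """Expand the words back into a list of n bits."""
    bits = []
    for w in words:
        if w & FILL_FLAG:
            bit = 1 if w & FILL_BIT else 0
            bits.extend([bit] * (GROUP * (w & MAX_RUN)))
        else:
            bits.extend((w >> s) & 1 for s in range(GROUP - 1, -1, -1))
    return bits[:n]
```

Because every unit is a whole machine word, logical operations such as AND and OR can walk two encoded bitmaps word by word without bit-level unpacking, which is the CPU-friendliness the abstract refers to. A production implementation would also handle the trailing partial group separately rather than padding it.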

Research Organization:
Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
Sponsoring Organization:
USDOE Director. Office of Science. Office of Advanced Scientific Computing Research (US)
DOE Contract Number:
AC03-76SF00098
OSTI ID:
841308
Report Number(s):
LBNL--49626
Country of Publication:
United States
Language:
English

Similar Records

Compressing bitmap indexes for faster search operations
Conference · Thu Apr 25 00:00:00 EDT 2002 · OSTI ID:795969

FastBit Reference Manual
Technical Report · Thu Aug 02 00:00:00 EDT 2007 · OSTI ID:913270

On the performance of bitmap indices for high cardinality attributes
Conference · Thu Mar 04 23:00:00 EST 2004 · OSTI ID:822860