U.S. Department of Energy
Office of Scientific and Technical Information

Hash based parallel algorithms for mining association rules

Conference ·
OSTI ID:535539
In this paper, we propose four parallel algorithms (NPA, SPA, HPA and HPA-ELD) for mining association rules on shared-nothing parallel machines, with the aim of improving mining performance. In NPA, the candidate itemsets are simply copied to all processors, which can lead to memory overflow for large transaction databases. The remaining three algorithms partition the candidate itemsets over the processors. If they are partitioned simply (SPA), the transaction data has to be broadcast to all processors. HPA partitions the candidate itemsets using a hash function, which eliminates broadcasting and also reduces the comparison workload significantly. HPA-ELD fully utilizes the available memory space by detecting the extremely large itemsets and copying them, which is also very effective at flattening the load over the processors. We implemented these algorithms in a shared-nothing environment. Performance evaluations show that the best algorithm, HPA-ELD, attains good linearity in speedup ratio and is effective at handling skew.
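The hash partitioning idea behind HPA can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the processors are simulated as a list of dictionaries, the function names (`owner`, `hpa_count`, `merge_counts`) are hypothetical, and CRC32 is an assumed stand-in for whatever hash function the authors used. Each candidate itemset lives on exactly one simulated processor, and each k-subset of a transaction is routed only to that processor instead of being broadcast.

```python
from itertools import combinations
from zlib import crc32


def owner(itemset, num_procs):
    """Map a sorted itemset to a processor id (CRC32 is an assumed hash)."""
    key = ",".join(map(str, itemset)).encode()
    return crc32(key) % num_procs


def hpa_count(transactions, candidates, k, num_procs):
    """Count support of candidate k-itemsets with hash-partitioned tables."""
    # Each simulated processor holds only the candidates hashed to it.
    local = [dict() for _ in range(num_procs)]
    for cand in candidates:
        cand = tuple(sorted(cand))
        local[owner(cand, num_procs)][cand] = 0

    # Every k-subset of a transaction goes to a single owner (no broadcast),
    # and is compared only against that owner's partition of the candidates.
    for transaction in transactions:
        for subset in combinations(sorted(set(transaction)), k):
            table = local[owner(subset, num_procs)]
            if subset in table:
                table[subset] += 1
    return local


def merge_counts(local):
    """Gather the per-processor counts into one dictionary."""
    merged = {}
    for table in local:
        merged.update(table)
    return merged


transactions = [[1, 2, 3], [1, 2], [2, 3], [1, 3, 4]]
candidates = [(1, 2), (2, 3), (1, 3), (3, 4)]
partitions = hpa_count(transactions, candidates, k=2, num_procs=3)
support = merge_counts(partitions)
```

Because the same hash decides both where a candidate is stored and where a transaction subset is sent, the merged counts do not depend on the number of processors. The sketch does not model the skew handling of HPA-ELD.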
Report Number(s):
CONF-961209--
Country of Publication:
United States
Language:
English

Similar Records

Hashing strategies for the Cray XMT.
Conference · Thu Apr 01 00:00:00 EDT 2010 · OSTI ID:1001006

Scalable in-memory RDFS closure on billions of triples.
Conference · Tue Jun 01 00:00:00 EDT 2010 · OSTI ID:1021116

Numerical integration of hydraulic network conservation equations on MIMD parallel computers
Journal Article · Fri Mar 31 23:00:00 EST 1989 · Nuclear Science and Engineering; (USA) · OSTI ID:5395751