OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Large Scale Frequent Pattern Mining using MPI One-Sided Model

Conference

In this paper, we propose a work-stealing runtime, the Library for Work Stealing (LibWS), built on the MPI one-sided model for designing scalable FP-Growth, the de facto frequent pattern mining algorithm, on large-scale systems. LibWS provides locality-efficient and highly scalable work-stealing techniques for load balancing on a variety of data distributions. We also propose a novel communication algorithm for the FP-Growth data exchange phase, which reduces the communication complexity from the state-of-the-art O(p) to O(f + p/f) for p processes and f frequent attributed-ids. FP-Growth is implemented using LibWS and evaluated on several work distributions and support counts. An experimental evaluation of FP-Growth on LibWS using 4096 processes on an InfiniBand cluster demonstrates 87% efficiency for a power-law work distribution and 91% for a Poisson work distribution. The proposed distributed FP-Tree merging algorithm provides a 38x communication speedup on 4096 cores.
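
As a rough illustration of the complexity claim in the abstract, the sketch below evaluates the leading terms of the two bounds, p and f + p/f, for p = 4096 processes. It is not taken from the paper: the function name `exchange_terms` and the values of f are hypothetical, and constant factors are ignored, so the numbers show only how the bounds scale and say nothing about measured speedup.

```python
def exchange_terms(p: int, f: int) -> tuple[int, float]:
    """Return the leading terms (p, f + p/f) of the two bounds.

    Constant factors are dropped, so these are scaling terms only,
    not message counts or timings from the paper.
    """
    if p <= 0 or f <= 0:
        raise ValueError("p and f must be positive")
    return p, f + p / f


if __name__ == "__main__":
    p = 4096  # process count used in the abstract's evaluation
    for f in (8, 64, 512):  # hypothetical counts of frequent attribute ids
        baseline, proposed = exchange_terms(p, f)
        print(f"p={p} f={f}: O(p) term={baseline}, O(f + p/f) term={proposed:.0f}")
```

Under these assumptions the f + p/f term stays well below p across the sampled values of f. In practice f would presumably follow from the dataset and the support threshold rather than being chosen freely.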

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1236334
Report Number(s):
PNNL-SA-110836
Resource Relation:
Conference: IEEE International Conference on Cluster Computing (CLUSTER 2015), September 8-11, 2015, Chicago, Illinois, 138-147
Country of Publication:
United States
Language:
English

Similar Records

A Case for Application Oblivious Energy-Efficient MPI Runtime
Conference · October 19, 2015 · OSTI ID:1236334

Fault Tolerant Frequent Pattern Mining
Conference · December 19, 2016 · OSTI ID:1236334

Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM
Conference · August 5, 2015 · OSTI ID:1236334
