OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: High Performance Computing Based Parallel Hierarchical Modal Association Clustering (HPAR HMAC)

Software
OSTI ID:1365649
; ;  [1]
  1. Computational Sciences and Engineering Division, ORNL, Oak Ridge, TN 37830

For many applications, clustering is a crucial step in gaining insight into the makeup of a dataset. The best approach to a given problem often depends on a variety of factors, such as the size of the dataset, time restrictions, and soft clustering requirements. The HMAC algorithm seeks to combine the strengths of two particular clustering approaches: model-based and linkage-based clustering. One particular weakness of HMAC is its computational complexity, which makes it impractical for mega-scale data clustering. For high-definition imagery, a user would have to wait months or years for a result; for a 16-megapixel image, the estimated runtime exceeds a decade. To improve the execution time of HMAC, it is reasonable to consider a multi-core implementation that utilizes available system resources. An existing implementation (Ray and Cheng 2014) divides the dataset into N partitions, one for each thread, before executing the HMAC algorithm. This implementation benefits from two types of optimization: parallelization and divide-and-conquer. By running each partition in parallel, the program accelerates computation by utilizing more system resources. Although the parallel implementation provides considerable improvement over the serial HMAC, it still suffers from poor computational complexity, O(N^2). Once the maximum number of cores on a system is exhausted, the program slows down. We now consider a modification to HMAC that involves a recursive partitioning scheme. Our modification aims to exploit the divide-and-conquer benefits seen in the parallel HMAC implementation. At each level in the recursion tree, partitions are divided into two sub-partitions until a threshold size is reached. When a partition can no longer be divided without falling below the threshold size, the base HMAC algorithm is applied. This results in a significant speedup over the parallel HMAC.
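The recursive partitioning scheme described above can be illustrated with a minimal Python sketch. It is not the HPAR HMAC code. The function names, the split rule (a median cut along the coordinate with the widest spread), and the `base_cluster` callable are assumptions made for illustration; the abstract does not say how partitions are split or how the results of the sub-partitions are combined, so the sketch simply returns one result per leaf. The helper `quadratic_cost` shows why splitting helps when the base algorithm costs on the order of the square of its input size.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Point = Tuple[float, ...]
R = TypeVar("R")


def recursive_partition(
    points: Sequence[Point],
    threshold: int,
    base_cluster: Callable[[Sequence[Point]], R],
) -> List[R]:
    """Bisect `points` until a further split would fall below `threshold`.

    `base_cluster` is a hypothetical stand-in for the base HMAC algorithm;
    it is called once per leaf partition and its results are collected.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")

    # Stop when halving would leave a sub-partition smaller than the threshold.
    if len(points) // 2 < threshold:
        return [base_cluster(points)]

    # Assumed split rule: median cut along the coordinate with the widest spread.
    dims = range(len(points[0]))
    axis = max(
        dims,
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2

    left = recursive_partition(ordered[:mid], threshold, base_cluster)
    right = recursive_partition(ordered[mid:], threshold, base_cluster)
    return left + right


def quadratic_cost(sizes: Sequence[int]) -> int:
    """Total work if the base algorithm costs n**2 on a partition of size n."""
    return sum(n * n for n in sizes)
```

For 16 points on a line and a threshold of 4, `recursive_partition(points, 4, len)` returns `[4, 4, 4, 4]`: four leaves of four points each. Under the quadratic cost assumption that is 4 × 16 = 64 units of work instead of 256 for the undivided dataset, before any parallelism is applied.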

Short Name / Acronym:
HPAR HMAC; 005334WKSTN00
Version:
00
Programming Language(s):
Medium: X; OS: LINUX
Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
Contributing Organization:
Dilip R. Patlolla and Sujithkumar Surendran Nair
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1365649
Country of Origin:
United States

Similar Records

Efficient Execution of Recursive Programs on Commodity Vector Hardware
Conference · June 13, 2015 · OSTI ID:1365649

Extracting SIMD Parallelism from Recursive Task-Parallel Programs
Journal Article · December 2, 2019 · ACM Transactions on Parallel Computing · OSTI ID:1365649

Center for Technology for Advanced Scientific Component Software (TASCS)
Technical Report · October 31, 2010 · OSTI ID:1365649

Related Subjects