Title: OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms

Journal Article · Lecture Notes in Computer Science
 [1];  [2];  [2];  [2];  [2];  [2];  [2];  [3]
  1. Univ. of California, Los Angeles, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  3. Univ. of California, Los Angeles, CA (United States)

In this paper, we investigate the OpenMP parallelization and optimization of two novel data classification algorithms. The new algorithms are based on graph and PDE solution techniques and, in serial mode, provide significant accuracy and performance advantages over traditional data classification algorithms. The methods leverage the Nyström extension to calculate the eigenvalues and eigenvectors of the graph Laplacian; this is a self-contained module that can be used in conjunction with other graph-Laplacian-based methods such as spectral clustering. We use performance tools to identify the hotspots and memory access patterns of the serial codes and use OpenMP to parallelize the most time-consuming parts. Where possible, we also use library routines. We then optimize the OpenMP implementations and detail the performance on traditional supercomputer nodes (in our case a Cray XC30), and test the optimization steps on emerging testbed systems based on Intel’s Knights Corner and Knights Landing processors. We show both performance improvement and strong scaling behavior. Finally, a large number of optimization techniques and analyses are necessary before the algorithm reaches almost ideal scaling.
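
As a rough illustration of parallelizing a time-consuming loop with OpenMP, the following minimal sketch computes squared Euclidean distances between data points and a small set of sampled points, the kind of dense pairwise computation a Nyström-style approximation typically starts from. It is not taken from the paper: the function name `pairwise_sq_dist`, the row-major data layout, and the choice of distance are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the authors' code.
 * Fills out (n x m, row-major) with squared Euclidean distances between
 * n data points x and m sampled points y, each of dimension d:
 *   out[i*m + j] = sum_k (x[i*d + k] - y[j*d + k])^2
 */
void pairwise_sq_dist(const double *x, size_t n,
                      const double *y, size_t m,
                      size_t d, double *out)
{
    assert(x != NULL && y != NULL && out != NULL);

    /* Each row i writes to its own slice of out, so the outer loop can be
     * split across threads without synchronization. If the compiler is not
     * run with OpenMP enabled, the pragma is ignored and the loop is serial. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i) {
        const double *xi = x + (size_t)i * d;
        for (size_t j = 0; j < m; ++j) {
            const double *yj = y + j * d;
            double dist2 = 0.0;
            for (size_t k = 0; k < d; ++k) {
                double diff = xi[k] - yj[k];
                dist2 += diff * diff;
            }
            out[(size_t)i * m + j] = dist2;
        }
    }
}
```

With GCC or Clang, OpenMP is typically enabled with the `-fopenmp` flag. A static schedule is a reasonable default here because every row costs the same amount of work; a similarity kernel (for example a Gaussian) would then be applied to these distances, and that step is left out of the sketch.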

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF); US Air Force Office of Scientific Research (AFOSR)
Grant/Contract Number:
AC02-05CH11231; DMS-1417674; DMS-1045536; FA9550-10-1-0569
OSTI ID:
1378982
Journal Information:
Lecture Notes in Computer Science, Vol. 9903; ISSN 0302-9743
Publisher:
Springer
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 5 works
Citation information provided by Web of Science
