DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms

Abstract

In this paper, we investigate the OpenMP parallelization and optimization of two novel data classification algorithms. The new algorithms are based on graph and PDE solution techniques and provide significant accuracy and performance advantages over traditional data classification algorithms in serial mode. The methods leverage the Nyström extension to compute the eigenvalues and eigenvectors of the graph Laplacian; this step is a self-contained module that can be used in conjunction with other graph-Laplacian-based methods such as spectral clustering. We use performance tools to identify the hotspots and memory access patterns of the serial codes and use OpenMP to parallelize the most time-consuming parts. Where possible, we also use library routines. We then optimize the OpenMP implementations and detail the performance on traditional supercomputer nodes (in our case a Cray XC30), and test the optimization steps on emerging testbed systems based on Intel’s Knights Corner and Knights Landing processors. We show both performance improvement and strong scaling behavior. We find that a large number of optimization techniques and analyses are necessary before the algorithm reaches almost ideal scaling.
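
The Nyström step named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' OpenMP implementation: the Gaussian weight function, the use of the first m points as the sample set, and the function names `gaussian_weights` and `nystrom_eigs` are all assumptions made for illustration. It also omits the Laplacian normalization and the orthogonalization of the extended eigenvectors that a full method would need.

```python
import numpy as np


def gaussian_weights(a, b, sigma):
    """Gaussian similarity between every row of a and every row of b.

    The kernel form exp(-d^2 / (2 sigma^2)) is an assumption for this sketch.
    """
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def nystrom_eigs(points, m, sigma):
    """Approximate m eigenpairs of the full n x n weight matrix.

    Assumption: the first m rows of `points` are the sample set
    (shuffle the rows beforehand to get a random sample).
    Only the m x m and m x (n - m) weight blocks are ever formed.
    """
    sample, rest = points[:m], points[m:]
    w_xx = gaussian_weights(sample, sample, sigma)  # sample-to-sample block
    w_xy = gaussian_weights(sample, rest, sigma)    # sample-to-rest block

    # Exact eigendecomposition of the small symmetric block.
    lam, u = np.linalg.eigh(w_xx)

    # Extend the eigenvectors to the unsampled points.
    # Assumes w_xx is well conditioned, so dividing by lam is safe.
    u_rest = (w_xy.T @ u) / lam
    return lam, np.vstack([u, u_rest])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(50, 2))      # synthetic data, illustration only
    lam, u = nystrom_eigs(points, m=5, sigma=1.0)
    w_approx = (u * lam) @ u.T             # low-rank estimate of the full matrix
    print(w_approx.shape)
```

In this sketch the two weight-block computations are the part whose cost grows with the number of data points, and they consist of independent pairwise evaluations, which is the kind of loop that lends itself to thread-level parallelization.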

Authors:
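Meng, Zhaoyi [1]; Koniges, Alice [2]; He, Yun Helen [2]; Williams, Samuel [2]; Kurth, Thorsten [2]; Cook, Brandon [2]; Deslippe, Jack [2]; Bertozzi, Andrea L. [3]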
  1. Univ. of California, Los Angeles, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  3. Univ. of California, Los Angeles, CA (United States)
Publication Date:
2016-09-21
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF); US Air Force Office of Scientific Research (AFOSR)
OSTI Identifier:
1378982
Grant/Contract Number:  
AC02-05CH11231; DMS-1417674; DMS-1045536; FA9550-10-1-0569
Resource Type:
Accepted Manuscript
Journal Name:
Lecture Notes in Computer Science
Additional Journal Information:
Journal Volume: 9903; Journal ID: ISSN 0302-9743
Publisher:
Springer
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; semi-supervised; unsupervised; data; algorithms; OpenMP; optimization

Citation Formats

Meng, Zhaoyi, Koniges, Alice, He, Yun Helen, Williams, Samuel, Kurth, Thorsten, Cook, Brandon, Deslippe, Jack, and Bertozzi, Andrea L. OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms. United States: N. p., 2016. Web. doi:10.1007/978-3-319-45550-1_2.
Meng, Zhaoyi, Koniges, Alice, He, Yun Helen, Williams, Samuel, Kurth, Thorsten, Cook, Brandon, Deslippe, Jack, & Bertozzi, Andrea L. OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms. United States. https://doi.org/10.1007/978-3-319-45550-1_2
Meng, Zhaoyi, Koniges, Alice, He, Yun Helen, Williams, Samuel, Kurth, Thorsten, Cook, Brandon, Deslippe, Jack, and Bertozzi, Andrea L. 2016. "OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms". United States. https://doi.org/10.1007/978-3-319-45550-1_2. https://www.osti.gov/servlets/purl/1378982.
@article{osti_1378982,
title = {OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms},
author = {Meng, Zhaoyi and Koniges, Alice and He, Yun Helen and Williams, Samuel and Kurth, Thorsten and Cook, Brandon and Deslippe, Jack and Bertozzi, Andrea L.},
abstractNote = {In this paper, we investigate the OpenMP parallelization and optimization of two novel data classification algorithms. The new algorithms are based on graph and PDE solution techniques and provide significant accuracy and performance advantages over traditional data classification algorithms in serial mode. The methods leverage the Nystrom extension to calculate eigenvalue/eigenvectors of the graph Laplacian and this is a self-contained module that can be used in conjunction with other graph-Laplacian based methods such as spectral clustering. We use performance tools to collect the hotspots and memory access of the serial codes and use OpenMP as the parallelization language to parallelize the most time-consuming parts. Where possible, we also use library routines. We then optimize the OpenMP implementations and detail the performance on traditional supercomputer nodes (in our case a Cray XC30), and test the optimization steps on emerging testbed systems based on Intel’s Knights Corner and Landing processors. We show both performance improvement and strong scaling behavior. Finally, a large number of optimization techniques and analyses are necessary before the algorithm reaches almost ideal scaling.},
doi = {10.1007/978-3-319-45550-1_2},
journal = {Lecture Notes in Computer Science},
volume = 9903,
place = {United States},
year = {2016},
month = {9}
}

Citation Metrics:
Cited by: 5 works
Citation information provided by
Web of Science

Works referenced in this record:

Multi-class Graph Mumford-Shah Model for Plume Detection Using the MBO scheme
book, January 2015


Diffuse Interface Models on Graphs for Classification of High Dimensional Data
journal, January 2012

  • Bertozzi, Andrea L.; Flenner, Arjuna
  • Multiscale Modeling & Simulation, Vol. 10, Issue 3
  • DOI: 10.1137/11083109X

An MBO Scheme on Graphs for Classification and Image Processing
journal, January 2013

  • Merkurjev, Ekaterina; Kostić, Tijana; Bertozzi, Andrea L.
  • SIAM Journal on Imaging Sciences, Vol. 6, Issue 4
  • DOI: 10.1137/120886935

Applied Numerical Linear Algebra
book, January 1997


Diffuse Interface Models on Graphs for Classification of High Dimensional Data
journal, January 2016

  • Bertozzi, Andrea L.; Flenner, Arjuna
  • SIAM Review, Vol. 58, Issue 2
  • DOI: 10.1137/16M1070426

Roofline: an insightful visual performance model for multicore architectures
journal, April 2009

  • Williams, Samuel; Waterman, Andrew; Patterson, David
  • Communications of the ACM, Vol. 52, Issue 4
  • DOI: 10.1145/1498765.1498785

Spectral grouping using the Nyström method
journal, February 2004

  • Fowlkes, C.; Belongie, S.
  • IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, Issue 2
  • DOI: 10.1109/TPAMI.2004.1262185

A simple min-cut algorithm
journal, July 1997


A tutorial on spectral clustering
journal, August 2007