
Title: Scalable Out-of-Core Solvers on Xeon Phi Cluster

Abstract

This paper documents the implementation of a distributed out-of-core (OOC) solver for performing LU and Cholesky factorizations of a large dense matrix on clusters of many-core programmable co-processors. The out-of-core algorithm combines the left-looking and right-looking schemes to minimize data movement between the CPU host and the co-processor, improving both data locality and computing throughput. The OOC solver follows the format of the ScaLAPACK software library, making it readily portable to existing codes that use ScaLAPACK. A runtime analysis is presented, conducted on Beacon (an Intel Xeon plus Intel Xeon Phi cluster composed of 48 nodes of multi-core CPU and MIC) at the National Institute for Computational Sciences. A comparison of performance on the Intel Xeon Phi and GPU clusters is also provided.
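
To make the combination of left-looking and right-looking schemes concrete, the following is a minimal single-process sketch in Python with NumPy. It is not the paper's implementation: the function name ooc_cholesky and the parameters panel_cols and block are hypothetical, the "host" and "device" are both ordinary in-memory arrays, and the assumed arrangement (left-looking across column panels, right-looking inside the panel currently held by the co-processor) is one common way to combine the two schemes. Only the Cholesky case is shown.

```python
import numpy as np


def ooc_cholesky(a_host, panel_cols, block=2):
    """Return the lower Cholesky factor of the SPD matrix a_host.

    Illustrative only: a_host stands in for host memory, and `panel`
    stands in for the limited co-processor memory (panel_cols columns).
    """
    n = a_host.shape[0]
    l_host = np.zeros_like(a_host, dtype=float)

    for j0 in range(0, n, panel_cols):
        j1 = min(j0 + panel_cols, n)
        w = j1 - j0

        # "Transfer" the current column panel (rows j0..n) to the device.
        panel = np.array(a_host[j0:, j0:j1], dtype=float)

        # Left-looking phase: stream each previously factored panel once
        # and apply its update to the panel held on the device.
        for k0 in range(0, j0, panel_cols):
            k1 = min(k0 + panel_cols, j0)
            lk = l_host[j0:, k0:k1]
            panel -= lk @ lk[:w].T

        # Right-looking phase: blocked in-core factorization of the panel.
        for b0 in range(0, w, block):
            b1 = min(b0 + block, w)
            diag = np.linalg.cholesky(panel[b0:b1, b0:b1])
            panel[b0:b1, b0:b1] = diag
            if b1 < panel.shape[0]:
                # Solve X @ diag.T = B for the rows below the diagonal block.
                below = np.linalg.solve(diag, panel[b1:, b0:b1].T).T
                panel[b1:, b0:b1] = below
                if b1 < w:
                    # Update only the trailing columns inside this panel.
                    panel[b1:, b1:w] -= below @ below[: w - b1].T

        # Discard the strictly upper part of the panel's top square block,
        # then "transfer" the factored panel back to the host.
        panel[:w] = np.tril(panel[:w])
        l_host[j0:, j0:j1] = panel

    return l_host


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = rng.standard_normal((7, 7))
    a = m @ m.T + 7.0 * np.eye(7)  # symmetric positive definite test matrix
    l = ooc_cholesky(a, panel_cols=3, block=2)
    print(np.allclose(l @ l.T, a))
```

In this sketch each factored panel is written back to the host once and re-read once per later panel, while all trailing updates stay inside the panel on the device. That is the data-movement trade-off the abstract refers to, under the assumptions stated above.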

Authors:
D'Azevedo, Ed F [1]; Chan, Ki Shing [2]; Su, Shiquan [3]; Wong, Kwai [1]
  1. ORNL
  2. Chinese University of Hong Kong (CUHK)
  3. Center for Computational Materials Science
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Laboratory Directed Research and Development (LDRD) Program
OSTI Identifier:
1324039
DOE Contract Number:
AC05-00OR22725
Resource Type:
Book
Country of Publication:
United States
Language:
English

Citation Formats

D'Azevedo, Ed F, Chan, Ki Shing, Su, Shiquan, and Wong, Kwai. Scalable Out-of-Core Solvers on Xeon Phi Cluster. United States: N. p., 2015. Web.
D'Azevedo, Ed F, Chan, Ki Shing, Su, Shiquan, & Wong, Kwai. Scalable Out-of-Core Solvers on Xeon Phi Cluster. United States.
D'Azevedo, Ed F, Chan, Ki Shing, Su, Shiquan, and Wong, Kwai. 2015. "Scalable Out-of-Core Solvers on Xeon Phi Cluster". United States.
@article{osti_1324039,
title = {Scalable Out-of-Core Solvers on Xeon Phi Cluster},
author = {D'Azevedo, Ed F and Chan, Ki Shing and Su, Shiquan and Wong, Kwai},
abstractNote = {This paper documents the implementation of a distributed out-of-core (OOC) solver for performing LU and Cholesky factorizations of a large dense matrix on clusters of many-core programmable co-processors. The out-of-core algorithm combines the left-looking and right-looking schemes to minimize data movement between the CPU host and the co-processor, improving both data locality and computing throughput. The OOC solver follows the format of the ScaLAPACK software library, making it readily portable to existing codes that use ScaLAPACK. A runtime analysis is presented, conducted on Beacon (an Intel Xeon plus Intel Xeon Phi cluster composed of 48 nodes of multi-core CPU and MIC) at the National Institute for Computational Sciences. A comparison of performance on the Intel Xeon Phi and GPU clusters is also provided.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2015},
month = {1}
}

Book:
Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this book.
