U.S. Department of Energy
Office of Scientific and Technical Information

Computation of Large Covariance Matrices by SAMMY on Graphical Processing Units and Multicore CPUs

Conference · OSTI ID: 1018596

The nuclear data evaluation code SAMMY harnessed the computational power of Graphical Processing Units (GPUs) and multicore CPUs to speed up computations of large Resonance Parameter Covariance Matrices (RPCMs). This was accomplished by linking SAMMY to vendor-optimized implementations of the matrix-matrix multiplication subroutine of the Basic Linear Algebra Subprograms (BLAS) library to compute the most time-consuming step. The U-235 RPCM, previously computed using a triple-nested loop, was recomputed using the NVIDIA implementation of the subroutine on a single Tesla Fermi GPU, and also using Intel's Math Kernel Library implementation on two different multicore CPU systems. A multiplication of two matrices of dimensions 16,000 x 20,000 that had previously taken days took approximately one minute on the GPU. Similar performance was achieved on a dual six-core CPU system. The magnitude of the speed-up suggests that these, or similar, combinations of hardware and libraries may be useful for large matrix operations in SAMMY. The uniform interfaces of standard linear algebra libraries make them promising candidates for a programming framework for a new generation of SAMMY on emerging heterogeneous computing platforms.
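
The change the abstract describes, replacing a hand-written triple-nested loop with a call to a library matrix-matrix multiplication routine, can be illustrated with a minimal Python sketch. This is not SAMMY code: the function names `matmul_triple_loop`, `matmul_blas`, and `gemm_flops` are hypothetical, and NumPy stands in for the vendor libraries named above. It relies on NumPy typically dispatching double-precision 2-D products to whichever BLAS library it was built against. The operation count assumes an (m x k) by (k x n) product; the abstract does not state how the two 16,000 x 20,000 matrices were oriented.

```python
import numpy as np


def matmul_triple_loop(a, b):
    """Reference product C = A @ B written as an explicit triple-nested loop."""
    m, k = a.shape
    k2, n = b.shape
    if k != k2:
        raise ValueError("inner dimensions do not match")
    c = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for p in range(k):
                c[i, j] += a[i, p] * b[p, j]
    return c


def matmul_blas(a, b):
    """Same product through NumPy's matmul operator.

    For 2-D float64 arrays NumPy normally hands this to the matrix-matrix
    multiplication routine of the BLAS library it is linked against.
    """
    return a @ b


def gemm_flops(m, k, n):
    """Approximate floating-point operation count for an (m x k) by (k x n)
    product: one multiply and one add per inner-loop iteration."""
    return 2 * m * k * n


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((30, 40))
    b = rng.standard_normal((40, 30))
    # Both routes should agree to rounding error on a small test case.
    print(np.allclose(matmul_triple_loop(a, b), matmul_blas(a, b)))
    # Assumed orientation: (16,000 x 20,000) times (20,000 x 16,000).
    print(gemm_flops(16_000, 20_000, 16_000))
```

Because both functions take the same arguments and return the same result, the calling code is unaffected by the swap. The abstract makes the same point about the uniform interfaces of standard linear algebra libraries.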

Research Organization:
Oak Ridge National Laboratory (ORNL)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1018596
Country of Publication:
United States
Language:
English

Similar Records

An efficient tensor transpose algorithm for multicore CPU, Intel Xeon Phi, and NVidia Tesla GPU
Journal Article · Sun Jan 04 23:00:00 EST 2015 · Computer Physics Communications · OSTI ID:1185465

A graphics processing unit accelerated sparse direct solver and preconditioner with block low rank compression
Journal Article · Mon Sep 30 00:00:00 EDT 2024 · International Journal of High Performance Computing Applications · OSTI ID:2499469

MASSIVELY PARALLEL LATENT SEMANTIC ANALYSES USING A GRAPHICS PROCESSING UNIT
Journal Article · Wed Dec 31 23:00:00 EST 2008 · Journal of Undergraduate Research · OSTI ID:1052114