DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Manycore Performance-Portability: Kokkos Multidimensional Array Library

Abstract

Large, complex scientific and engineering application codes have a significant investment in computational kernels to implement their mathematical models. Porting these computational kernels to the collection of modern manycore accelerator devices is a major challenge in that these devices have diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides a library-based approach to implementing computational kernels that are performance-portable to CPU-multicore and GPGPU accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices, each with its own memory space, (2) data parallel kernels, and (3) multidimensional arrays. Kernel execution performance is, especially for NVIDIA® devices, extremely dependent on data access patterns. The optimal data access pattern can differ between manycore devices, potentially leading to different implementations of computational kernels specialized for different devices. The Kokkos Array programming model supports performance-portable kernels by (1) separating data access patterns from computational kernels through a multidimensional array API and (2) introducing device-specific data access mappings when a kernel is compiled. An implementation of Kokkos Array is available through Trilinos [Trilinos website, http://trilinos.sandia.gov/, August 2011].
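
The separation described in the abstract, kernels written against a multidimensional array API with the index-to-memory mapping chosen at compile time, can be illustrated with a minimal C++ sketch. This is not the Kokkos Array API: the names `LayoutRight`, `LayoutLeft`, `Array2D`, `fill`, and `row_sum` are hypothetical and chosen only for illustration, and the sketch runs serially on the host rather than dispatching to a device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout policies (illustrative only, not the Kokkos API).
// Each maps a 2-D index (i, j) of an n0 x n1 array to a flat offset.
struct LayoutRight {  // row-major: consecutive j values are adjacent in memory
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t /*n0*/, std::size_t n1) {
        return i * n1 + j;
    }
};

struct LayoutLeft {   // column-major: consecutive i values are adjacent in memory
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t n0, std::size_t /*n1*/) {
        return i + j * n0;
    }
};

// A 2-D array whose memory mapping is a compile-time template parameter.
template <class Layout>
class Array2D {
public:
    Array2D(std::size_t n0, std::size_t n1)
        : n0_(n0), n1_(n1), data_(n0 * n1, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) {
        return data_[Layout::offset(i, j, n0_, n1_)];
    }
    double operator()(std::size_t i, std::size_t j) const {
        return data_[Layout::offset(i, j, n0_, n1_)];
    }

    std::size_t extent0() const { return n0_; }
    std::size_t extent1() const { return n1_; }
    const std::vector<double>& storage() const { return data_; }

private:
    std::size_t n0_, n1_;
    std::vector<double> data_;
};

// Kernels are written once against (i, j) and never compute offsets.
template <class Array>
void fill(Array& a) {
    for (std::size_t i = 0; i < a.extent0(); ++i)
        for (std::size_t j = 0; j < a.extent1(); ++j)
            a(i, j) = static_cast<double>(10 * i + j);
}

template <class Array>
double row_sum(const Array& a, std::size_t i) {
    double sum = 0.0;
    for (std::size_t j = 0; j < a.extent1(); ++j) sum += a(i, j);
    return sum;
}
```

Instantiating `Array2D<LayoutRight>` and `Array2D<LayoutLeft>` with the same extents and calling `fill` on each yields identical values through `operator()`, while `storage()` shows the elements in a different order. The kernel source is unchanged; only the layout type differs, which is the property the abstract relies on for performance portability.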

Authors:
Edwards, H. Carter [1]; Sunderland, Daniel [2]; Porter, Vicki [2]; Amsler, Chris [3]; Mish, Sam [4]
  1. Computing Research Center, Sandia National Laboratories, Livermore, CA, USA
  2. Engineering Sciences Center, Sandia National Laboratories, Albuquerque, NM, USA
  3. Department of Electrical and Computer Engineering, Kansas State University, Manhattan, KS, USA
  4. Department of Mathematics, California State University, Los Angeles, CA, USA
Publication Date:
Sponsoring Org.:
USDOE
OSTI Identifier:
1197983
Grant/Contract Number:  
AC04-94AL85000; SAND2011-8102J
Resource Type:
Published Article
Journal Name:
Scientific Programming
Additional Journal Information:
Journal Name: Scientific Programming; Journal Volume: 20; Journal Issue: 2; Journal ID: ISSN 1058-9244
Publisher:
Hindawi Publishing Corporation
Country of Publication:
Egypt
Language:
English

Citation Formats

Edwards, H. Carter, Sunderland, Daniel, Porter, Vicki, Amsler, Chris, and Mish, Sam. Manycore Performance-Portability: Kokkos Multidimensional Array Library. Egypt: N. p., 2012. Web. doi:10.1155/2012/917630.
Edwards, H. Carter, Sunderland, Daniel, Porter, Vicki, Amsler, Chris, & Mish, Sam. Manycore Performance-Portability: Kokkos Multidimensional Array Library. Egypt. doi:10.1155/2012/917630.
Edwards, H. Carter, Sunderland, Daniel, Porter, Vicki, Amsler, Chris, and Mish, Sam. 2012. "Manycore Performance-Portability: Kokkos Multidimensional Array Library". Egypt. doi:10.1155/2012/917630.
@article{osti_1197983,
title = {Manycore Performance-Portability: Kokkos Multidimensional Array Library},
author = {Edwards, H. Carter and Sunderland, Daniel and Porter, Vicki and Amsler, Chris and Mish, Sam},
abstractNote = {Large, complex scientific and engineering application codes have a significant investment in computational kernels to implement their mathematical models. Porting these computational kernels to the collection of modern manycore accelerator devices is a major challenge in that these devices have diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides a library-based approach to implementing computational kernels that are performance-portable to CPU-multicore and GPGPU accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices, each with its own memory space, (2) data parallel kernels, and (3) multidimensional arrays. Kernel execution performance is, especially for NVIDIA® devices, extremely dependent on data access patterns. The optimal data access pattern can differ between manycore devices, potentially leading to different implementations of computational kernels specialized for different devices. The Kokkos Array programming model supports performance-portable kernels by (1) separating data access patterns from computational kernels through a multidimensional array API and (2) introducing device-specific data access mappings when a kernel is compiled. An implementation of Kokkos Array is available through Trilinos [Trilinos website, http://trilinos.sandia.gov/, August 2011].},
doi = {10.1155/2012/917630},
journal = {Scientific Programming},
number = 2,
volume = 20,
place = {Egypt},
year = {2012},
month = {1}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
DOI: 10.1155/2012/917630
