OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: HALA: Handy Accelerated Linear Algebra

Abstract

Accelerated linear algebra libraries often come with C-style interfaces, or even older Fortran 77 conventions, which makes them difficult to use within the generic template programming environment of modern C++. For example, BLAS and derived standards such as cuBLAS do not accept overloads, and a different function has to be called for each input type; it is therefore very challenging to write a single template that handles multiple precision modes and works well with different C++ containers. HALA offers a series of templates that wrap around BLAS/cuBLAS and other similar methods, automatically infer the relevant types, and call the appropriate back-end. The templates work out of the box with all C++ standard vector-like containers and can be easily extended (with template specializations) to handle user-provided containers. The HALA API is also overloaded to handle both GPU and CPU cases with a single "engine" class, so a single high-level algorithm can use either the CPU or the GPU back-end; i.e., the HALA API is unified and handles the variations between cuBLAS and BLAS. In addition to BLAS and cuBLAS, wrapper templates are provided for a subset of LAPACK, cuSparse, and Cholmod methods. Work is in progress to complete as many of these standards as possible. HALA also includes a module for low-level extended register vectorization for complex number arithmetic and several simple solvers that build on the basic linear algebra capabilities. A single unified C++ API benefits fast prototyping and makes it easy to write code that is portable across GPU/CPU platforms as well as precision types and user containers. Note that HALA does not provide any accelerated algorithms; it simply serves as a "handy" front-end to other standards.
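
The wrapping idea in the abstract can be illustrated with a small C++11 sketch. It is not HALA's actual API: the `sketch` namespace, `ref_sdot`, `ref_ddot`, `backend_dot`, `dot`, and `cpu_engine` are hypothetical names chosen for illustration, and the two reference loops merely stand in for the type-specific BLAS routines (such as `sdot` and `ddot`) that a real wrapper would link against. The sketch assumes containers that expose `value_type`, `data()`, and `size()`, as `std::vector` and `std::array` do.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

namespace sketch {

// Stand-ins for C-style back-ends: one function name per precision,
// mirroring how BLAS separates sdot (float) from ddot (double).
inline float ref_sdot(int n, const float* x, const float* y) {
    float sum = 0.0f;
    for (int i = 0; i < n; ++i) sum += x[i] * y[i];
    return sum;
}

inline double ref_ddot(int n, const double* x, const double* y) {
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += x[i] * y[i];
    return sum;
}

// Overloads give the two back-ends a common name, so the compiler
// selects the routine from the scalar type of the pointers.
inline float backend_dot(int n, const float* x, const float* y) {
    return ref_sdot(n, x, y);
}
inline double backend_dot(int n, const double* x, const double* y) {
    return ref_ddot(n, x, y);
}

// Generic front-end: infers the scalar type from the containers and
// forwards raw pointers to the matching back-end.
template <class VecX, class VecY>
typename VecX::value_type dot(const VecX& x, const VecY& y) {
    static_assert(std::is_same<typename VecX::value_type,
                               typename VecY::value_type>::value,
                  "both containers must hold the same scalar type");
    assert(x.size() == y.size());
    return backend_dot(static_cast<int>(x.size()), x.data(), y.data());
}

// Hypothetical engine tag: an extra leading argument selects the
// execution target. Only a CPU engine is sketched; a GPU engine
// overload would forward to a device library in the same way.
struct cpu_engine {};

template <class VecX, class VecY>
typename VecX::value_type dot(cpu_engine, const VecX& x, const VecY& y) {
    return dot(x, y);
}

}  // namespace sketch
```

With these definitions, `sketch::dot(a, b)` compiles unchanged for two `std::vector<double>` objects or two `std::array<float, N>` objects, and the call site never names a precision-specific routine.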

Developers:
 Stoyanov, Miroslav K. [1]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Release Date:
2019-11-21
Project Type:
Open Source, Publicly Available Repository
Software Type:
Scientific
Programming Languages:
C++ (gcc6 and 7 or clang 5 and 6)
Licenses:
BSD 3-clause "New" or "Revised" License
Sponsoring Org.:
USDOE

Primary Award/Contract Number:
AC05-00OR22725
Code ID:
57394
Site Accession Number:
8142
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Country of Origin:
United States

Citation Formats

Stoyanov, Miroslav K., and USDOE. HALA: Handy Accelerated Linear Algebra. Computer software. https://www.osti.gov//servlets/purl/1630728. USDOE. 21 Nov. 2019. Web. doi:10.11578/dc.20210521.117.
Stoyanov, Miroslav K., & USDOE. (2019, November 21). HALA: Handy Accelerated Linear Algebra [Computer software]. https://www.osti.gov//servlets/purl/1630728. https://doi.org/10.11578/dc.20210521.117
Stoyanov, Miroslav K., and USDOE. HALA: Handy Accelerated Linear Algebra. Computer software. November 21, 2019. https://www.osti.gov//servlets/purl/1630728. doi:https://doi.org/10.11578/dc.20210521.117.
@misc{osti_1630728,
title = {HALA: Handy Accelerated Linear Algebra},
author = {Stoyanov, Miroslav K. and USDOE},
abstractNote = {Accelerated linear algebra libraries often come with C-style interfaces, or even older Fortran 77 conventions, which makes them difficult to use within the generic template programming environment of modern C++. For example, BLAS and derived standards such as cuBLAS do not accept overloads, and a different function has to be called for each input type; it is therefore very challenging to write a single template that handles multiple precision modes and works well with different C++ containers. HALA offers a series of templates that wrap around BLAS/cuBLAS and other similar methods, automatically infer the relevant types, and call the appropriate back-end. The templates work out of the box with all C++ standard vector-like containers and can be easily extended (with template specializations) to handle user-provided containers. The HALA API is also overloaded to handle both GPU and CPU cases with a single "engine" class, so a single high-level algorithm can use either the CPU or the GPU back-end; i.e., the HALA API is unified and handles the variations between cuBLAS and BLAS. In addition to BLAS and cuBLAS, wrapper templates are provided for a subset of LAPACK, cuSparse, and Cholmod methods. Work is in progress to complete as many of these standards as possible. HALA also includes a module for low-level extended register vectorization for complex number arithmetic and several simple solvers that build on the basic linear algebra capabilities. A single unified C++ API benefits fast prototyping and makes it easy to write code that is portable across GPU/CPU platforms as well as precision types and user containers. Note that HALA does not provide any accelerated algorithms; it simply serves as a "handy" front-end to other standards.},
url = {https://www.osti.gov//servlets/purl/1630728},
doi = {10.11578/dc.20210521.117},
year = {2019},
month = {11},
note = {https://www.osti.gov/biblio/1630728}
}