A Distributed-Memory Package for Dense Hierarchically Semi-Separable Matrix Computations Using Randomization
In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by matrices of low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods and boundary element methods. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of the structured matrix-vector product, the structured factorization, and the solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a broader effort, the STRUctured Matrices PACKage (STRUMPACK), a software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.
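The randomized sampling mentioned in the abstract builds on the standard randomized range-finder for low-rank approximation; HSS compression applies this idea block-by-block to the off-diagonal structure. Below is a minimal NumPy sketch of the underlying sampling step only, not STRUMPACK's actual API: all names (`randomized_lowrank`, the oversampling parameter) are illustrative assumptions.

```python
import numpy as np

def randomized_lowrank(A, rank, oversample=10):
    """Randomized range-finder sketch: compute A ~= Q @ B with Q orthonormal.

    The key idea, as in HSS compression by randomized sampling, is that A is
    only accessed through matrix-(block of) vector products A @ Omega.
    """
    n = A.shape[1]
    rng = np.random.default_rng(0)
    Omega = rng.standard_normal((n, rank + oversample))
    Y = A @ Omega                # sample the range of A
    Q, _ = np.linalg.qr(Y)      # orthonormal basis for the sampled range
    B = Q.T @ A                 # project A onto that basis
    return Q, B

# Example: a matrix of exact rank 5 is recovered to machine precision.
n = 200
U = np.random.default_rng(1).standard_normal((n, 5))
A = U @ U.T
Q, B = randomized_lowrank(A, rank=5)
err = np.linalg.norm(A - Q @ B) / np.linalg.norm(A)
```

The adaptive mechanism described in the paper goes further: it grows the number of sample columns until an error estimate meets a tolerance, rather than fixing `rank` in advance.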
 Authors:
 Rouet, François-Henry^{[1]};
 Li, Xiaoye S.^{[1]};
 Ghysels, Pieter^{[1]};
 Napov, Artem^{[2]}
 1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
 2. Univ. libre de Bruxelles (ULB), Brussels (Belgium)
 Publication Date:
 June 2016
 Grant/Contract Number:
 AC02-05CH11231
 Type:
 Accepted Manuscript
 Journal Name:
 ACM Transactions on Mathematical Software
 Additional Journal Information:
 Journal Volume: 42; Journal Issue: 4; Journal ID: ISSN 0098-3500
 Publisher:
 Association for Computing Machinery
 Research Org:
 Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
 Sponsoring Org:
 USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC21)
 Country of Publication:
 United States
 Language:
 English
 Subject:
 98 NUCLEAR DISARMAMENT, SAFEGUARDS, AND PHYSICAL PROTECTION; mathematical software; solvers; design; algorithms; performance; HSS matrices; randomized sampling; ULV factorization; parallel algorithms; distributed-memory
 OSTI Identifier:
 1393046
Rouet, François-Henry, Li, Xiaoye S., Ghysels, Pieter, and Napov, Artem. A Distributed-Memory Package for Dense Hierarchically Semi-Separable Matrix Computations Using Randomization. United States: N. p., 2016.
Web. doi:10.1145/2930660. https://www.osti.gov/servlets/purl/1393046.
@article{osti_1393046,
title = {A Distributed-Memory Package for Dense Hierarchically Semi-Separable Matrix Computations Using Randomization},
author = {Rouet, François-Henry and Li, Xiaoye S. and Ghysels, Pieter and Napov, Artem},
abstractNote = {In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by matrices of low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods and boundary element methods. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of the structured matrix-vector product, the structured factorization, and the solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a broader effort, the STRUctured Matrices PACKage (STRUMPACK), a software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.},
doi = {10.1145/2930660},
journal = {ACM Transactions on Mathematical Software},
number = 4,
volume = 42,
place = {United States},
year = {2016},
month = {6}
}