LOGAN: High-Performance X-Drop Pairwise Alignment on GPU (LOGAN) v1.0

Abstract

Pairwise sequence alignment is one of the most computationally intensive kernels in genomic data analysis, accounting for more than 90% of the run time for key bioinformatics applications. It is particularly expensive for third-generation sequencing data because of the long read lengths involved (1 Kb to 1 Mb). Because exact pairwise algorithms such as Smith-Waterman have quadratic cost, the community primarily relies, for long alignments, on approximate algorithms that search only for high-quality alignments and stop early when one is not found. In this work, we present LOGAN, the first GPU optimization of the popular X-drop alignment algorithm. Results show that our high-performance multi-GPU LOGAN implementation achieves up to 181.6 GCUPS and speed-ups of up to 6.6x and 30.7x using 1 and 6 NVIDIA Tesla V100 GPUs, respectively, over the state-of-the-art software running on two IBM Power9 processors using 168 threads, with equivalent accuracy. We also demonstrate a 2.3x LOGAN speed-up versus ksw2, a state-of-the-art vectorized algorithm for sequence alignment implemented in minimap2. To highlight the impact of our work on a real-world application, we couple the LOGAN aligner with a many-to-many long-read alignment software called BELLA, and demonstrate that our implementation improves the overall BELLA runtime by up to 10.6x. Finally, we adapt the Roofline model for our optimized kernel and demonstrate that our implementation is near-optimal on the NVIDIA Tesla V100s.
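The early-termination idea behind X-drop alignment, as described in the abstract, can be illustrated with a minimal CPU sketch in Python. This is not LOGAN's GPU kernel and is not taken from the LOGAN repository: the function name `xdrop_extend`, the linear scoring parameters, and the cell counter are assumptions made for illustration. The sketch extends an alignment from the start of two sequences and discards any dynamic-programming cell whose score falls more than `x_drop` below the best score seen so far, stopping once no cell in a row survives.

```python
def xdrop_extend(a, b, x_drop, match=1, mismatch=-1, gap=-1):
    """Gapped X-drop extension from the start of `a` and `b` (illustrative sketch).

    Returns (best_score, cells), where `cells` counts the DP cells evaluated
    in rows 1..len(a). Assumes x_drop >= 0 and a linear gap penalty.
    """
    best = 0
    cells = 0

    # Row 0: empty prefix of `a` against prefixes of `b` (gaps only).
    prev = {}
    for j in range(len(b) + 1):
        score = j * gap
        if score < best - x_drop:
            break
        prev[j] = score

    for i in range(1, len(a) + 1):
        cur = {}  # surviving cells of row i, keyed by column
        lo, hi = min(prev), max(prev)
        for j in range(lo, len(b) + 1):
            # Past the previous row's band, only a left neighbor can feed a cell.
            if j > hi + 1 and (j - 1) not in cur:
                break
            candidates = []
            if j in prev:  # gap in `b`
                candidates.append(prev[j] + gap)
            if (j - 1) in prev:  # match or mismatch
                sub = match if a[i - 1] == b[j - 1] else mismatch
                candidates.append(prev[j - 1] + sub)
            if (j - 1) in cur:  # gap in `a`
                candidates.append(cur[j - 1] + gap)
            if not candidates:
                continue
            cells += 1
            score = max(candidates)
            # X-drop rule: keep the cell only if it is within x_drop of the best.
            if score >= best - x_drop:
                cur[j] = score
                best = max(best, score)
        if not cur:  # every cell dropped: stop early
            break
        prev = cur

    return best, cells


if __name__ == "__main__":
    print(xdrop_extend("ACGTACGT", "ACGTACGT", x_drop=3))
    print(xdrop_extend("A" * 1000, "C" * 1000, x_drop=3))
```

In the second call the sequences share no matching characters, so the search is abandoned after a handful of rows instead of filling the full 1000 x 1000 matrix. The number of cells actually evaluated is also the quantity a GCUPS figure (giga cell updates per second) divides by run time.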

Developers:

Guidi, Giulia [1]; Zeni, Alberto [2]

[1] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
[2] Politecnico di Milano (Italy)

Release Date: 2019-11-05

Project Type: Open Source, Publicly Available Repository

Software Type: Scientific

Licenses: BSD 3-clause "New" or "Revised" License

Sponsoring Org.: USDOE

Primary Award/Contract Number: AC02-05CH11231

Other Award/Contract Number: Oak Ridge National Laboratory, Contract No. AC05-00OR22725

Code ID: 32464

Site Accession Number: 2020-010

Research Org.: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States); Politecnico di Milano, Milan, Italy

Country of Origin: United States

Citation Formats

Guidi, Giulia, and Zeni, Alberto. LOGAN: High-Performance X-Drop Pairwise Alignment on GPU (LOGAN) v1.0. Computer Software. https://github.com/albertozeni/LOGAN. USDOE. 05 Nov. 2019. Web. doi:10.11578/dc.20191113.1.

Guidi, Giulia, & Zeni, Alberto. (2019, November 05). LOGAN: High-Performance X-Drop Pairwise Alignment on GPU (LOGAN) v1.0. [Computer software]. https://github.com/albertozeni/LOGAN. https://doi.org/10.11578/dc.20191113.1.

Guidi, Giulia, and Zeni, Alberto. "LOGAN: High-Performance X-Drop Pairwise Alignment on GPU (LOGAN) v1.0." Computer software. November 05, 2019. https://github.com/albertozeni/LOGAN. https://doi.org/10.11578/dc.20191113.1.

@misc{doecode_32464,
  title = {LOGAN: High-Performance X-Drop Pairwise Alignment on GPU (LOGAN) v1.0},
  author = {Guidi, Giulia and Zeni, Alberto},
  abstractNote = {Pairwise sequence alignment is one of the most computationally intensive kernels in genomic data analysis, accounting for more than 90% of the run time for key bioinformatics applications. This method is particularly expensive for third-generation sequences due to the high computational expense of analyzing these long read lengths (1Kb-1Mb). Given the quadratic overhead of exact pairwise algorithms such as Smith-Waterman, for long alignments, the community primarily relies on approximate algorithms that search only for high-quality alignments and stop early when one is not found. In this work, we present the first GPU optimization of the popular X-drop alignment algorithm, named LOGAN. Results show that our high-performance multi-GPU LOGAN implementation achieves up to 181.6 GCUPS and speed-ups up to 6.6x and 30.7x using 1 and 6 NVIDIA Tesla V100, respectively, over the state-of-the-art software running on two IBM Power9 processors using 168 threads, with equivalent accuracy. We also demonstrate a 2.3x LOGAN speed-up versus ksw2, a state-of-art vectorized algorithm for sequence alignment implemented in minimap2. To highlight the impact of our work on a real-world application, we couple the LOGAN aligner with a many-to-many long-read alignment software called BELLA, and demonstrate that our implementation improves the overall BELLA runtime by up to 10.6x. Finally, we adapt the Roofline model for our optimized kernel and demonstrate that our implementation is near-optimal on the NVIDIA Tesla V100s.},
  doi = {10.11578/dc.20191113.1},
  url = {https://doi.org/10.11578/dc.20191113.1},
  howpublished = {[Computer Software] \url{https://doi.org/10.11578/dc.20191113.1}},
  year = {2019},
  month = {nov}
}
