---
code_id: 32464
site_ownership_code: "LBNL"
open_source: true
repository_link: "https://github.com/albertozeni/LOGAN"
project_type: "OS"
software_type: "S"
official_use_only: {}
developers:
- email: "GGuidi@lbl.gov"
  orcid: ""
  first_name: "Giulia"
  last_name: "Guidi"
  middle_name: ""
  affiliations:
  - "Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)"
- email: "alberto.zeni@mail.polimi.it"
  orcid: ""
  first_name: "Alberto"
  last_name: "Zeni"
  middle_name: ""
  affiliations:
  - "Politecnico di Milano (Italy)"
contributors: []
sponsoring_organizations:
- organization_name: "USDOE"
  funding_identifiers:
  - identifier_type: "AwardNumber"
    identifier_value: "Oak Ridge National Laboratory, Contract No. AC05-00OR22725"
  primary_award: "AC02-05CH11231"
  DOE: true
contributing_organizations: []
research_organizations:
- organization_name: "Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United\
    \ States)"
  DOE: true
- organization_name: "Politecnico di Milano, Milan, Italy"
  DOE: false
related_identifiers: []
release_date: "2019-11-05"
software_title: "LOGAN: High-Performance X-Drop Pairwise Alignment on GPU (LOGAN)\
  \ v1.0"
acronym: "LOGAN v1.0"
doi: "https://doi.org/10.11578/dc.20191113.1"
description: "Pairwise sequence alignment is one of the most computationally intensive\
  \ kernels in genomic data analysis, accounting for more than 90% of the run time\
  \ for key bioinformatics applications. It is particularly expensive for\
  \ third-generation sequencing data because of the long read lengths involved\
  \ (1 Kb to 1 Mb). Because exact pairwise algorithms such as Smith-Waterman\
  \ have quadratic cost, for long alignments the community primarily\
  \ relies on approximate algorithms that search only for high-quality alignments\
  \ and stop early when one is not found. In this work, we present the first GPU optimization\
  \ of the popular X-drop alignment algorithm, named LOGAN. Results show that our\
  \ high-performance multi-GPU LOGAN implementation achieves up to 181.6 GCUPS and\
  \ speed-ups up to 6.6x and 30.7x using 1 and 6 NVIDIA Tesla V100 GPUs, respectively,\
  \ over the state-of-the-art software running on two IBM Power9 processors using\
  \ 168 threads, with equivalent accuracy. We also demonstrate a 2.3x LOGAN speed-up\
  \ versus ksw2, a state-of-the-art vectorized algorithm for sequence alignment implemented\
  \ in minimap2. To highlight the impact of our work on a real-world application,\
  \ we couple the LOGAN aligner with a many-to-many long-read alignment software called\
  \ BELLA, and demonstrate that our implementation improves the overall BELLA runtime\
  \ by up to 10.6x. Finally, we adapt the Roofline model for our optimized kernel\
  \ and demonstrate that our implementation is near-optimal on the NVIDIA Tesla V100s."
programming_languages: []
country_of_origin: "United States"
project_keywords: []
licenses:
- "BSD 3-clause \"New\" or \"Revised\" License"
recipient_org: "LBNL"
site_accession_number: "2020-010"
date_record_added: "2019-11-13"
date_record_updated: "2019-11-13"
is_file_certified: false
is_limited: false
links:
- rel: "citation"
  href: "https://www.osti.gov/doecode/biblio/32464"