A work stealing based approach for enabling scalable optimal sequence homology detection

Daily, Jeffrey A.; Kalyanaraman, Anantharaman; Krishnamoorthy, Sriram; Vishnu, Abhinav

doi:10.1016/j.jpdc.2014.08.009

Journal Article · May 1, 2015 · Journal of Parallel and Distributed Computing

DOI: https://doi.org/10.1016/j.jpdc.2014.08.009 · OSTI ID: 1191793

Daily, Jeffrey A. [1]; Kalyanaraman, Anantharaman [2]; Krishnamoorthy, Sriram [1]; Vishnu, Abhinav [1]

[1] Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
[2] Washington State Univ., Pullman, WA (United States)

Sequence homology detection is central to a number of bioinformatics applications, including genome sequencing and protein family characterization. Given millions of sequences, the goal is to identify all pairs of sequences that are highly similar (or “homologous”) on the basis of alignment criteria. While there are optimal alignment algorithms to compute pairwise homology, their deployment at large scale is currently not feasible; instead, heuristic methods are used at the expense of quality. Here, we present the design and evaluation of a parallel implementation for conducting optimal homology detection on distributed memory supercomputers. Our approach uses a combination of techniques from asynchronous load balancing (viz. work stealing, dynamic task counters), data replication, and exact-matching filters to achieve homology detection at scale. Results for 2.56M sequences on up to 8K cores show parallel efficiencies of ~75-100%, a time-to-solution of 33 s, and a rate of ~2.0M alignments per second.

Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-76RL01830

OSTI ID: 1191793

Report Number(s): PNNL-SA-103338; KJ0402000

Journal Information: Journal of Parallel and Distributed Computing, Vol. 79-80, Issue C; ISSN 0743-7315

Country of Publication: United States

Language: English

Similar Records

Scalable Parallel Methods for Analyzing Metagenomics Data at Extreme Scale

Thesis/Dissertation · May 1, 2015 · OSTI ID: 1191793

Daily, Jeffrey A.

Towards Scalable Optimal Sequence Homology Detection

Conference · December 26, 2012 · OSTI ID: 1191793

Daily, Jeffrey A.; Krishnamoorthy, Sriram; Kalyanaraman, Anantharaman

A Scalable Parallel Algorithm for Large-Scale Protein Sequence Homology Detection

Conference · September 13, 2010 · OSTI ID: 1191793

Wu, Changjun; Kalyanaraman, Anantharaman; Cannon, William R

Related Subjects

sequence homology
homology detection
pairwise sequence alignment
protein family identification
dynamic load balancing
work stealing
distributed task counters
parallel suffix tree construction
