Extreme-Scale De Novo Genome Assembly
- Intel Corporation, Santa Clara, CA (United States)
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division
De novo whole-genome assembly reconstructs a genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme-scale analysis through efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses a different part of a computer system. This chapter explains in detail the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and the communication costs. We present performance results for assembling the human genome and the large hexaploid wheat genome on supercomputers using up to tens of thousands of cores.
- Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
- Grant/Contract Number: AC02-05CH11231
- OSTI ID: 1398520
- Country of Publication: United States
- Language: English
Similar Records
Parallel String Graph Construction and Transitive Reduction for De Novo Genome Assembly
Journal Article · Fri Apr 30 20:00:00 EDT 2021 · Proceedings - IEEE International Parallel and Distributed Processing Symposium (IPDPS) · OSTI ID:1818231
Extreme Scale De Novo Metagenome Assembly
Conference · Thu Mar 14 00:00:00 EDT 2019 · OSTI ID:1581597
Performance Characterization of De Novo Genome Assembly on Leading Parallel Systems.
Conference · Tue Aug 01 00:00:00 EDT 2017 · Lecture Notes in Computer Science, vol 10417. Springer, Cham · OSTI ID:1567514