DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Extreme-Scale De Novo Genome Assembly

Abstract

De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme-scale analysis via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses a different part of a computer system. This chapter explains in detail the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and the communication costs. We present performance results for assembling the human genome and the large hexaploid wheat genome on supercomputers using up to tens of thousands of cores.

Authors:
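 Georganas, Evangelos [1]; Hofmeyr, Steven [2]; Egan, Rob [3]; Buluc, Aydin [2]; Oliker, Leonid [2]; Rokhsar, Daniel [3]; Yelick, Katherine [2]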
  1. Intel Corporation, Santa Clara, CA (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.
  3. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division
Publication Date:
September 2017
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1398520
Grant/Contract Number:  
[AC02-05CH11231]
Resource Type:
Accepted Manuscript
Resource Relation:
[Related Information: Chapter in Exascale Scientific Applications: Programming Approaches for Scalability, Performance, and Portability, Straatsma, Antypas, Williams (eds.)]
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES; 59 BASIC BIOLOGICAL SCIENCES

Citation Formats

Georganas, Evangelos, Hofmeyr, Steven, Egan, Rob, Buluc, Aydin, Oliker, Leonid, Rokhsar, Daniel, and Yelick, Katherine. Extreme-Scale De Novo Genome Assembly. United States: N. p., 2017. Web. doi:10.1201/b21930.
Georganas, Evangelos, Hofmeyr, Steven, Egan, Rob, Buluc, Aydin, Oliker, Leonid, Rokhsar, Daniel, & Yelick, Katherine. Extreme-Scale De Novo Genome Assembly. United States. doi:10.1201/b21930.
Georganas, Evangelos, Hofmeyr, Steven, Egan, Rob, Buluc, Aydin, Oliker, Leonid, Rokhsar, Daniel, and Yelick, Katherine. 2017. "Extreme-Scale De Novo Genome Assembly". United States. doi:10.1201/b21930. https://www.osti.gov/servlets/purl/1398520.
@article{osti_1398520,
title = {Extreme-Scale De Novo Genome Assembly},
author = {Georganas, Evangelos and Hofmeyr, Steven and Egan, Rob and Buluc, Aydin and Oliker, Leonid and Rokhsar, Daniel and Yelick, Katherine},
abstractNote = {De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme-scale analysis via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses a different part of a computer system. This chapter explains in detail the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and the communication costs. We present performance results for assembling the human genome and the large hexaploid wheat genome on supercomputers using up to tens of thousands of cores.},
doi = {10.1201/b21930},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2017},
month = {9}
}
