An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

Taylor, Ronald C

doi:10.1186/1471-2105-11-S12-S1

Journal Article · December 21, 2010 · BMC Bioinformatics, 11(Suppl 12):S1

DOI: https://doi.org/10.1186/1471-2105-11-S12-S1 · OSTI ID: 1019222

Bioinformatics researchers are increasingly confronted with the analysis of ultra-large-scale data sets, a problem that will only grow, and at an alarming rate, in coming years. Recent developments in open source software, namely the Hadoop project and associated software, provide a foundation for scaling to petabyte-scale data warehouses on Linux clusters and for fault-tolerant, parallelized analysis of such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, the leading application area to date.

Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-76RL01830

OSTI ID: 1019222

Report Number(s): PNNL-SA-74925; KP1601030; TRN: US1103576

Journal Information: BMC Bioinformatics, 11(Suppl 12):S1, Vol. 11, Issue 12

Country of Publication: United States

Language: English

Similar Records

A case study of tuning MapReduce for efficient Bioinformatics in the cloud

Journal Article · October 6, 2016 · Parallel Computing · OSTI ID: 1019222

Shi, Lizhen; Wang, Zhong; Yu, Weikuan; +1 more

Analyzing petabytes of data with Hadoop

Multimedia · August 21, 2009 · OSTI ID: 1019222

Hammerbacher, Jeff

MARIANE: MApReduce Implementation Adapted for HPC Environments

Conference · July 6, 2011 · OSTI ID: 1019222

Fadika, Zacharia; Dede, Elif; Govindaraju, Madhusudhan; +1 more

Related Subjects

43 PARTICLE ACCELERATORS
ISOCHRONOUS CYCLOTRONS
PROGRAMMING
COMPUTERS
COMPUTER CODES
