DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Broom: application for non-redundant storage of high throughput sequencing data

Abstract

Motivation: The data generation capabilities of high throughput sequencing (HTS) instruments have exponentially increased over the last few years, while the cost of sequencing has dramatically decreased, allowing this technology to become widely used in biomedical studies. For small labs and individual researchers, however, storage and transfer of large amounts of HTS data present a significant challenge. The recent trends in increased sequencing quality and genome coverage can be used to reconsider HTS data storage strategies.

Results: We present Broom, a stand-alone application designed to select and store only high-quality sequencing reads at extremely high compression rates. Written in C++, the application accepts single and paired-end reads in FASTQ and FASTA formats and decompresses data in FASTA format.

Availability and implementation: C++ code available at https://scsb.utmb.edu/labgroups/fofanov/broom.asp.

Supplementary information: Supplementary data are available at Bioinformatics online.

Authors:
Albayrak, Levent; Khanipov, Kamil; Golovko, George; Fofanov, Yuriy; Valencia, Alfonso (ed.)
Publication Date:
July 2018
Sponsoring Org.:
USDOE
OSTI Identifier:
1487297
Resource Type:
Published Article
Journal Name:
Bioinformatics
Additional Journal Information:
Journal Name: Bioinformatics; Journal Volume: 35; Journal Issue: 1; Journal ID: ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United Kingdom
Language:
English

Citation Formats

Albayrak, Levent, Khanipov, Kamil, Golovko, George, Fofanov, Yuriy, and Valencia, Alfonso, ed. Broom: application for non-redundant storage of high throughput sequencing data. United Kingdom: N. p., 2018. Web. doi:10.1093/bioinformatics/bty580.
Albayrak, Levent, Khanipov, Kamil, Golovko, George, Fofanov, Yuriy, & Valencia, Alfonso, ed. Broom: application for non-redundant storage of high throughput sequencing data. United Kingdom. https://doi.org/10.1093/bioinformatics/bty580
Albayrak, Levent, Khanipov, Kamil, Golovko, George, Fofanov, Yuriy, and Valencia, Alfonso, ed. Fri . "Broom: application for non-redundant storage of high throughput sequencing data". United Kingdom. https://doi.org/10.1093/bioinformatics/bty580.
@article{osti_1487297,
title = {Broom: application for non-redundant storage of high throughput sequencing data},
author = {Albayrak, Levent and Khanipov, Kamil and Golovko, George and Fofanov, Yuriy and Valencia, ed., Alfonso},
abstractNote = {Motivation: The data generation capabilities of high throughput sequencing (HTS) instruments have exponentially increased over the last few years, while the cost of sequencing has dramatically decreased, allowing this technology to become widely used in biomedical studies. For small labs and individual researchers, however, storage and transfer of large amounts of HTS data present a significant challenge. The recent trends in increased sequencing quality and genome coverage can be used to reconsider HTS data storage strategies. Results: We present Broom, a stand-alone application designed to select and store only high-quality sequencing reads at extremely high compression rates. Written in C++, the application accepts single and paired-end reads in FASTQ and FASTA formats and decompresses data in FASTA format. Availability and implementation: C++ code available at https://scsb.utmb.edu/labgroups/fofanov/broom.asp. Supplementary information: Supplementary data are available at Bioinformatics online.},
doi = {10.1093/bioinformatics/bty580},
journal = {Bioinformatics},
number = 1,
volume = 35,
place = {United Kingdom},
year = {2018},
month = {7}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1093/bioinformatics/bty580

Citation Metrics:
Cited by: 1 work
Citation information provided by
Web of Science
