DOE PAGES

Title: Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

During the past decade, DNA sequencing output has been largely dominated by second-generation sequencing platforms, such as Illumina, which are characterized by low cost, high throughput, and shorter read lengths. The emergence and development of so-called third-generation sequencing platforms, such as PacBio, have permitted exceptionally long reads (over 20 kb) to be generated. Owing to read-length increases, algorithm improvements, and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high-quality sequence datasets that span three generations of sequencing technologies, contain six types of data from four NGS platforms, and originate from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community in evaluating upcoming NGS platforms and in comparing existing and novel bioinformatics approaches, and it will encourage interest in the development of innovative experimental and computational methods for NGS data.
Authors:
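 Utturkar, Sagar M. [1]; Klingeman, Dawn Marie [1]; Bruno-Barcena, José M. [2]; Chinn, Mari S. [1]; Grunden, Amy [2]; Köpke, Michael [3]; Brown, Steven D. [1]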
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  2. North Carolina State Univ., Raleigh, NC (United States)
  3. LanzaTech, Skokie, IL (United States)
Publication Date:
Grant/Contract Number:
AC05-00OR22725
Type:
Accepted Manuscript
Journal Name:
Scientific Data
Additional Journal Information:
Journal Volume: 2; Journal ID: ISSN 2052-4463
Publisher:
Nature Publishing Group
Research Org:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). BioEnergy Science Center (BESC)
Sponsoring Org:
USDOE
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES
OSTI Identifier:
1185931

Utturkar, Sagar M., Klingeman, Dawn Marie, Bruno-Barcena, José M., Chinn, Mari S., Grunden, Amy, Köpke, Michael, and Brown, Steven D. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies. United States: N. p., 2015. Web. doi:10.1038/sdata.2015.14.
Utturkar, Sagar M., Klingeman, Dawn Marie, Bruno-Barcena, José M., Chinn, Mari S., Grunden, Amy, Köpke, Michael, & Brown, Steven D. (2015). Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies. United States. doi:10.1038/sdata.2015.14.
Utturkar, Sagar M., Klingeman, Dawn Marie, Bruno-Barcena, José M., Chinn, Mari S., Grunden, Amy, Köpke, Michael, and Brown, Steven D. 2015. "Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies". United States. doi:10.1038/sdata.2015.14. https://www.osti.gov/servlets/purl/1185931.
@article{osti_1185931,
title = {Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies},
author = {Utturkar, Sagar M. and Klingeman, Dawn Marie and Bruno-Barcena, José M. and Chinn, Mari S. and Grunden, Amy and Köpke, Michael and Brown, Steven D.},
abstractNote = {During the past decade, DNA sequencing output has been largely dominated by second-generation sequencing platforms, such as Illumina, which are characterized by low cost, high throughput, and shorter read lengths. The emergence and development of so-called third-generation sequencing platforms, such as PacBio, have permitted exceptionally long reads (over 20 kb) to be generated. Owing to read-length increases, algorithm improvements, and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high-quality sequence datasets that span three generations of sequencing technologies, contain six types of data from four NGS platforms, and originate from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community in evaluating upcoming NGS platforms and in comparing existing and novel bioinformatics approaches, and it will encourage interest in the development of innovative experimental and computational methods for NGS data.},
doi = {10.1038/sdata.2015.14},
journal = {Scientific Data},
volume = 2,
place = {United States},
year = {2015},
month = {4}
}
