
Title: Evaluation and validation of de novo and hybrid assembly techniques to derive high quality genome sequences

Journal Article · Bioinformatics
 [1];  [2];  [2];  [3];  [3];  [3];  [3]
  1. Univ. of Tennessee, Knoxville, TN (United States). Graduate School of Genome Science and Technology
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Biosciences Division
  3. Univ. of Tennessee, Knoxville, TN (United States). Graduate School of Genome Science and Technology; Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Biosciences Division

Our motivation with this work was to assess the potential of different types of sequence data, combined with de novo and hybrid assembly approaches, to improve existing draft genome sequences. Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in the predicted number of rDNA operon copies for each bacterium were evaluated by PCR and Sanger sequencing, and the validated results were then applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and the most accurate rDNA operon number predictions. This study will aid others looking to improve existing draft genome assemblies. Availability and implementation: all assembly tools except CLC Genomics Workbench are freely available under the GNU General Public License.
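The summary statistics named in the abstract (number of contigs, N50) can be illustrated with a minimal Python sketch. It uses the common definition of N50: the contig length L such that contigs of length L or longer together cover at least half of the total assembly length. The function names `n50` and `summarize` and the example contig lengths are hypothetical, chosen only for illustration. They are not taken from the article, and this is not the authors' evaluation pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and their lengths are
    accumulated; N50 is the length of the contig at which the running
    sum first reaches half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def summarize(contig_lengths):
    """Collect a few basic assembly summary statistics in a dict."""
    lengths = list(contig_lengths)
    return {
        "num_contigs": len(lengths),
        "total_length": sum(lengths),
        "n50": n50(lengths),
    }


# Hypothetical contig lengths (in kb) for two assemblies of the same genome.
fragmented = [80, 70, 50, 40, 30, 20]
contiguous = [250, 40]

print(summarize(fragmented))
print(summarize(contiguous))
```

Fewer contigs and a larger N50 generally indicate a more contiguous assembly. These statistics say nothing about correctness, which is why the study also relies on in silico evaluation tools and on PCR and Sanger validation.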

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1149743
Journal Information:
Bioinformatics, Vol. 30, Issue 19; ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 61 works
Citation information provided by
Web of Science

Figures / Tables (6)