Title: Unlocking Short Read Sequencing for Metagenomics


We describe an experimental and computational pipeline yielding millions of reads that can exceed 200 bp with quality scores approaching that of traditional Sanger sequencing. The method combines an automatable gel-less library construction step with paired-end sequencing on a short-read instrument. With appropriately sized library inserts, mate-pair sequences can overlap, and we describe the SHERA software package that joins them to form a longer composite read.
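The joining step mentioned in the abstract can be illustrated with a minimal sketch. This is not the SHERA algorithm: the function names, the `min_overlap` and `max_mismatch_frac` parameters, and the longest-overlap-first search are all assumptions made for illustration. Quality scores, which a real merger would likely use to resolve disagreements in the overlap, are ignored here.

```python
# Illustrative sketch only: a naive overlap-based merge of a read pair.
# All names and thresholds are hypothetical, not taken from SHERA.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def merge_pair(fwd, rev, min_overlap=10, max_mismatch_frac=0.1):
    """Join a forward read and its mate into one composite read.

    The mate is reverse-complemented so both reads share an orientation,
    then candidate overlaps are tried from longest to shortest. The first
    overlap whose mismatch fraction is within the limit is accepted.
    Returns None when no acceptable overlap is found.
    """
    mate = reverse_complement(rev)
    longest = min(len(fwd), len(mate))
    for olen in range(longest, min_overlap - 1, -1):
        tail, head = fwd[-olen:], mate[:olen]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            # Bases in the overlap are taken from the forward read;
            # a quality-aware merger would pick the better-scored base.
            return fwd + mate[olen:]
    return None


if __name__ == "__main__":
    # A made-up 30 bp fragment sequenced as two 20 bp reads that
    # overlap by 10 bp in the middle.
    fragment = "ACGTTGCATGCCAGTAGGCTTACAGATCCG"
    read1 = fragment[:20]
    read2 = reverse_complement(fragment[10:])
    print(merge_pair(read1, read2))
```

Trying the longest overlap first favors short inserts when a sequence is repetitive, which is one of several design choices a production tool would need to weigh against read quality and expected insert size.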

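Rodrigue, Sébastien [1]; Materna, Arne C. [1]; Timberlake, Sonia C. [1]; Blackburn, Matthew C. [1]; Malmstrom, Rex R. [1]; Alm, Eric J. [1]; Chisholm, Sallie W. [1]; Gilbert, Jack Anthony [1]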
  1. Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Dept. of Civil and Environmental Engineering
Publication Date:
July 2010
Research Org.:
Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)
Report Number(s):
Journal ID: ISSN 1932-6203
Resource Type:
Accepted Manuscript
Journal Name:
PLoS ONE
Additional Journal Information:
Journal Volume: 5; Journal Issue: 7; Journal ID: ISSN 1932-6203
Publisher:
Public Library of Science
Country of Publication:
United States

Citation Formats

Rodrigue, Sébastien, Materna, Arne C., Timberlake, Sonia C., Blackburn, Matthew C., Malmstrom, Rex R., Alm, Eric J., Chisholm, Sallie W., and Gilbert, Jack Anthony. Unlocking Short Read Sequencing for Metagenomics. United States: N. p., 2010. Web. doi:10.1371/journal.pone.0011840.
title = {Unlocking Short Read Sequencing for Metagenomics},
author = {Rodrigue, Sébastien and Materna, Arne C. and Timberlake, Sonia C. and Blackburn, Matthew C. and Malmstrom, Rex R. and Alm, Eric J. and Chisholm, Sallie W. and Gilbert, Jack Anthony},
doi = {10.1371/journal.pone.0011840},
journal = {PLoS ONE},
number = 7,
volume = 5,
place = {United States},
year = {2010},
month = {7}

Citation Metrics:
Cited by: 92 works (citation information provided by Web of Science)
