Improved assemblies using a source-agnostic pipeline for MetaGenomic Assembly by Merging (MeGAMerge) of contigs
Abstract
Assembly of metagenomic samples is a complex process: algorithms must address sequencing platform-specific issues (read length, data volume, and/or community complexity) while also handling genomes that differ greatly in nucleotide compositional bias and in abundance. To address these issues, we have developed a post-assembly process: MetaGenomic Assembly by Merging (MeGAMerge). We compare the performance of this process to that of several assemblers, using both real and in silico-generated samples of differing community composition and complexity. MeGAMerge consistently outperforms individual assembly methods, producing larger contigs with an increased number of predicted genes, without replication of data. MeGAMerge contigs are supported by read mapping and contig alignment data, for both synthetically derived and real metagenomic data, as well as by gene prediction analyses and similarity searches. Ultimately, MeGAMerge is a flexible method that generates improved metagenome assemblies and can accommodate upcoming sequencing platforms as well as present and future assembly algorithms.
- Authors:
- Scholz, Matthew; Lo, Chien-Chi; Chain, Patrick S. G.
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States); USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)
- Publication Date:
- 2014-10-01
- Research Org.:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC); U.S. Department of Homeland Security
- OSTI Identifier:
- 1259288
- Grant/Contract Number:
- AC02-05CH11231; HSHQDC08X00790; B104153I; B084531I
- Resource Type:
- Journal Article: Accepted Manuscript
- Journal Name:
- Scientific Reports
- Additional Journal Information:
- Journal Volume: 4; Journal ID: ISSN 2045-2322
- Publisher:
- Nature Publishing Group
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; genome assembly algorithms; genomics; metagenomics; next-generation sequencing
Citation Formats
Scholz, Matthew, Lo, Chien-Chi, and Chain, Patrick S. G. Improved assemblies using a source-agnostic pipeline for MetaGenomic Assembly by Merging (MeGAMerge) of contigs. United States: N. p., 2014. Web. doi:10.1038/srep06480.
Scholz, Matthew, Lo, Chien-Chi, & Chain, Patrick S. G. Improved assemblies using a source-agnostic pipeline for MetaGenomic Assembly by Merging (MeGAMerge) of contigs. United States. https://doi.org/10.1038/srep06480
Scholz, Matthew, Lo, Chien-Chi, and Chain, Patrick S. G. 2014. "Improved assemblies using a source-agnostic pipeline for MetaGenomic Assembly by Merging (MeGAMerge) of contigs". United States. https://doi.org/10.1038/srep06480. https://www.osti.gov/servlets/purl/1259288.
@article{osti_1259288,
title = {Improved assemblies using a source-agnostic pipeline for MetaGenomic Assembly by Merging (MeGAMerge) of contigs},
author = {Scholz, Matthew and Lo, Chien-Chi and Chain, Patrick S. G.},
abstractNote = {Assembly of metagenomic samples is a complex process: algorithms must address sequencing platform-specific issues (read length, data volume, and/or community complexity) while also handling genomes that differ greatly in nucleotide compositional bias and in abundance. To address these issues, we have developed a post-assembly process: MetaGenomic Assembly by Merging (MeGAMerge). We compare the performance of this process to that of several assemblers, using both real and in silico-generated samples of differing community composition and complexity. MeGAMerge consistently outperforms individual assembly methods, producing larger contigs with an increased number of predicted genes, without replication of data. MeGAMerge contigs are supported by read mapping and contig alignment data, for both synthetically derived and real metagenomic data, as well as by gene prediction analyses and similarity searches. Ultimately, MeGAMerge is a flexible method that generates improved metagenome assemblies and can accommodate upcoming sequencing platforms as well as present and future assembly algorithms.},
doi = {10.1038/srep06480},
url = {https://www.osti.gov/biblio/1259288},
journal = {Scientific Reports},
issn = {2045-2322},
volume = 4,
place = {United States},
year = {2014},
month = {10}
}
Works referenced in this record:
Next generation sequencing and bioinformatic bottlenecks: the current state of metagenomic data analysis
journal, February 2012
- Scholz, Matthew B.; Lo, Chien-Chi; Chain, Patrick SG
- Current Opinion in Biotechnology, Vol. 23, Issue 1
Assembly algorithms for next-generation sequencing data
journal, June 2010
- Miller, Jason R.; Koren, Sergey; Sutton, Granger
- Genomics, Vol. 95, Issue 6
Assemblathon 1: A competitive assessment of de novo short read assembly methods
journal, September 2011
- Earl, D.; Bradnam, K.; St. John, J.
- Genome Research, Vol. 21, Issue 12
Scaling metagenome sequence assembly with probabilistic de Bruijn graphs
journal, July 2012
- Pell, J.; Hintze, A.; Canino-Koning, R.
- Proceedings of the National Academy of Sciences, Vol. 109, Issue 33
From genomics to metagenomics
journal, February 2012
- Desai, Narayan; Antonopoulos, Dion; Gilbert, Jack A.
- Current Opinion in Biotechnology, Vol. 23, Issue 1
Integrating genome assemblies with MAIA
journal, September 2010
- Nijkamp, J.; Winterbach, W.; van den Broek, M.
- Bioinformatics, Vol. 26, Issue 18
Velvet: Algorithms for de novo short read assembly using de Bruijn graphs
journal, February 2008
- Zerbino, D. R.; Birney, E.
- Genome Research, Vol. 18, Issue 5
SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler
journal, December 2012
- Luo, Ruibang; Liu, Binghang; Xie, Yinlong
- GigaScience, Vol. 1, Issue 1
IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
journal, April 2012
- Peng, Y.; Leung, H. C. M.; Yiu, S. M.
- Bioinformatics, Vol. 28, Issue 11
Ray Meta: scalable de novo metagenome assembly and profiling
journal, January 2012
- Boisvert, Sébastien; Raymond, Frédéric; Godzaridis, Élénie
- Genome Biology, Vol. 13, Issue 12
Metagenome, metatranscriptome and single-cell sequencing reveal microbial response to Deepwater Horizon oil spill
journal, June 2012
- Mason, Olivia U.; Hazen, Terry C.; Borglin, Sharon
- The ISME Journal, Vol. 6, Issue 9
A novel metatranscriptomic approach to identify gene expression dynamics during extracellular electron transfer
journal, March 2013
- Ishii, Shun’ichi; Suzuki, Shino; Norden-Krichmar, Trina M.
- Nature Communications, Vol. 4, Issue 1
Single-cell and metagenomic analyses indicate a fermentative and saccharolytic lifestyle for members of the OP9 lineage
journal, May 2013
- Dodsworth, Jeremy A.; Blainey, Paul C.; Murugapiran, Senthil K.
- Nature Communications, Vol. 4, Issue 1
Proteogenomic Analysis of a Thermophilic Bacterial Consortium Adapted to Deconstruct Switchgrass
journal, July 2013
- D'haeseleer, Patrik; Gladden, John M.; Allgaier, Martin
- PLoS ONE, Vol. 8, Issue 7, Article No. e68465
De novo assembly of human genomes with massively parallel short read sequencing
journal, December 2009
- Li, R.; Zhu, H.; Ruan, J.
- Genome Research, Vol. 20, Issue 2
Comparative genome assembly
journal, January 2004
- Pop, M.
- Briefings in Bioinformatics, Vol. 5, Issue 3
Minimus: a fast, lightweight genome assembler
journal, January 2007
- Sommer, Daniel D.; Delcher, Arthur L.; Salzberg, Steven L.
- BMC Bioinformatics, Vol. 8, Issue 1
The Sequence Alignment/Map format and SAMtools
journal, June 2009
- Li, H.; Handsaker, B.; Wysoker, A.
- Bioinformatics, Vol. 25, Issue 16
Aligning Short Sequencing Reads with Bowtie
journal, December 2010
- Langmead, Ben
- Current Protocols in Bioinformatics, Vol. 32, Issue 1
Prodigal: prokaryotic gene recognition and translation initiation site identification
journal, March 2010
- Hyatt, Doug; Chen, Gwo-Liang; LoCascio, Philip F.
- BMC Bioinformatics, Vol. 11, Issue 1
Gene and translation initiation site prediction in metagenomic sequences
journal, July 2012
- Hyatt, Doug; LoCascio, Philip F.; Hauser, Loren J.
- Bioinformatics, Vol. 28, Issue 17
Mesobacillus aurantius sp. nov., isolated from an orange-colored pond near a solar saltern
journal, January 2021
- Rai, Anusha; Smita, N.; Shabbir, A.
- Archives of Microbiology, Vol. 203, Issue 4
Works referencing / citing this record:
Integrated multi-omics of the human gut microbiome in a case study of familial type 1 diabetes
journal, October 2016
- Heintz-Buschart, Anna; May, Patrick; Laczny, Cédric C.
- Nature Microbiology, Vol. 2, Issue 1
Improved metagenome assemblies and taxonomic binning using long-read circular consensus sequence data
journal, May 2016
- Frank, J. A.; Pan, Y.; Tooming-Klunderud, A.
- Scientific Reports, Vol. 6, Issue 1
Metagenomic investigation of the geologically unique Hellenic Volcanic Arc reveals a distinctive ecosystem with unexpected physiology
journal, December 2015
- Oulas, Anastasis; Polymenakou, Paraskevi N.; Seshadri, Rekha
- Environmental Microbiology, Vol. 18, Issue 4
Wetland Sediments Host Diverse Microbial Taxa Capable of Cycling Alcohols
journal, April 2019
- Dalcin Martins, Paula; Frank, Jeroen; Mitchell, Hugh
- Applied and Environmental Microbiology, Vol. 85, Issue 12
Patterns in Wetland Microbial Community Composition and Functional Gene Repertoire Associated with Methane Emissions
journal, May 2015
- He, Shaomei; Malfatti, Stephanie A.; McFarland, Jack W.
- mBio, Vol. 6, Issue 3
InteMAP: Integrated metagenomic assembly pipeline for NGS short reads
journal, August 2015
- Lai, Binbin; Wang, Fumeng; Wang, Xiaoqi
- BMC Bioinformatics, Vol. 16, Issue 1
ICoVeR – an interactive visualization tool for verification and refinement of metagenomic bins
journal, May 2017
- Broeksema, Bertjan; Calusinska, Magdalena; McGee, Fintan
- BMC Bioinformatics, Vol. 18, Issue 1
Impact of library preparation protocols and template quantity on the metagenomic reconstruction of a mock microbial community
journal, October 2015
- Bowers, Robert M.; Clum, Alicia; Tice, Hope
- BMC Genomics, Vol. 16, Issue 1
Recovering complete and draft population genomes from metagenome datasets
journal, March 2016
- Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.
- Microbiome, Vol. 4, Issue 1
Viral and metabolic controls on high rates of microbial sulfur and carbon cycling in wetland ecosystems
journal, August 2018
- Dalcin Martins, Paula; Danczak, Robert E.; Roux, Simon
- Microbiome, Vol. 6, Issue 1
Overview of Virus Metagenomic Classification Methods and Their Biological Applications
journal, April 2018
- Nooij, Sam; Schmitz, Dennis; Vennema, Harry
- Frontiers in Microbiology, Vol. 9