EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts
Abstract
Summary: Genomics has become an essential technology for surveilling emerging infectious disease outbreaks. Laboratories worldwide use a range of technologies and strategies for pathogen genome enrichment and sequencing, together with different, and sometimes ad hoc, analytical procedures for generating genome sequences. A fully integrated analytical process from raw sequence to consensus genome, suited to outbreaks such as the ongoing COVID-19 pandemic, is critical to provide a solid genomic basis for epidemiological analyses and well-informed decision making. We have developed a web-based platform, EDGE COVID-19 (EC-19), with integrated bioinformatic workflows that help to provide consistent, high-quality analysis of SARS-CoV-2 sequencing data generated on either Illumina or Oxford Nanopore Technologies (ONT) platforms. Through an intuitive web-based interface, the workflow automates data quality control, SARS-CoV-2 reference-based variant and consensus calling, and lineage determination, and it can submit the consensus sequence and necessary metadata to GenBank, GISAID and INSDC raw data repositories. We tested workflow usability with real-world data, validated the accuracy of variant and lineage analysis with several test datasets, and compared the results in detail with those from the COVID-19 Galaxy Project workflow. Our analyses indicate that EC-19 workflows generate high-quality SARS-CoV-2 genomes. Finally, we share a perspective on patterns and impact observed with Illumina versus ONT technologies on workflow congruence and differences.
Availability and implementation: https://edge-covid19.edgebioinformatics.org and https://github.com/LANL-Bioinformatics/EDGE/tree/SARS-CoV2.
Supplementary information: Supplementary data are available at Bioinformatics online.
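- Authors:
- Lo, Chien-Chi; Shakya, Migun; Connor, Ryan; Davenport, Karen; Flynn, Mark; Gutiérrez, Adán Myers y.; Hu, Bin; Li, Po-E; Jackson, Elais Player; Xu, Yan; Chain, Patrick S. G.
- Publication Date:
- 2022-03-24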
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1868101
- Alternate Identifier(s):
- OSTI ID: 1861895
- Grant/Contract Number:
- KP160101; 4000150817
- Resource Type:
- Published Article
- Journal Name:
- Bioinformatics
- Additional Journal Information:
- Journal Name: Bioinformatics; Journal Volume: 38; Journal Issue: 10; Journal ID: ISSN 1367-4803
- Publisher:
- Oxford University Press
- Country of Publication:
- United Kingdom
- Language:
- English
Citation Formats
Lo, Chien-Chi, Shakya, Migun, Connor, Ryan, Davenport, Karen, Flynn, Mark, Gutiérrez, Adán Myers y., Hu, Bin, Li, Po-E, Jackson, Elais Player, Xu, Yan, Chain, Patrick S. G., and Alkan, Can (ed.). EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts. United Kingdom: N. p., 2022. Web. doi:10.1093/bioinformatics/btac176.
Lo, Chien-Chi, Shakya, Migun, Connor, Ryan, Davenport, Karen, Flynn, Mark, Gutiérrez, Adán Myers y., Hu, Bin, Li, Po-E, Jackson, Elais Player, Xu, Yan, Chain, Patrick S. G., & Alkan, Can (ed.) (2022). EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts. United Kingdom. https://doi.org/10.1093/bioinformatics/btac176
Lo, Chien-Chi, Shakya, Migun, Connor, Ryan, Davenport, Karen, Flynn, Mark, Gutiérrez, Adán Myers y., Hu, Bin, Li, Po-E, Jackson, Elais Player, Xu, Yan, Chain, Patrick S. G., and Alkan, Can (ed.). 2022. "EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts". United Kingdom. https://doi.org/10.1093/bioinformatics/btac176.
@article{osti_1868101,
title = {EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts},
author = {Lo, Chien-Chi and Shakya, Migun and Connor, Ryan and Davenport, Karen and Flynn, Mark and Gutiérrez, Adán Myers y. and Hu, Bin and Li, Po-E and Jackson, Elais Player and Xu, Yan and Chain, Patrick S. G.},
editor = {Alkan, Can},
abstractNote = {Summary: Genomics has become an essential technology for surveilling emerging infectious disease outbreaks. Laboratories worldwide use a range of technologies and strategies for pathogen genome enrichment and sequencing, together with different, and sometimes ad hoc, analytical procedures for generating genome sequences. A fully integrated analytical process from raw sequence to consensus genome, suited to outbreaks such as the ongoing COVID-19 pandemic, is critical to provide a solid genomic basis for epidemiological analyses and well-informed decision making. We have developed a web-based platform, EDGE COVID-19 (EC-19), with integrated bioinformatic workflows that help to provide consistent, high-quality analysis of SARS-CoV-2 sequencing data generated on either Illumina or Oxford Nanopore Technologies (ONT) platforms. Through an intuitive web-based interface, the workflow automates data quality control, SARS-CoV-2 reference-based variant and consensus calling, and lineage determination, and it can submit the consensus sequence and necessary metadata to GenBank, GISAID and INSDC raw data repositories. We tested workflow usability with real-world data, validated the accuracy of variant and lineage analysis with several test datasets, and compared the results in detail with those from the COVID-19 Galaxy Project workflow. Our analyses indicate that EC-19 workflows generate high-quality SARS-CoV-2 genomes. Finally, we share a perspective on patterns and impact observed with Illumina versus ONT technologies on workflow congruence and differences. Availability and implementation: https://edge-covid19.edgebioinformatics.org and https://github.com/LANL-Bioinformatics/EDGE/tree/SARS-CoV2. Supplementary information: Supplementary data are available at Bioinformatics online.},
doi = {10.1093/bioinformatics/btac176},
journal = {Bioinformatics},
number = 10,
volume = 38,
place = {United Kingdom},
year = {2022},
month = mar
}
https://doi.org/10.1093/bioinformatics/btac176