DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts

Abstract

Summary: Genomics has become an essential technology for surveilling emerging infectious disease outbreaks. A range of technologies and strategies for pathogen genome enrichment and sequencing are being used by laboratories worldwide, together with different, and sometimes ad hoc, analytical procedures for generating genome sequences. A fully integrated analytical process from raw sequence to consensus genome determination, suited to outbreaks such as the ongoing COVID-19 pandemic, is critical to provide a solid genomic basis for epidemiological analyses and well-informed decision making. We have developed a web-based platform and integrated bioinformatic workflows that help to provide consistent, high-quality analysis of SARS-CoV-2 sequencing data generated on either Illumina or Oxford Nanopore Technologies (ONT) platforms. Using an intuitive web-based interface, this workflow automates data quality control, SARS-CoV-2 reference-based genome variant and consensus calling, and lineage determination, and provides the ability to submit the consensus sequence and necessary metadata to GenBank, GISAID and INSDC raw data repositories. We tested workflow usability using real-world data, validated the accuracy of variant and lineage analysis using several test datasets, and performed detailed comparisons with results from the COVID-19 Galaxy Project workflow. Our analyses indicate that EDGE COVID-19 (EC-19) workflows generate high-quality SARS-CoV-2 genomes. Finally, we share a perspective on patterns and impact observed with Illumina versus ONT technologies on workflow congruence and differences.

Availability and implementation: https://edge-covid19.edgebioinformatics.org and https://github.com/LANL-Bioinformatics/EDGE/tree/SARS-CoV2.

Supplementary information: Supplementary data are available at Bioinformatics online.

Authors:
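Lo, Chien-Chi; Shakya, Migun; Connor, Ryan; Davenport, Karen; Flynn, Mark; Gutiérrez, Adán Myers y.; Hu, Bin; Li, Po-E; Jackson, Elais Player; Xu, Yan; Chain, Patrick S. G.; Alkan, Can (ed.)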
Publication Date:
2022-03-24
Sponsoring Org.:
USDOE
OSTI Identifier:
1868101
Alternate Identifier(s):
OSTI ID: 1861895
Grant/Contract Number:
KP160101; 4000150817
Resource Type:
Published Article
Journal Name:
Bioinformatics
Additional Journal Information:
Journal Name: Bioinformatics; Journal Volume: 38; Journal Issue: 10; Journal ID: ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United Kingdom
Language:
English

Citation Formats

Lo, Chien-Chi, Shakya, Migun, Connor, Ryan, Davenport, Karen, Flynn, Mark, Gutiérrez, Adán Myers y., Hu, Bin, Li, Po-E, Jackson, Elais Player, Xu, Yan, Chain, Patrick S. G., and Alkan, ed., Can. EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts. United Kingdom: N. p., 2022. Web. doi:10.1093/bioinformatics/btac176.
Lo, Chien-Chi, Shakya, Migun, Connor, Ryan, Davenport, Karen, Flynn, Mark, Gutiérrez, Adán Myers y., Hu, Bin, Li, Po-E, Jackson, Elais Player, Xu, Yan, Chain, Patrick S. G., & Alkan, ed., Can. EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts. United Kingdom. https://doi.org/10.1093/bioinformatics/btac176
Lo, Chien-Chi, Shakya, Migun, Connor, Ryan, Davenport, Karen, Flynn, Mark, Gutiérrez, Adán Myers y., Hu, Bin, Li, Po-E, Jackson, Elais Player, Xu, Yan, Chain, Patrick S. G., and Alkan, ed., Can. 2022. "EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts". United Kingdom. https://doi.org/10.1093/bioinformatics/btac176.
@article{osti_1868101,
title = {EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts},
author = {Lo, Chien-Chi and Shakya, Migun and Connor, Ryan and Davenport, Karen and Flynn, Mark and Gutiérrez, Adán Myers y. and Hu, Bin and Li, Po-E and Jackson, Elais Player and Xu, Yan and Chain, Patrick S. G. and Alkan, ed., Can},
abstractNote = {Summary: Genomics has become an essential technology for surveilling emerging infectious disease outbreaks. A range of technologies and strategies for pathogen genome enrichment and sequencing are being used by laboratories worldwide, together with different, and sometimes ad hoc, analytical procedures for generating genome sequences. A fully integrated analytical process from raw sequence to consensus genome determination, suited to outbreaks such as the ongoing COVID-19 pandemic, is critical to provide a solid genomic basis for epidemiological analyses and well-informed decision making. We have developed a web-based platform and integrated bioinformatic workflows that help to provide consistent, high-quality analysis of SARS-CoV-2 sequencing data generated on either Illumina or Oxford Nanopore Technologies (ONT) platforms. Using an intuitive web-based interface, this workflow automates data quality control, SARS-CoV-2 reference-based genome variant and consensus calling, and lineage determination, and provides the ability to submit the consensus sequence and necessary metadata to GenBank, GISAID and INSDC raw data repositories. We tested workflow usability using real-world data, validated the accuracy of variant and lineage analysis using several test datasets, and performed detailed comparisons with results from the COVID-19 Galaxy Project workflow. Our analyses indicate that EDGE COVID-19 (EC-19) workflows generate high-quality SARS-CoV-2 genomes. Finally, we share a perspective on patterns and impact observed with Illumina versus ONT technologies on workflow congruence and differences. Availability and implementation: https://edge-covid19.edgebioinformatics.org and https://github.com/LANL-Bioinformatics/EDGE/tree/SARS-CoV2. Supplementary information: Supplementary data are available at Bioinformatics online.},
doi = {10.1093/bioinformatics/btac176},
journal = {Bioinformatics},
number = 10,
volume = 38,
place = {United Kingdom},
year = {2022},
month = {3}
}
