DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Assessment of BOLD and GenBank – Their accuracy and reliability for the identification of biological materials

Journal Article · PLoS ONE
[1];  [1];  [1]
  1. Federal Bureau of Investigation (FBI) Lab. Division, Quantico, VA (United States). Counterterrorism and Forensic Science Research Unit

Taxonomic identification of biological materials can be achieved through DNA barcoding, where an unknown “barcode” sequence is compared to a reference database. In many disciplines, obtaining accurate taxonomic identifications can be imperative (e.g., evolutionary biology, food regulatory compliance, forensics). The Barcode of Life Data Systems (BOLD) and GenBank are the main public repositories of DNA barcode sequences. In this study, an assessment of the accuracy and reliability of sequences in these databases was performed. To achieve this, 1) curated reference materials for plants, macro-fungi and insects were obtained from national collections, 2) relevant barcode sequences (rbcL, matK, trnH-psbA, ITS and COI) were generated from these reference samples and used to search both databases, and 3) optimal search parameters were determined that ensure the best match to the known species in either database. GenBank outperformed BOLD for species-level identification of insect taxa (53% and 35%, respectively), whereas the two databases performed comparably for plants (~81%) and macro-fungi (~57%). Results illustrated that using a multi-locus barcode approach increased identification success. This study outlines the utility of the BLAST search tool in GenBank and the BOLD identification engine for taxonomic identifications and identifies some precautions needed when using public sequence repositories in applied scientific disciplines.
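The scoring idea in the abstract (compare the top database match to the known species, then combine loci) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the study: the 97% identity cutoff, the rule that a tie between species counts as no species-level call, the rule that loci must agree, the names `Hit`, `best_species`, `multilocus_call` and `success_rate`, and the placeholder taxa and identity values in `EXAMPLE_SAMPLES`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One database match for a query barcode (hypothetical structure)."""
    species: str
    identity: float  # percent identity reported by the search tool


def best_species(hits, min_identity=97.0):
    """Return the species of the top-scoring hit for one locus.

    Returns None when no hit reaches the cutoff, or when the top score
    is shared by more than one species. The 97% cutoff is an assumed
    value, not a parameter reported in the abstract.
    """
    passing = [h for h in hits if h.identity >= min_identity]
    if not passing:
        return None
    top = max(h.identity for h in passing)
    top_species = {h.species for h in passing if h.identity == top}
    return top_species.pop() if len(top_species) == 1 else None


def multilocus_call(hits_by_locus, min_identity=97.0):
    """Combine per-locus calls; accept a species only if every locus
    that produced a call names the same species (assumed rule)."""
    calls = {best_species(h, min_identity) for h in hits_by_locus.values()}
    calls.discard(None)
    return calls.pop() if len(calls) == 1 else None


def success_rate(samples, min_identity=97.0):
    """Fraction of reference samples whose call equals the known species."""
    if not samples:
        raise ValueError("samples must not be empty")
    correct = sum(
        multilocus_call(hits, min_identity) == known
        for known, hits in samples
    )
    return correct / len(samples)


# Placeholder data: (known species, {locus: [hits]}). Not real results.
EXAMPLE_SAMPLES = [
    ("Genus_a sp1", {
        "rbcL": [Hit("Genus_a sp1", 99.5), Hit("Genus_a sp2", 99.5)],
        "matK": [Hit("Genus_a sp1", 99.0), Hit("Genus_a sp2", 97.5)],
    }),
    ("Genus_b sp1", {
        "COI": [Hit("Genus_b sp2", 98.0)],
    }),
]

if __name__ == "__main__":
    print(success_rate(EXAMPLE_SAMPLES))
```

In the first placeholder sample, rbcL alone cannot separate the two species because the top score is tied, but matK resolves the tie. This is one way a multi-locus approach can raise identification success. The second sample shows a confident but wrong call, the kind of outcome that lowers the success rate.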

Research Organization:
Federal Bureau of Investigation (FBI) Lab. Division, Quantico, VA (United States). Counterterrorism and Forensic Science Research Unit
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
SC0014664
OSTI ID:
1627883
Journal Information:
PLoS ONE, Vol. 14, Issue 6; ISSN 1932-6203
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 116 works
Citation information provided by
Web of Science

