DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States

Abstract

Primary biodiversity data constitute observations of particular species at given points in time and space. Open-access electronic databases provide unprecedented access to these data, but their usefulness in characterizing species distributions and patterns in biodiversity depends on how complete species inventories are at a given survey location and how uniformly distributed survey locations are along dimensions of time, space, and environment. Our aim was to compare completeness and coverage among three open-access databases representing ten taxonomic groups (amphibians, birds, freshwater bivalves, crayfish, freshwater fish, fungi, insects, mammals, plants, and reptiles) in the contiguous United States. We compiled occurrence records from the Global Biodiversity Information Facility (GBIF), the North American Breeding Bird Survey (BBS), and federally administered fish surveys (FFS). In this study, we aggregated occurrence records by 0.1° × 0.1° grid cells and computed three completeness metrics to classify each grid cell as well-surveyed or not. Next, we compared frequency distributions of surveyed grid cells to background environmental conditions in a GIS and performed Kolmogorov–Smirnov tests to quantify coverage through time, along two spatial gradients, and along eight environmental gradients. The three databases contributed >13.6 million reliable occurrence records distributed among >190,000 grid cells. The percent of well-surveyed grid cells was substantially lower for GBIF (5.2%) than for systematic surveys (BBS and FFS; 82.5%). Still, the large number of GBIF occurrence records produced at least 250 well-surveyed grid cells for six of nine taxonomic groups. Coverages of systematic surveys were less biased across spatial and environmental dimensions but were more biased in temporal coverage compared to GBIF data. GBIF coverages also varied among taxonomic groups, consistent with commonly recognized geographic, environmental, and institutional sampling biases. Lastly, this comprehensive assessment of biodiversity data across the contiguous United States provides a prioritization scheme to fill in the gaps by contributing existing occurrence records to the public domain and planning future surveys.
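The grid aggregation and coverage comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`grid_cell`, `species_per_cell`, `ks_statistic`), the toy records, and the plain two-sample form of the Kolmogorov–Smirnov statistic are all hypothetical choices made for this example. The abstract does not define its three completeness metrics, so none is implemented here.

```python
import bisect
import math
from collections import defaultdict

CELL_SIZE = 0.1  # grid resolution in degrees, as stated in the abstract


def grid_cell(lat, lon, size=CELL_SIZE):
    """Return integer (row, col) indices of the grid cell containing a point."""
    return (math.floor(lat / size), math.floor(lon / size))


def species_per_cell(records, size=CELL_SIZE):
    """Group (species, lat, lon) records into the set of species seen per cell."""
    cells = defaultdict(set)
    for species, lat, lon in records:
        cells[grid_cell(lat, lon, size)].add(species)
    return dict(cells)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, v) / len(a)
        cdf_b = bisect.bisect_right(b, v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy occurrence records (hypothetical species labels and coordinates).
records = [
    ("sp_a", 35.93, -84.31),
    ("sp_a", 35.97, -84.35),
    ("sp_b", 35.93, -84.31),
    ("sp_b", 36.12, -84.31),
]
cells = species_per_cell(records)

# Hypothetical gradient values (e.g. elevation) for surveyed cells versus
# all background cells; a larger statistic suggests more biased coverage.
surveyed = [1, 2, 3, 4]
background = [3, 4, 5, 6]
bias = ks_statistic(surveyed, background)
```

In this sketch, a statistic near 0 would mean surveyed cells follow the background distribution along that gradient, while a value near 1 would mean surveys sample a very different part of it. A significance test would additionally need a p-value, which is omitted here.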

Authors:
 Troia, Matthew J. [1]; McManamay, Ryan A. [1]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Environmental Sciences Division
Publication Date:
2016-06-12
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Energy Efficiency and Renewable Energy (EERE)
OSTI Identifier:
1286862
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
Ecology and Evolution
Additional Journal Information:
Journal Volume: 6; Journal Issue: 14; Journal ID: ISSN 2045-7758
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; Data; Database; Biodiversity; Fish; Open-Access

Citation Formats

Troia, Matthew J., and McManamay, Ryan A. Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States. United States: N. p., 2016. Web. doi:10.1002/ece3.2225.
Troia, Matthew J., & McManamay, Ryan A. (2016). Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States. United States. https://doi.org/10.1002/ece3.2225
Troia, Matthew J., and McManamay, Ryan A. 2016. "Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States". United States. https://doi.org/10.1002/ece3.2225. https://www.osti.gov/servlets/purl/1286862.
@article{osti_1286862,
title = {Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States},
author = {Troia, Matthew J. and McManamay, Ryan A.},
abstractNote = {Primary biodiversity data constitute observations of particular species at given points in time and space. Open-access electronic databases provide unprecedented access to these data, but their usefulness in characterizing species distributions and patterns in biodiversity depends on how complete species inventories are at a given survey location and how uniformly distributed survey locations are along dimensions of time, space, and environment. Our aim was to compare completeness and coverage among three open-access databases representing ten taxonomic groups (amphibians, birds, freshwater bivalves, crayfish, freshwater fish, fungi, insects, mammals, plants, and reptiles) in the contiguous United States. We compiled occurrence records from the Global Biodiversity Information Facility (GBIF), the North American Breeding Bird Survey (BBS), and federally administered fish surveys (FFS). In this study, we aggregated occurrence records by 0.1° × 0.1° grid cells and computed three completeness metrics to classify each grid cell as well-surveyed or not. Next, we compared frequency distributions of surveyed grid cells to background environmental conditions in a GIS and performed Kolmogorov–Smirnov tests to quantify coverage through time, along two spatial gradients, and along eight environmental gradients. The three databases contributed >13.6 million reliable occurrence records distributed among >190,000 grid cells. The percent of well-surveyed grid cells was substantially lower for GBIF (5.2%) than for systematic surveys (BBS and FFS; 82.5%). Still, the large number of GBIF occurrence records produced at least 250 well-surveyed grid cells for six of nine taxonomic groups. Coverages of systematic surveys were less biased across spatial and environmental dimensions but were more biased in temporal coverage compared to GBIF data. GBIF coverages also varied among taxonomic groups, consistent with commonly recognized geographic, environmental, and institutional sampling biases. Lastly, this comprehensive assessment of biodiversity data across the contiguous United States provides a prioritization scheme to fill in the gaps by contributing existing occurrence records to the public domain and planning future surveys.},
doi = {10.1002/ece3.2225},
journal = {Ecology and Evolution},
number = 14,
volume = 6,
place = {United States},
year = {2016},
month = {6}
}


Citation Metrics:
Cited by: 63 works (Web of Science)


