DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States

Abstract

Primary biodiversity data constitute observations of particular species at given points in time and space. Open-access electronic databases provide unprecedented access to these data, but their usefulness in characterizing species distributions and patterns in biodiversity depends on how complete species inventories are at a given survey location and how uniformly distributed survey locations are along dimensions of time, space, and environment. Our aim was to compare completeness and coverage among three open-access databases representing ten taxonomic groups (amphibians, birds, freshwater bivalves, crayfish, freshwater fish, fungi, insects, mammals, plants, and reptiles) in the contiguous United States. We compiled occurrence records from the Global Biodiversity Information Facility (GBIF), the North American Breeding Bird Survey (BBS), and federally administered fish surveys (FFS). In this study, we aggregated occurrence records by 0.1° × 0.1° grid cells and computed three completeness metrics to classify each grid cell as well-surveyed or not. Next, we compared frequency distributions of surveyed grid cells to background environmental conditions in a GIS and performed Kolmogorov–Smirnov tests to quantify coverage through time, along two spatial gradients, and along eight environmental gradients. The three databases contributed >13.6 million reliable occurrence records distributed among >190,000 grid cells. The percent of well-surveyed grid cells was substantially lower for GBIF (5.2%) than for systematic surveys (BBS and FFS; 82.5%). Still, the large number of GBIF occurrence records produced at least 250 well-surveyed grid cells for six of nine taxonomic groups. Coverages of systematic surveys were less biased across spatial and environmental dimensions but were more biased in temporal coverage compared to GBIF data. GBIF coverages also varied among taxonomic groups, consistent with commonly recognized geographic, environmental, and institutional sampling biases. Lastly, this comprehensive assessment of biodiversity data across the contiguous United States provides a prioritization scheme to fill in the gaps by contributing existing occurrence records to the public domain and planning future surveys.
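The workflow summarized in the abstract (binning records into 0.1° grid cells, scoring each cell's completeness, and comparing distributions with a Kolmogorov–Smirnov test) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not name its three completeness metrics, so the bias-corrected Chao1 ratio used here is an assumption chosen for illustration, and all function names and the record layout (species, latitude, longitude) are hypothetical.

```python
import math
from bisect import bisect_right
from collections import defaultdict

CELL_SIZE = 0.1  # grid resolution in degrees, as stated in the abstract


def cell_index(lat, lon, size=CELL_SIZE):
    """Return the integer (row, col) of the grid cell containing a point."""
    # Rounding before floor guards against float noise such as 351.99999999.
    return (math.floor(round(lat / size, 9)), math.floor(round(lon / size, 9)))


def aggregate(records):
    """Group (species, lat, lon) records into per-cell species counts."""
    cells = defaultdict(lambda: defaultdict(int))
    for species, lat, lon in records:
        cells[cell_index(lat, lon)][species] += 1
    return cells


def chao1_completeness(counts):
    """Observed richness divided by the bias-corrected Chao1 estimate.

    Assumed metric for illustration only; `counts` maps species -> records.
    """
    s_obs = len(counts)
    f1 = sum(1 for n in counts.values() if n == 1)  # singletons
    f2 = sum(1 for n in counts.values() if n == 2)  # doubletons
    s_est = s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
    return s_obs / s_est


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


if __name__ == "__main__":
    # Made-up records, purely to exercise the functions.
    records = [
        ("sp_a", 35.12, -84.31),
        ("sp_a", 35.18, -84.39),
        ("sp_b", 35.14, -84.35),
        ("sp_c", 36.01, -83.95),
    ]
    for cell, counts in aggregate(records).items():
        print(cell, dict(counts), chao1_completeness(counts))
```

In a coverage comparison of the kind the abstract describes, `sample_a` would hold an environmental variable at surveyed cells and `sample_b` the same variable across all background cells; a larger D suggests stronger sampling bias along that gradient. A significance test would additionally need a p-value, which this sketch omits.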

Authors:
Troia, Matthew J. [1]; McManamay, Ryan A. [1]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Environmental Sciences Division
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Energy Efficiency and Renewable Energy (EERE)
OSTI Identifier:
1286862
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
Ecology and Evolution
Additional Journal Information:
Journal Volume: 6; Journal Issue: 14; Journal ID: ISSN 2045-7758
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; Data; Database; Biodiversity; Fish; Open-Access

Citation Formats

Troia, Matthew J., and McManamay, Ryan A. Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States. United States: N. p., 2016. Web. doi:10.1002/ece3.2225.
Troia, Matthew J., & McManamay, Ryan A. Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States. United States. doi:10.1002/ece3.2225.
Troia, Matthew J., and McManamay, Ryan A. "Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States". United States. doi:10.1002/ece3.2225. https://www.osti.gov/servlets/purl/1286862.
@article{osti_1286862,
title = {Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States},
author = {Troia, Matthew J. and McManamay, Ryan A.},
doi = {10.1002/ece3.2225},
journal = {Ecology and Evolution},
number = 14,
volume = 6,
place = {United States},
year = {2016},
month = {6}
}

Citation Metrics:
Cited by: 6 works
Citation information provided by
Web of Science


    Works referencing / citing this record:

    Completeness of Digital Accessible Knowledge (DAK) about terrestrial mammals in the Iberian Peninsula
    journal, March 2019


    A global test of ecoregions
    journal, November 2018

    • Smith, Jeffrey R.; Letten, Andrew D.; Ke, Po-Ju
    • Nature Ecology & Evolution, Vol. 2, Issue 12
    • DOI: 10.1038/s41559-018-0709-x

    Survey completeness of a global citizen‐science database of bird occurrence
    journal, October 2019

    • Sorte, Frank A. La; Somveille, Marius
    • Ecography, Vol. 43, Issue 1
    • DOI: 10.1111/ecog.04632

    Evaluating the data quality of iNaturalist termite records
    journal, May 2020