DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Identifying genomic data use with the Data Citation Explorer

Journal Article · Scientific Data

Increases in sequencing capacity, combined with rapid accumulation of publications and associated data resources, have increased the complexity of maintaining associations between literature and genomic data. As the volume of literature and data has exceeded the capacity of manual curation, automated approaches to maintaining and confirming associations among these resources have become necessary. Here we present the Data Citation Explorer (DCE), which discovers literature incorporating genomic data that was not formally cited. This service provides advantages over manual curation methods including consistent resource coverage, metadata enrichment, documentation of new use cases, and identification of conflicting metadata. The service reduces labor costs associated with manual review, improves the quality of genome metadata maintained by the U.S. Department of Energy Joint Genome Institute (JGI), and increases the number of known publications that incorporate its data products. The DCE facilitates an understanding of JGI impact, improves credit attribution for data generators, and can encourage data sharing by allowing scientists to see how reuse amplifies the impact of their original studies.

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF); USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
2476422
Journal Information:
Scientific Data, Vol. 11, Issue 1; ISSN 2052-4463
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
