Title: AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture

Journal Article · Database

The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are properly stored, described, integrated and shared, so that they can be easily utilized in future analyses. AgBioData (https://www.agbiodata.org) is a consortium of people working at agricultural biological databases, data archives and knowledgebases who strive to identify common issues in database development, curation and management, with the goal of creating database products that are more Findable, Accessible, Interoperable and Reusable. We strive to promote authentic, detailed, accurate and explicit communication between all parties involved in scientific data. As a step toward this goal, we present the current state of biocuration, ontologies, metadata and persistence, database platforms, programmatic (machine) access to data, communication and sustainability with regard to data curation. Each section describes challenges and opportunities for these topics, along with recommendations and best practices.

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE
Contributing Organization:
AgBioData consortium
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1471253
Alternate ID(s):
OSTI ID: 1490698
Journal Information:
Database, Vol. 2018; ISSN 1758-0463
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 30 works
Citation information provided by
Web of Science

Cited By (9)

Applying FAIR Principles to Plant Phenotypic Data Management in GnpIS journal April 2019
15 years of GDR: New data and functionality in the Genome Database for Rosaceae journal October 2018
Agronomic Linked Data (AgroLD): A knowledge-based system to enable integrative biology in agronomy journal November 2018
Plant Reactome: a knowledgebase and resource for comparative pathway analysis journal November 2019
MaizeGDB 2018: the maize multi-genome genetics and genomics database journal November 2018
Cyberinfrastructure and resources to enable an integrative approach to studying forest trees journal June 2019
The future of legume genetic data resources: Challenges, opportunities, and priorities journal November 2019
Tripal v3: an ontology-based toolkit for construction of FAIR biological community databases journal January 2019