Strategies for community-sourced biocuration in bioinformatics: a case study on MIBiG 4.0
Journal Article
· Briefings in Bioinformatics
- Technical Univ. of Denmark, Lyngby (Denmark)
- Wageningen Univ. & Research (Netherlands)
- Flanders Institute for Biotechnology (VIB), Leuven (Belgium); Katholieke Univ. Leuven, Heverlee (Belgium)
- Swiss Federal Institute of Aquatic Science and Technology, Duebendorf (Switzerland)
- Eidgenoessische Technische Hochschule (ETH), Zurich (Switzerland)
- Univ. of California, Santa Barbara, CA (United States)
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States); USDOE Joint Genome Institute (JGI), Berkeley, CA (United States)
- Wageningen Univ. & Research (Netherlands); Univ. of Johannesburg (South Africa)
Biocuration is essential to transform molecular sequence data into standardized, machine-readable resources. Such curated datasets enable comparative analysis, predictive modeling, and data integration across bioinformatics platforms. While professional biocuration is resource-intensive and usually limited to institutional settings, community-driven approaches can mobilize large-scale annotation of specialized datasets and are more resilient to disruptions in scientific funding. Here, we present a model for community-powered curation applied to the Minimum Information about a Biosynthetic Gene Cluster (MIBiG) repository. Through a framework of workflows for metadata capture, annotation validation, and contributor coordination, the MIBiG 4.0 initiative recruited 267 scientists across 178 institutions from 33 countries, volunteering an estimated 4000 h of work. These efforts expanded the MIBiG repository by 22% and enhanced its usability for downstream molecular data analyses, including comparative genomics, natural product discovery, and machine learning applications. We provide strategies and actionable lessons for adopting this model, supporting the sustainability of curated bioinformatics resources central to nucleic acid research and related fields.
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF)
- Grant/Contract Number:
- AC02-05CH11231
- OSTI ID:
- 3013841
- Alternate ID(s):
- OSTI ID: 3013965
- Journal Information:
- Briefings in Bioinformatics, Vol. 26, Issue 6; ISSN 1467-5463; ISSN 1477-4054
- Publisher:
- Oxford University Press
- Country of Publication:
- United States
- Language:
- English