Ultrahigh-resolution mass spectrometry data associated with the manuscript “A functional microbiome catalog crowdsourced from North American rivers”
- Pacific Northwest National Laboratory (PNNL); Pacific Northwest National Laboratory (PNNL)
- Colorado State University
- Pacific Northwest National Laboratory (PNNL)
This data package is associated with the publication “A functional microbiome catalog crowdsourced from North American rivers” submitted to Nature (Borton et al., 2024; https://www.biorxiv.org/content/10.1101/2023.07.22.550117v1). Predicting elemental cycles and maintaining water quality under increasing anthropogenic influence requires understanding the spatial drivers of river microbiomes. However, identifying the unifying microbial determinants governing river biogeochemistry is hindered by a lack of genome-resolved functional insights and of sampling across multiple rivers. Here we employed a community science effort to accelerate the sampling of river microbiomes and create the Genome Resolved Open Watersheds database (GROWdb). GROWdb is a publicly available resource that paves the way for watershed predictive modeling and microbiome-based management practices. This resource profiled the identity, distribution, function, and expression of thousands of microbial genomes across rivers covering 90% of United States watersheds. We identified the most cosmopolitan microbiome members, while also revealing local drivers of strain endemism across ecological dimensions. We provide the first evidence that microbial functional trait expression followed the tenets of the River Continuum Concept, suggesting the structure and function of river microbiomes are predictable.
The Fourier-transform ion cyclotron resonance mass spectrometry (FTICR-MS) data were one of many data types used to establish the ecological dimensions along which different microbes were detected. This data package contains only the processed FTICR-MS data associated with this manuscript; all other data are accessible via Zenodo (https://zenodo.org/records/8173287), GitHub (https://github.com/jmikayla1991/Genome-Resolved-Open-Watersheds-database-GROWdb), KBase (https://doi.org/10.25982/109073.30/1895615), and NCBI via BioProject PRJNA946291. This dataset consists of (1) a file-level metadata (flmd) file; (2) a data dictionary (dd) file; (3) a readme; and (4) three processed FTICR-MS data files (a ‘data’ file containing peak-by-sample observations, a ‘mol’ file containing peak metadata, and a transformation profile containing transformation-by-sample observations). All files are .csv or .pdf.
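As a rough illustration of how the ‘data’ and ‘mol’ files relate, the sketch below joins peak-by-sample observations to peak metadata on a shared peak identifier and counts, per sample, the observed peaks that carry a formula assignment. The column names (`Mass`, `Sample_A`, `Sample_B`, `MolForm`) and the inline CSV contents are hypothetical toy values, not the actual file layout; the data dictionary (dd) file in the package defines the real headers.

```python
import csv
import io

# Hypothetical toy stand-ins for the 'data' and 'mol' files. Real column
# names and values are defined in the package's data dictionary (dd) file.
DATA_CSV = """Mass,Sample_A,Sample_B
200.05,1,0
300.10,0,1
400.20,1,1
"""
MOL_CSV = """Mass,MolForm
200.05,C10H8O4
300.10,C15H24O6
400.20,
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text)))


def assigned_peaks_per_sample(data_rows, mol_rows, key="Mass"):
    """Count, per sample, the observed peaks that have a formula in the mol table."""
    formulas = {row[key]: row["MolForm"] for row in mol_rows}
    samples = [col for col in data_rows[0] if col != key]
    counts = {sample: 0 for sample in samples}
    for row in data_rows:
        if not formulas.get(row[key]):
            continue  # skip peaks with no (or an empty) formula assignment
        for sample in samples:
            if float(row[sample]) > 0:  # treat any positive value as "observed"
                counts[sample] += 1
    return counts


counts = assigned_peaks_per_sample(read_rows(DATA_CSV), read_rows(MOL_CSV))
print(counts)
```

To try this against the real files, one could replace the inline strings with the contents of the package's .csv files and set `key` to whichever column the data dictionary identifies as the shared peak identifier.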
- Research Organization:
- Environmental System Science Data Infrastructure for a Virtual Ecosystem; River Corridor and Watershed Biogeochemistry SFA
- Sponsoring Organization:
- U.S. DOE > Office of Science > Biological and Environmental Research (BER)
- OSTI ID:
- 2439202
- Country of Publication:
- United States
- Language:
- English
Related Subjects
Biogeochemistry
ESS-DIVE CSV File Formatting Guidelines Reporting Format
ESS-DIVE File Level Metadata Reporting Format
Environmental metabolomics
FTICR-MS
Fourier-transform ion cyclotron resonance
Freshwater
Mass spectrometry
Organic matter
River
River corridor
Stream
WHONDRS
Watershed