DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Fast and accurate metagenotyping of the human gut microbiome with GT-Pro

Journal Article · Nature Biotechnology
 [1];  [2];  [3];  [4];  [5]
  1. Chan Zuckerberg Biohub, San Francisco, CA (United States); Gladstone Institutes, San Francisco, CA (United States)
  2. Chan Zuckerberg Initiative, Redwood City, CA (United States)
  3. Chan Zuckerberg Biohub, San Francisco, CA (United States)
  4. USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  5. Chan Zuckerberg Biohub, San Francisco, CA (United States); Gladstone Institutes, San Francisco, CA (United States); Univ. of California, San Francisco, CA (United States)

Single nucleotide polymorphisms (SNPs) in metagenomics are used to quantify population structure, track strains and identify genetic determinants of microbial phenotypes. However, existing alignment-based approaches for metagenomic SNP detection require high-performance computing and enough read coverage to distinguish SNPs from sequencing errors. To address these issues, we developed the GenoTyper for Prokaryotes (GT-Pro), a suite of methods to catalog SNPs from genomes and use unique k-mers to rapidly genotype these SNPs from metagenomes. Compared to methods that use read alignment, GT-Pro is more accurate and two orders of magnitude faster. Here, using high-quality genomes, we constructed a catalog of 104 million SNPs in 909 human gut species and used unique k-mers targeting this catalog to characterize the global population structure of gut microbes from 7,459 samples. GT-Pro enables fast and memory-efficient metagenotyping of millions of SNPs on a personal computer.
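The unique k-mer idea described in the abstract can be illustrated with a toy sketch. This is not GT-Pro's implementation: the function names, the value k = 5, and the sequences are all hypothetical and chosen only for illustration. The sketch indexes every k-mer that covers a known SNP for each allele, keeps only k-mers that are unambiguous with respect to the reference, and then counts exact matches in reads. It ignores reverse complements, sequencing errors, and multi-genome catalogs.

```python
from collections import Counter

K = 5  # toy k-mer length (assumption; real tools use much longer k-mers)

def build_index(ref, snps, k=K):
    """Map unique SNP-covering k-mers to (position, allele).

    snps: dict of 0-based position -> tuple of alleles, reference allele included.
    """
    candidates = {}  # k-mer -> set of (pos, allele) targets it could report
    for pos, alleles in snps.items():
        for allele in alleles:
            seq = ref[:pos] + allele + ref[pos + 1:]
            first = max(0, pos - k + 1)
            last = min(pos, len(seq) - k)
            for start in range(first, last + 1):
                candidates.setdefault(seq[start:start + k], set()).add((pos, allele))

    # Count every k-mer in the reference to detect off-target occurrences.
    ref_counts = Counter(ref[i:i + k] for i in range(len(ref) - k + 1))

    index = {}
    for kmer, targets in candidates.items():
        if len(targets) != 1:
            continue  # ambiguous between SNPs or alleles
        (pos, allele), = targets
        expected = 1 if ref[pos] == allele else 0
        if ref_counts[kmer] == expected:
            index[kmer] = (pos, allele)  # occurs nowhere else in the reference
    return index

def genotype(reads, index, k=K):
    """Count exact k-mer hits per (position, allele) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            hit = index.get(read[i:i + k])
            if hit is not None:
                counts[hit] += 1
    return counts

# Toy data: one SNP at position 8 with reference allele A and alternate allele T.
ref = "ACGTTGCAAGGCTTAC"
index = build_index(ref, {8: ("A", "T")})
reads = ["TTGCATGGCT", "GCAAGGCT"]  # first carries T, second carries A
counts = genotype(reads, index)
```

Because lookup is an exact dictionary match rather than an alignment, each read costs one hash probe per k-mer, which is the intuition behind the speed advantage the abstract reports over alignment-based methods.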

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
National Science Foundation (NSF); USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1894075
Journal Information:
Nature Biotechnology, Vol. 40, Issue 4; ISSN 1087-0156
Publisher:
Springer Nature
Country of Publication:
United States
Language:
English
