Accurate, multi-kb reads resolve complex populations and detect rare microorganisms
- Univ. of California, Berkeley, CA (United States)
- Stanford Univ., CA (United States)
- Illumina Inc. Technology Development, Hayward, CA (United States)
- USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Univ. of California, Berkeley, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Accurate evaluation of microbial communities is essential for understanding global biogeochemical processes and can guide bioremediation and medical treatments. Metagenomics is most commonly used to analyze microbial diversity and metabolic potential, but assemblies of the short reads generated by current sequencing platforms may fail to recover heterogeneous strain populations and rare organisms. Here we used short (150-bp) and long (multi-kb) synthetic reads to evaluate strain heterogeneity and study microorganisms at low abundance in complex microbial communities from terrestrial sediments. The long-read data revealed multiple (probably dozens of) closely related species and strains from previously undescribed Deltaproteobacteria and Aminicenantes (candidate phylum OP8). Notably, these are the most abundant organisms in the communities, yet short-read assemblies achieved only partial genome coverage, mostly in the form of short scaffolds (N50 = ~2200 bp). Genome architecture and metabolic potential for these lineages were reconstructed using a new synteny-based method. Analysis of long-read data also revealed thousands of species whose abundances were <0.1% in all samples. Most of the organisms in this "long tail" of rare organisms belong to phyla that are also represented by abundant organisms. Genes encoding glycosyl hydrolases are significantly more abundant than expected in rare genomes, suggesting that rare species may augment the capability for carbon turnover and confer resilience to changing environmental conditions. Overall, the study showed that a diversity of closely related strains and rare organisms account for a major portion of the communities. These are probably common features of many microbial communities and can be effectively studied using a combination of long and short reads.
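The abstract reports scaffold contiguity as an N50 value. As a minimal sketch of how that statistic is conventionally computed (this is not the authors' pipeline; the function name `n50` and the example lengths are hypothetical and chosen only for illustration), N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembled bases:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Standard definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches at
    least half of the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in bp, not data from the study.
scaffold_lengths = [5000, 3000, 2200, 2000, 1500, 1000, 800]
print(n50(scaffold_lengths))
```

A low N50, such as the ~2200 bp reported for the short-read assemblies, means that half of the assembled bases sit in scaffolds of about that length or shorter, which is why the assemblies are described as fragmented.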
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Biological and Environmental Research (BER)
- Grant/Contract Number:
- AC02-05CH11231; SC0004918
- OSTI ID:
- 1512117
- Journal Information:
- Genome Research, Vol. 25, Issue 4; ISSN 1088-9051
- Publisher:
- Cold Spring Harbor Laboratory Press
- Country of Publication:
- United States
- Language:
- English