Metagenomic gene annotation by a homology-independent approach
Conference
·
OSTI ID:1050667
Fully understanding the genetic potential of a microbial community requires functional annotation of all the genes it encodes. The recently developed deep metagenome sequencing approach has enabled rapid identification of millions of genes from a complex microbial community without cultivation. Current homology-based gene annotation fails to detect distantly related or structural homologs. Furthermore, homology searches with millions of genes are very computationally intensive. To overcome these limitations, we developed rhModeller, a homology-independent software pipeline to efficiently annotate genes from metagenomic sequencing projects. Using cellulases and carbonic anhydrases as two independent test cases, we demonstrated that rhModeller is much faster than HMMER with comparable accuracy, reaching 94.5% and 99.9% accuracy for the two test cases, respectively. More importantly, rhModeller can detect novel proteins that do not share significant homology with any known protein family. As approximately 50% of the 2 million genes derived from the cow rumen metagenome failed to be annotated based on sequence homology, we tested whether rhModeller could be used to annotate these genes. Preliminary results suggest that rhModeller is robust in the presence of missense and frameshift mutations, two common errors in metagenomic genes. Applying the pipeline to the cow rumen genes identified 4,990 novel cellulase candidates and 8,196 novel carbonic anhydrase candidates. In summary, we expect rhModeller to dramatically increase the speed and quality of metagenomic gene annotation.
- Research Organization:
- Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
- Sponsoring Organization:
- Genomics Division
- DOE Contract Number:
- AC02-05CH11231
- OSTI ID:
- 1050667
- Report Number(s):
- LBNL-4833E-Poster
- Country of Publication:
- United States
- Language:
- English
Similar Records
Improvement of eukaryotic protein predictions from soil metagenomes
Journal Article
·
Wed Jun 15 20:00:00 EDT 2022
· Scientific Data
·
OSTI ID:1904105
The Viral MetaGenome Annotation Pipeline (VMGAP): an automated tool for the functional annotation of viral Metagenomic shotgun sequencing data
Journal Article
·
Wed Jun 29 20:00:00 EDT 2011
· Standards in Genomic Sciences
·
OSTI ID:1628653
Related Subjects
59 BASIC BIOLOGICAL SCIENCES
60 APPLIED LIFE SCIENCES
97 MATHEMATICS AND COMPUTING
99 GENERAL AND MISCELLANEOUS
ACCURACY
CARBONIC ANHYDRASE
CELLULASE
COWS
CULTIVATION
FUNCTIONALS
GENES
GENETICS
MUTATIONS
PIPELINES
PROTEINS
RUMINANTS
STOMACH
VELOCITY
functional annotation
deep metagenome sequencing
rhModeller
homology
metagenomic gene annotation
missense
frameshift mutations