DOE PAGES · U.S. Department of Energy
Office of Scientific and Technical Information

Title: The protein structurome of Orthornavirae and its dark matter

Journal Article · mBio (Online)

Metatranscriptomics is uncovering more and more diverse families of viruses with RNA genomes comprising the viral kingdom Orthornavirae in the realm Riboviria. Thorough protein annotation and comparison are essential to get insights into the functions of viral proteins and virus evolution. In addition to sequence- and HMM profile-based methods, protein structure comparison adds a powerful tool to uncover protein functions and relationships. We constructed an Orthornavirae “structurome” consisting of already annotated as well as unannotated (“dark matter”) proteins and domains encoded in viral genomes. We used protein structure modeling and similarity searches to illuminate the remaining dark matter in hundreds of thousands of orthornavirus genomes. The vast majority of the dark matter domains showed either “generic” folds, such as single α-helices, or no high-confidence structure predictions. Nevertheless, a variety of lineage-specific globular domains that were new either to orthornaviruses in general or to particular virus families were identified within the proteomic dark matter of orthornaviruses, including several predicted nucleic acid-binding domains and nucleases. In addition, we identified a case of exaptation of a cellular nucleoside monophosphate kinase as an RNA-binding protein in several virus families. Notwithstanding the continuing discovery of numerous orthornaviruses, it appears that all the protein domains conserved in large groups of viruses have already been identified. The rest of the viral proteome seems to be dominated by poorly structured domains, including intrinsically disordered ones that likely mediate specific virus-host interactions.

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
Fondation pour la Recherche Médicale (FRM); USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF); USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science (BSS)
Grant/Contract Number:
AC02-05CH11231; SC0014664
OSTI ID:
2530328
Journal Information:
mBio (Online), Vol. 16, Issue 2; ISSN 2150-7511
Publisher:
American Society for Microbiology (ASM)
Country of Publication:
United States
Language:
English
