Prediction of bacterial E3 ubiquitin ligase effectors using reduced amino acid peptide fingerprinting
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America, Department of Molecular Microbiology and Immunology, Oregon Health & Science University, Portland, OR, United States of America
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
- Department of Molecular Microbiology and Immunology, Oregon Health & Science University, Portland, OR, United States of America
- Center for Brain Immunology and Glia, University of Virginia, Charlottesville, United States of America
Background: Although pathogenic Gram-negative bacteria lack their own ubiquitination machinery, they have evolved or acquired virulence effectors that can manipulate the host ubiquitination process through structural and/or functional mimicry of host machinery. Many such effectors have been identified in a wide variety of bacterial pathogens, and they share little sequence similarity amongst themselves or with eukaryotic ubiquitin E3 ligases.

Methods: To allow identification of novel bacterial E3 ubiquitin ligase effectors from protein sequences, we have developed a machine learning approach, the SVM-based Identification and Evaluation of Virulence Effector Ubiquitin ligases (SIEVE-Ub). We extend the string kernel approach previously used for sequence classification by introducing reduced amino acid (RED) alphabet encoding for protein sequences.

Results: We found that 14mer peptides with amino acids represented simply as either hydrophobic or hydrophilic provided the best models for discriminating E3 ligases from other effector proteins, with a receiver operating characteristic area under the curve (AUC) of 0.90. For a subset of E3 ubiquitin ligase effectors that do not fall into known sequence-based families, the AUC was 0.82, demonstrating the effectiveness of our method at identifying novel functional family members. Feature selection was used to identify a parsimonious set of 10 RED peptides that provided good discrimination, and these peptides were found to be located in functionally important regions of the proteins involved in E2 and host target protein binding. Our general approach enables construction of models based on other effector functions. We used SIEVE-Ub to predict nine potential novel E3 ligases from a large set of bacterial genomes.

SIEVE-Ub is available for download at https://doi.org/10.6084/m9.figshare.7766984.v1, or at https://github.com/biodataganache/SIEVE-Ub for the most current version.
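The reduced amino acid encoding described in the abstract can be illustrated with a minimal Python sketch. It maps each residue to one of two symbols (hydrophobic or hydrophilic) and counts the overlapping 14mers that would serve as features. The residue grouping in `HYDROPHOBIC` and the function names `reduce_sequence` and `kmer_counts` are assumptions made for illustration; they are not taken from the SIEVE-Ub code, whose actual grouping and feature construction may differ.

```python
from collections import Counter

# ASSUMPTION: an illustrative hydrophobic residue set; the grouping used by
# SIEVE-Ub itself is not stated in the abstract and may differ.
HYDROPHOBIC = set("ACFILMVW")


def reduce_sequence(seq, hydrophobic=HYDROPHOBIC):
    """Map each residue to 'H' (hydrophobic) or 'P' (everything else)."""
    return "".join("H" if aa in hydrophobic else "P" for aa in seq.upper())


def kmer_counts(seq, k=14):
    """Count overlapping k-mers in the reduced-alphabet form of a sequence.

    Sequences shorter than k yield an empty Counter.
    """
    reduced = reduce_sequence(seq)
    return Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real effector protein.
    toy = "MKVLAAGIDEKRSTWFNQ"
    print(reduce_sequence(toy))
    print(kmer_counts(toy, k=14))
```

Counts of this kind could then be assembled into a feature matrix for a support vector machine; that step is omitted here because the abstract does not specify how the classifier was configured.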
- Research Organization:
- Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
- Sponsoring Organization:
- USDOE
- Grant/Contract Number:
- AC05-76RLO01830; AC05-76RL01830
- OSTI ID:
- 1525490
- Alternate ID(s):
- OSTI ID: 1544792
- Report Number(s):
- PNNL-SA-138492; e7055
- Journal Information:
- PeerJ, Vol. 7; ISSN 2167-8359
- Publisher:
- PeerJ
- Country of Publication:
- United States
- Language:
- English