DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Machine learning predicts new anti-CRISPR proteins

Abstract

The increasing use of CRISPR–Cas9 in medicine, agriculture, and synthetic biology has accelerated the drive to discover new CRISPR–Cas inhibitors as potential mechanisms of control for gene editing applications. Many anti-CRISPRs have been found that inhibit the CRISPR–Cas adaptive immune system. However, comparing all currently known anti-CRISPRs does not reveal a shared set of properties for facile bioinformatic identification of new anti-CRISPR families. Here, we describe AcRanker, a machine learning-based method to aid direct identification of new potential anti-CRISPRs using only protein sequence information. Using a training set of known anti-CRISPRs, we built a model based on XGBoost ranking. We then applied AcRanker to predict candidate anti-CRISPRs from predicted prophage regions within self-targeting bacterial genomes and discovered two previously unknown anti-CRISPRs: AcrIIA20 (ML1) and AcrIIA21 (ML8). We show that AcrIIA20 strongly inhibits Streptococcus iniae Cas9 (SinCas9) and weakly inhibits Streptococcus pyogenes Cas9 (SpyCas9). We also show that AcrIIA21 inhibits SpyCas9, Staphylococcus aureus Cas9 (SauCas9) and SinCas9 with low potency. The addition of AcRanker to the anti-CRISPR discovery toolkit allows researchers to directly rank potential anti-CRISPR candidate genes for increased speed in testing and validation of new anti-CRISPRs. A web server implementation for AcRanker is available online at http://acranker.pythonanywhere.com/.
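
The ranking idea in the abstract (score every candidate protein from its sequence alone, then test the top-ranked candidates first) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the amino acid composition features, the linear stand-in scorer, the toy weights, and the candidate names are all hypothetical. The published method uses a trained XGBoost ranking model, which this sketch does not reproduce.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature vectors line up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq: str) -> list[float]:
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def rank_candidates(proteins: dict[str, str], weights: list[float]) -> list[tuple[str, float]]:
    """Score each protein with a linear stand-in model and sort best-first.

    A trained ranking model would replace the weighted sum used here.
    """
    scored = []
    for name, seq in proteins.items():
        feats = composition(seq)
        score = sum(w * f for w, f in zip(weights, feats))
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical weights: reward lysine (K) and glutamate (E) content only.
toy_weights = [1.0 if a in "KE" else 0.0 for a in AMINO_ACIDS]

# Hypothetical candidate proteins from one predicted prophage region.
toy_proteins = {
    "cand_a": "MKKEEKKE",
    "cand_b": "MAAGGLLV",
    "cand_c": "MKEAAGLV",
}

ranking = rank_candidates(toy_proteins, toy_weights)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

Ranking within a single proteome or prophage region, rather than classifying proteins in isolation, matches the use case described in the abstract: the output is an ordered shortlist for experimental testing.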

Authors:
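 Eitzinger, Simon [1]; Asif, Amina [2]; Watters, Kyle E. [1]; Iavarone, Anthony T. [1]; Knott, Gavin J. [1]; Doudna, Jennifer A. [3]; Minhas, Fayyaz ul Amir Afsar [4]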
  1. Univ. of California, Berkeley, CA (United States)
  2. Pakistan Institute of Engineering and Applied Sciences (PIEAS), PO Nilore, Islamabad (Pakistan); National University of Computer and Emerging Sciences (NUCES), Islamabad (Pakistan). FAST School of Computing
  3. Univ. of California, Berkeley, CA (United States). Innovative Genomics Institute and Howard Hughes Medical Institute; Gladstone Institutes, San Francisco, CA (United States). Institute of Data Science and Biotechnology; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  4. Pakistan Institute of Engineering and Applied Sciences (PIEAS), PO Nilore, Islamabad (Pakistan); University of Warwick, Coventry (United Kingdom)
Publication Date:
2020-04-14
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC); Defense Advanced Research Projects Agency (DARPA); National Science Foundation (NSF); The Paul G. Allen Frontiers Group; National Institutes of Health (NIH)
OSTI Identifier:
1633284
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
Nucleic Acids Research
Additional Journal Information:
Journal Volume: 48; Journal Issue: 9; Journal ID: ISSN 0305-1048
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES

Citation Formats

Eitzinger, Simon, Asif, Amina, Watters, Kyle E., Iavarone, Anthony T., Knott, Gavin J., Doudna, Jennifer A., and Minhas, Fayyaz ul Amir Afsar. Machine learning predicts new anti-CRISPR proteins. United States: N. p., 2020. Web. doi:10.1093/nar/gkaa219.
Eitzinger, Simon, Asif, Amina, Watters, Kyle E., Iavarone, Anthony T., Knott, Gavin J., Doudna, Jennifer A., & Minhas, Fayyaz ul Amir Afsar. Machine learning predicts new anti-CRISPR proteins. United States. https://doi.org/10.1093/nar/gkaa219
Eitzinger, Simon, Asif, Amina, Watters, Kyle E., Iavarone, Anthony T., Knott, Gavin J., Doudna, Jennifer A., and Minhas, Fayyaz ul Amir Afsar. 2020. "Machine learning predicts new anti-CRISPR proteins". United States. https://doi.org/10.1093/nar/gkaa219. https://www.osti.gov/servlets/purl/1633284.
@article{osti_1633284,
title = {Machine learning predicts new anti-CRISPR proteins},
author = {Eitzinger, Simon and Asif, Amina and Watters, Kyle E. and Iavarone, Anthony T. and Knott, Gavin J. and Doudna, Jennifer A. and Minhas, Fayyaz ul Amir Afsar},
abstractNote = {The increasing use of CRISPR–Cas9 in medicine, agriculture, and synthetic biology has accelerated the drive to discover new CRISPR–Cas inhibitors as potential mechanisms of control for gene editing applications. Many anti-CRISPRs have been found that inhibit the CRISPR–Cas adaptive immune system. However, comparing all currently known anti-CRISPRs does not reveal a shared set of properties for facile bioinformatic identification of new anti-CRISPR families. Here, we describe AcRanker, a machine learning-based method to aid direct identification of new potential anti-CRISPRs using only protein sequence information. Using a training set of known anti-CRISPRs, we built a model based on XGBoost ranking. We then applied AcRanker to predict candidate anti-CRISPRs from predicted prophage regions within self-targeting bacterial genomes and discovered two previously unknown anti-CRISPRs: AcrIIA20 (ML1) and AcrIIA21 (ML8). We show that AcrIIA20 strongly inhibits Streptococcus iniae Cas9 (SinCas9) and weakly inhibits Streptococcus pyogenes Cas9 (SpyCas9). We also show that AcrIIA21 inhibits SpyCas9, Staphylococcus aureus Cas9 (SauCas9) and SinCas9 with low potency. The addition of AcRanker to the anti-CRISPR discovery toolkit allows researchers to directly rank potential anti-CRISPR candidate genes for increased speed in testing and validation of new anti-CRISPRs. A web server implementation for AcRanker is available online at http://acranker.pythonanywhere.com/.},
doi = {10.1093/nar/gkaa219},
journal = {Nucleic Acids Research},
number = 9,
volume = 48,
place = {United States},
year = {2020},
month = apr
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Figures / Tables:

Table 1: Results for leave-one-out cross-validation

Works referencing / citing this record:

PaCRISPR: a server for predicting and visualizing anti-CRISPR proteins
journal, May 2020

  • Wang, Jiawei; Dai, Wei; Li, Jiahui
  • Nucleic Acids Research, Vol. 48, Issue W1
  • DOI: 10.1093/nar/gkaa432

Figures/Tables have been extracted from DOE-funded journal article accepted manuscripts.