DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Toward Guided Mutagenesis: Gaussian Process Regression Predicts MHC Class II Antigen Mutant Binding

Journal Article · Journal of Chemical Information and Modeling
  1. Frederick National Laboratory for Cancer Research, Frederick, MD (United States). Advanced Biomedical Computational Science
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

Antigen-specific immunotherapies (ASI) require successful loading and presentation of antigen peptides into the major histocompatibility complex (MHC) binding cleft. One route of ASI design is to mutate native antigens for either stronger or weaker binding to MHC. Exploring all possible mutations is costly both experimentally and computationally. To reduce this expense, we investigate the minimal amount of prior data required to accurately predict the relative binding affinity of point mutations for peptide-MHC class II (pMHCII) binding. Using data from different residue subsets, we interpolate pMHCII mutant binding affinities by Gaussian process (GP) regression on residue volume and hydrophobicity. We apply GP regression to an experimental data set from the Immune Epitope Database and to theoretical data sets from NetMHCIIpan and free energy perturbation calculations. We find that GP regression can predict binding affinities of nine neutral residues from a six-residue subset with an average coefficient of determination (R²) of 0.62 ± 0.04 (±95% CI), an average error of 0.09 ± 0.01 kcal/mol (±95% CI), and a receiver operating characteristic (ROC) AUC of 0.92 for binary classification of enhanced or diminished binding affinity. These metrics improve to an R² of 0.69 ± 0.04, an average error of 0.07 ± 0.01 kcal/mol, and an ROC AUC of 0.94 when predicting seven neutral residues from an eight-residue subset. Prediction is most accurate for neutral residues at anchor residue sites without register shift. This work is relevant to predicting pMHCII binding and accelerating ASI design.
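The interpolation idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It fits a zero-mean GP with a squared-exponential kernel to six residues described by (volume, hydrophobicity) and predicts the remaining nine. Everything specific here is an assumption: the descriptor values are approximate textbook numbers (side-chain volume in cubic angstroms, Kyte-Doolittle hydropathy), the 15-residue set and the six-residue training subset are hypothetical choices, the kernel hyperparameters are fixed rather than fitted, and the training targets come from a made-up function rather than from any binding data.

```python
import numpy as np

# Approximate descriptors (assumed): (volume in cubic angstroms, hydropathy).
# The scales actually used in the paper are not given in the abstract.
DESCRIPTORS = {
    "A": (88.6, 1.8), "N": (114.1, -3.5), "C": (108.5, 2.5), "Q": (143.8, -3.5),
    "G": (60.1, -0.4), "I": (166.7, 4.5), "L": (166.7, 3.8), "M": (162.9, 1.9),
    "F": (189.9, 2.8), "P": (112.7, -1.6), "S": (89.0, -0.8), "T": (116.1, -0.7),
    "W": (227.8, -0.9), "Y": (193.6, -1.3), "V": (140.0, 4.2),
}

def features(residues):
    """Return standardized (volume, hydropathy) rows for the given residues."""
    all_x = np.array(list(DESCRIPTORS.values()), dtype=float)
    x = np.array([DESCRIPTORS[r] for r in residues], dtype=float)
    return (x - all_x.mean(axis=0)) / all_x.std(axis=0)

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and standard deviation of the latent function."""
    y_mean = y_train.mean()  # center targets for the zero-mean prior
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_test, x_train)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = k_s @ alpha + y_mean
    v = np.linalg.solve(chol, k_s.T)
    var = np.diag(rbf_kernel(x_test, x_test)) - (v**2).sum(axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def synthetic_ddg(residues):
    """Made-up smooth target standing in for relative binding free energies."""
    x = features(residues)
    return 0.3 * np.tanh(x[:, 0]) - 0.2 * x[:, 1]

train = ["A", "G", "L", "F", "S", "W"]             # hypothetical six-residue subset
test = [r for r in DESCRIPTORS if r not in train]  # remaining nine residues

y_train = synthetic_ddg(train)
mean, std = gp_predict(features(train), y_train, features(test))
_, std_train = gp_predict(features(train), y_train, features(train))

# Binary call on the predicted sign; treating negative values as enhanced
# binding is an assumed convention.
predicted_enhanced = mean < 0

for residue, m, s in zip(test, mean, std):
    print(f"{residue}: predicted {m:+.2f} +/- {s:.2f} (synthetic units)")
```

The predictive standard deviation grows for residues whose descriptors lie far from the training subset, which is the kind of signal one could use to decide which mutant to measure next. In practice the length scale, signal variance, and noise level would be fitted to data rather than fixed as they are here.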

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1814311
Journal Information:
Journal of Chemical Information and Modeling, Vol. TBD, Issue TBD; ISSN 1549-9596
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
