DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Deciphering enhancer sequence using thermodynamics-based models and convolutional neural networks

Journal Article · Nucleic Acids Research
[1]; [2]
  1. Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  2. Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Cancer Center at Illinois, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

Abstract: Deciphering the sequence-function relationship encoded in enhancers holds the key to interpreting non-coding variants and understanding mechanisms of transcriptomic variation. Several quantitative models exist for predicting enhancer function and underlying mechanisms; however, there has been no systematic comparison of these models characterizing their relative strengths and shortcomings. Here, we interrogated a rich data set of neuroectodermal enhancers in Drosophila, representing cis- and trans- sources of expression variation, with a suite of biophysical and machine learning models. We performed rigorous comparisons of thermodynamics-based models implementing different mechanisms of activation, repression and cooperativity. Moreover, we developed a convolutional neural network (CNN) model, called CoNSEPT, that learns enhancer ‘grammar’ in an unbiased manner. CoNSEPT is the first general-purpose CNN tool for predicting enhancer function in varying conditions, such as different cell types and experimental conditions, and we show that such complex models can suggest interpretable mechanisms. We found model-based evidence for mechanisms previously established for the studied system, including cooperative activation and short-range repression. The data also favored one hypothesized activation mechanism over another and suggested an intriguing role for a direct, distance-independent repression mechanism. Our modeling shows that while fundamentally different models can yield similar fits to data, they vary in their utility for mechanistic inference. CoNSEPT is freely available at: https://github.com/PayamDiba/CoNSEPT.

Sponsoring Organization:
USDOE
Grant/Contract Number:
SC0018420
OSTI ID:
1819777
Journal Information:
Journal Name: Nucleic Acids Research; Journal Volume: 49; Journal Issue: 18; ISSN 0305-1048
Publisher:
Oxford University Press
Country of Publication:
United Kingdom
Language:
English
