DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Exploiting regulatory heterogeneity to systematically identify enhancers with high accuracy

Journal Article · Proceedings of the National Academy of Sciences of the United States of America
 [1];  [2];  [3];  [3];  [3];  [3];  [3];  [3];  [3];  [3];  [4];  [4];  [3];  [4];  [5]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Berkeley, CA (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Berkeley, CA (United States); Cornell Univ., Ithaca, NY (United States)
  3. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  4. Univ. of California, Berkeley, CA (United States)
  5. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Berkeley, CA (United States); Univ. of Birmingham (United Kingdom)

Identifying functional enhancer elements in metazoan systems is a major challenge. Large-scale validation of enhancers predicted by ENCODE reveals false-positive rates of at least 70%. We used the pregastrula-patterning network of Drosophila melanogaster to demonstrate that loss in accuracy in held-out data results from heterogeneity of functional signatures in enhancer elements. We show that at least two classes of enhancers are active during early Drosophila embryogenesis and that by focusing on a single, relatively homogeneous class of elements, greater than 98% prediction accuracy can be achieved in a balanced, completely held-out test set. The class of well-predicted elements is composed predominantly of enhancers driving multistage segmentation patterns, which we designate segmentation-driving enhancers (SDEs). Prediction is driven by the DNA occupancy of early developmental transcription factors, with almost no additional power derived from histone modifications. We further show that improved accuracy is not a property of a particular prediction method: after conditioning on the SDE set, naïve Bayes and logistic regression perform as well as more sophisticated tools. Applying this method to a genome-wide scan, we predict 1,640 SDEs that cover 1.6% of the genome. An analysis of 32 SDEs using whole-mount embryonic imaging of stably integrated reporter constructs chosen throughout our prediction rank-list showed that >90% drove expression patterns. We achieved 86.7% precision on a genome-wide scan, with an estimated recall of at least 98%, indicating high accuracy and completeness in annotating this class of functional elements.
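As a reminder of how precision and recall figures such as those quoted in the abstract are defined, the following minimal Python sketch computes both from confusion counts. The function name `precision_recall` and the counts in the usage line are hypothetical illustrations chosen for this note; they are not code or values from the study.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative counts."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = tp / (tp + fp)  # share of predicted elements that are functional
    recall = tp / (tp + fn)     # share of functional elements that were predicted
    return precision, recall


# Hypothetical counts, for illustration only (not taken from the study).
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.2f}, recall={r:.2f}")
```

Precision penalizes false-positive predictions, while recall penalizes functional elements that the scan misses, which is why the abstract reports the two separately.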

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1559177
Journal Information:
Proceedings of the National Academy of Sciences of the United States of America, Vol. 116, Issue 3; ISSN 0027-8424
Publisher:
National Academy of Sciences, Washington, DC (United States)
Country of Publication:
United States
Language:
English

Cited By (2)

Additional file 1 of A map of cis-regulatory modules and constituent transcription factor binding sites in 80% of the mouse genome dataset January 2022
How to study enhancers in non-traditional insect models journal February 2020