U.S. Department of Energy
Office of Scientific and Technical Information

Atlas of Transcription Factor Binding Sites from ENCODE DNase Hypersensitivity Data across 27 Tissue Types

Journal Article · Cell Reports
  1. Institute for Systems Biology, Seattle, WA (United States)
  2. Univ. of Maryland School of Medicine, Baltimore, MD (United States)
  3. Univ. of Chicago, IL (United States)
  4. Mayo Clinic, Jacksonville, FL (United States)
  5. Univ. of Southern California, Los Angeles, CA (United States)
  6. Univ. of Chicago, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States)
Characterizing the tissue-specific binding sites of transcription factors (TFs) is essential to reconstruct gene regulatory networks and predict functions for non-coding genetic variation. DNase-seq footprinting enables the prediction of genome-wide binding sites for hundreds of TFs simultaneously. Despite the public availability of high-quality DNase-seq data from hundreds of samples, a comprehensive, up-to-date resource for the locations of genomic footprints is lacking. Here, we develop a scalable footprinting workflow using two state-of-the-art algorithms: Wellington and HINT. We apply our workflow to detect footprints in 192 ENCODE DNase-seq experiments and predict the genomic occupancy of 1,515 human TFs in 27 human tissues. We validate that these footprints overlap true-positive TF binding sites from ChIP-seq. We demonstrate that the locations, depth, and tissue specificity of footprints predict effects of genetic variants on gene expression and capture a substantial proportion of genetic risk for complex traits.
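The ChIP-seq validation step described above amounts to an interval-overlap computation. The following is a minimal sketch of that idea, not the authors' workflow: it assumes footprints and ChIP-seq peaks are given as half-open `(chrom, start, end)` tuples, and the function names and toy coordinates are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(peaks):
    """Group peaks by chromosome, then sort and merge overlapping or touching ones."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        # Keep the start positions separately for binary search.
        index[chrom] = ([s for s, _ in merged], merged)
    return index


def overlaps_peak(index, chrom, start, end):
    """Return True if the half-open interval [start, end) overlaps any peak."""
    if chrom not in index:
        return False
    starts, merged = index[chrom]
    # Last merged peak that starts before the footprint ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and merged[i][1] > start


def fraction_overlapping(footprints, peaks):
    """Fraction of footprints that overlap at least one peak."""
    if not footprints:
        return 0.0
    index = build_index(peaks)
    hits = sum(overlaps_peak(index, c, s, e) for c, s, e in footprints)
    return hits / len(footprints)


if __name__ == "__main__":
    # Toy coordinates for illustration only, not data from the study.
    peaks = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
    footprints = [("chr1", 190, 210), ("chr1", 300, 320), ("chr2", 79, 90)]
    print(fraction_overlapping(footprints, peaks))
```

Merging the peaks first keeps each lookup to a single binary search. A real analysis would more likely read BED files and use an established genomic-interval library.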
Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
National Human Genome Research Institute (NHGRI); National Institute of General Medical Sciences (NIGMS); National Institute of Mental Health (NIMH); National Institute on Aging (NIA); National Institutes of Health (NIH); USDOE
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1774294
Journal Information:
Cell Reports, Vol. 32, Issue 7; ISSN 2211-1247
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

Cited By (2)

Reproducible Big Data Science: A Case Study In Continuous Fairness text January 2018
