DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Iterative random forests to discover predictive and stable high-order interactions

Journal Article · Proceedings of the National Academy of Sciences of the United States of America
 [1];  [2];  [3];  [4]
  1. Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853; Department of Statistical Science, Cornell University, Ithaca, NY 14853; Data Driven Decisions Department, Preminon LLC, Antioch, CA 94531
  2. Statistics Department, University of California, Berkeley, CA 94720
  3. Data Driven Decisions Department, Preminon LLC, Antioch, CA 94531; Statistics Department, University of California, Berkeley, CA 94720; Centre for Computational Biology, School of Biosciences, University of Birmingham, Edgbaston B15 2TT, United Kingdom; Molecular Ecosystems Biology Department, Biosciences Area, Lawrence Berkeley National Laboratory, Berkeley, CA 94720
  4. Data Driven Decisions Department, Preminon LLC, Antioch, CA 94531; Statistics Department, University of California, Berkeley, CA 94720; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720

Significance: We developed a predictive, stable, and interpretable tool: the iterative random forest algorithm (iRF). iRF discovers high-order interactions among biomolecules with the same order of computational cost as random forests. We demonstrate the efficacy of iRF by finding known and promising interactions among biomolecules, of up to fifth and sixth order, in two data examples in transcriptional regulation and alternative splicing.

Sponsoring Organization:
USDOE
Grant/Contract Number:
SC0017069
OSTI ID:
1417528
Journal Information:
Journal Name: Proceedings of the National Academy of Sciences of the United States of America; Journal Volume: 115; Journal Issue: 8; ISSN 0027-8424
Publisher:
Proceedings of the National Academy of Sciences
Country of Publication:
United States
Language:
English
