DOE PAGES
U.S. Department of Energy
Office of Scientific and Technical Information

Title: FUN-PROSE: A deep learning approach to predict condition-specific gene expression in fungi

Journal Article · PLoS Computational Biology (Online)
[1]; [2]; [3]; [4]
  1. University of Illinois Urbana-Champaign, IL (United States); Carl R. Woese Institute for Genomic Biology, Urbana, IL (United States)
  2. University of Illinois Urbana-Champaign, IL (United States); Carl R. Woese Institute for Genomic Biology, Urbana, IL (United States); The Gladstone Institute of Data Science and Biotechnology, San Francisco, CA (United States)
  3. Carl R. Woese Institute for Genomic Biology, Urbana, IL (United States); University of Illinois Urbana-Champaign, IL (United States)
  4. University of Illinois Urbana-Champaign, IL (United States); Carl R. Woese Institute for Genomic Biology, Urbana, IL (United States); Argonne National Laboratory (ANL), Argonne, IL (United States)

The mRNA levels of all genes in a genome are a critical piece of information defining the overall state of the cell in a given environmental condition. Reconstructing such condition-specific expression in fungal genomes is particularly important for metabolically engineering these organisms to produce desired chemicals under industrially scalable conditions. Most previous deep learning approaches focused on predicting the average expression level of a gene from its promoter sequence, ignoring its variation across conditions. Here we present FUN-PROSE, a deep learning model trained to predict differential expression of individual genes across various conditions using their promoter sequences and the expression levels of all transcription factors (TFs). We train and test our model on three fungal species and obtain correlations between predicted and observed condition-specific gene expression as high as 0.85. We then interpret our model to extract promoter sequence motifs responsible for variable expression of individual genes. We also carry out input feature importance analysis to connect individual transcription factors to their gene targets. A sizeable fraction of both the sequence motifs and the TF-gene interactions learned by our model agree with previously known biological information, while the rest correspond to either novel biological facts or indirect correlations.

Research Organization:
University of Illinois Urbana-Champaign, IL (United States); Center for Advanced Bioenergy and Bioproducts Innovation (CABBI), Urbana, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
SC0018420; AC02-06-CH11357
OSTI ID:
2212841
Alternate ID(s):
OSTI ID: 2311189
Journal Information:
PLoS Computational Biology (Online), Vol. 19, Issue 11; ISSN 1553-7358
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
