Title: Prior knowledge driven Granger causality analysis on gene regulatory network discovery

Abstract

Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, in biological datasets the number of available time points (T) is usually much smaller than the number of target genes (n). The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n >> T. In this study, we proposed a new method, CGC-2SPR (CGC using two-step prior Ridge regularization), to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed CGC-2SPR method showed a significant improvement in accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. In our research, we noticed a “1+1>2” effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (whose functions were previously unknown) might be related to the yeast’s responses to different levels of glucose. In conclusion, our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction of systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate edge significances, which give statistical meaning to the discovered causality networks. All of our data and source code will be available at https://bitbucket.org/dtyu/granger-causality/wiki/Home.
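
The abstract does not spell out the two steps of CGC-2SPR, so the following is only a minimal sketch of the general idea it names: a lag-1 conditional Granger model fitted by ridge regression, in which edges supported by prior knowledge receive a lighter penalty. The function names, the penalty values `lam_prior` and `lam_other`, and the toy data are all assumptions made for illustration and are not taken from the authors' code.

```python
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def prior_ridge_gc(series, target, prior, lam_prior=0.1, lam_other=10.0):
    """Lag-1 ridge coefficients of every gene on one target gene.

    series: list of time points, each a list of n expression values.
    prior:  set of source-gene indices with prior support (hypothetical input).
    Prior-supported sources get the small penalty lam_prior, all others lam_other.
    """
    n = len(series[0])
    X = series[:-1]                            # all genes at time t-1
    y = [row[target] for row in series[1:]]    # target gene at time t
    # Normal equations (X^T X + diag(lambda)) beta = X^T y
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    for i in range(n):
        A[i][i] += lam_prior if i in prior else lam_other
    b = [sum(x[i] * yt for x, yt in zip(X, y)) for i in range(n)]
    return solve(A, b)


def demo(n_genes=30, n_time=12, seed=0):
    """Toy data with more genes than time points; gene 0 drives gene 1 at lag 1."""
    rng = random.Random(seed)
    series = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_time)]
    for t in range(1, n_time):
        series[t][1] = 0.9 * series[t - 1][0] + rng.gauss(0.0, 0.05)
    return prior_ridge_gc(series, target=1, prior={0})


coef = demo()
strongest = max(range(len(coef)), key=lambda i: abs(coef[i]))
print("strongest candidate regulator of gene 1: gene", strongest)
```

The per-coefficient penalty keeps the normal equations solvable even when the number of genes exceeds the number of time points, which is the regime the abstract describes. A significance step such as MCSE would have to be added on top of these raw coefficients; it is not sketched here because the abstract gives no detail on it.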

Authors:
 Yao, Shun [1]; Yoo, Shinjae [2]; Yu, Dantong [2]
  1. Stony Brook Univ., NY (United States); Brookhaven National Lab. (BNL), Upton, NY (United States)
  2. Brookhaven National Lab. (BNL), Upton, NY (United States)
Publication Date:
2015-08-28
Research Org.:
Brookhaven National Laboratory (BNL), Upton, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1259268
Resource Type:
Accepted Manuscript
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Volume: 16; Journal Issue: 1; Journal ID: ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES; Time series; Gene expression data; Granger causality; Gene regulatory networks

Citation Formats

Yao, Shun, Yoo, Shinjae, and Yu, Dantong. Prior knowledge driven Granger causality analysis on gene regulatory network discovery. United States: N. p., 2015. Web. doi:10.1186/s12859-015-0710-1.
Yao, Shun, Yoo, Shinjae, & Yu, Dantong. Prior knowledge driven Granger causality analysis on gene regulatory network discovery. United States. https://doi.org/10.1186/s12859-015-0710-1
Yao, Shun, Yoo, Shinjae, and Yu, Dantong. 2015. "Prior knowledge driven Granger causality analysis on gene regulatory network discovery". United States. https://doi.org/10.1186/s12859-015-0710-1. https://www.osti.gov/servlets/purl/1259268.
@article{osti_1259268,
title = {Prior knowledge driven Granger causality analysis on gene regulatory network discovery},
author = {Yao, Shun and Yoo, Shinjae and Yu, Dantong},
abstractNote = {Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, in biological datasets the number of available time points (T) is usually much smaller than the number of target genes (n). The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n >> T. In this study, we proposed a new method, CGC-2SPR (CGC using two-step prior Ridge regularization), to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed CGC-2SPR method showed a significant improvement in accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. In our research, we noticed a “1+1>2” effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (whose functions were previously unknown) might be related to the yeast’s responses to different levels of glucose. In conclusion, our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction of systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate edge significances, which give statistical meaning to the discovered causality networks. All of our data and source code will be available at https://bitbucket.org/dtyu/granger-causality/wiki/Home.},
doi = {10.1186/s12859-015-0710-1},
journal = {BMC Bioinformatics},
number = 1,
volume = 16,
place = {United States},
year = {2015},
month = {8}
}

Citation Metrics:
Cited by: 12 works
Citation information provided by
Web of Science

Works referencing / citing this record:

Computational dynamic approaches for temporal omics data with applications to systems medicine
journal, June 2017


Prophetic Granger Causality to infer gene regulatory networks
journal, December 2017