DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Optimal adjustment sets for causal query estimation in partially observed biomolecular networks

Journal Article · Bioinformatics

Abstract:
Causal query estimation in biomolecular networks commonly selects a ‘valid adjustment set’, i.e. a subset of network variables that eliminates the bias of the estimator. The same query may have multiple valid adjustment sets, each yielding a different estimator variance. When networks are partially observed, current methods use graph-based criteria to find an adjustment set that minimizes asymptotic variance. Unfortunately, many models that share the same graph topology, and therefore the same functional dependencies, may differ in the processes that generate the observational data. In these cases, the topology-based criteria fail to distinguish the variances of the adjustment sets. This deficiency can lead to sub-optimal adjustment sets and to mischaracterization of the effect of the intervention. We propose an approach for deriving ‘optimal adjustment sets’ that takes into account the nature of the data, the bias and finite-sample variance of the estimator, and cost. It empirically learns the data-generating processes from historical experimental data and characterizes the properties of the estimators by simulation. We demonstrate the utility of the proposed approach in four biomolecular case studies with different topologies and different data-generating processes. The implementation and reproducible case studies are at https://github.com/srtaheri/OptimalAdjustmentSet.
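To illustrate the idea that several valid adjustment sets can give unbiased estimators with different finite-sample variances, the following is a minimal sketch in Python. It is not the authors' implementation (see the repository linked in the abstract for that). The toy network (C confounds X and Y, I affects only X, P affects only Y), the linear-Gaussian data-generating process, every coefficient, the sample size, and the function names are assumptions made purely for illustration. The sketch also assumes the data-generating process is known, whereas the abstract describes learning it from historical experimental data.

```python
import numpy as np

TRUE_EFFECT = 2.0  # assumed causal effect of X on Y in this toy model


def simulate(n, rng):
    """Draw n samples from a hypothetical linear-Gaussian network."""
    c = rng.normal(size=n)  # confounder: C -> X and C -> Y
    i = rng.normal(size=n)  # affects the treatment only: I -> X
    p = rng.normal(size=n)  # affects the outcome only: P -> Y
    x = 1.0 * c + 1.5 * i + rng.normal(size=n)
    y = TRUE_EFFECT * x + 1.0 * c + 1.5 * p + rng.normal(size=n)
    return {"C": c, "I": i, "P": p, "X": x, "Y": y}


def estimate_effect(data, adjustment_set):
    """OLS coefficient of X when regressing Y on X plus the adjustment set."""
    columns = [np.ones_like(data["X"]), data["X"]]
    columns += [data[name] for name in adjustment_set]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, data["Y"], rcond=None)
    return coef[1]  # column 0 is the intercept, column 1 is X


def characterize(adjustment_set, n=50, replicates=2000, seed=0):
    """Estimate bias and finite-sample variance by repeated simulation."""
    rng = np.random.default_rng(seed)
    estimates = np.array(
        [estimate_effect(simulate(n, rng), adjustment_set) for _ in range(replicates)]
    )
    return {
        "bias": float(estimates.mean() - TRUE_EFFECT),
        "variance": float(estimates.var(ddof=1)),
    }


candidate_sets = {
    "none": [],          # not valid: C is left unadjusted
    "C": ["C"],          # valid
    "C,P": ["C", "P"],   # valid, adds a predictor of the outcome
    "C,I": ["C", "I"],   # valid, adds a predictor of the treatment
}
results = {name: characterize(s) for name, s in candidate_sets.items()}

for name, summary in results.items():
    print(f"{name:>5}: bias={summary['bias']:+.3f}  variance={summary['variance']:.4f}")
```

In this toy setting, the three sets that contain C are all valid, yet adding the outcome predictor P reduces the variance of the estimator while adding the treatment predictor I inflates it. The empty set is included only to show the bias that arises when the confounder is ignored.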

Research Organization:
Univ. of Washington, Seattle, WA (United States)
Sponsoring Organization:
Defense Advanced Research Projects Agency (DARPA); National Institutes of Health (NIH); USDOE; USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
SC0023091
OSTI ID:
1987738
Journal Information:
Bioinformatics, Vol. 39, Issue Supplement_1; ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United Kingdom
Language:
English
