Title: Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks

Journal Article · Proceedings of the National Academy of Sciences of the United States of America
 [1];  [2];  [3];  [3];  [4];  [3]
  1. Univ. of California, Berkeley, CA (United States). Dept. of Statistics; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Division of Environmental Genomics and Systems Biology
  2. Univ. of California, Berkeley, CA (United States). Dept. of Statistics; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Division of Environmental Genomics and Systems Biology; Walmart Labs, San Bruno, CA (United States)
  3. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Division of Environmental Genomics and Systems Biology
  4. Univ. of California, Berkeley, CA (United States). Dept. of Statistics; Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences

Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. In conclusion, the performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.
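The idea of stability-driven model selection can be illustrated with a small sketch. The abstract does not give the algorithm, so everything below is an assumption made for illustration rather than the authors' implementation: NMF is fit with plain Lee-Seung multiplicative updates, stability is measured by an Amari-type dissimilarity between the dictionaries obtained from different random initializations, and cosine similarity stands in for whatever similarity measure staNMF actually uses. The function names `nmf`, `dissimilarity`, and `instability` are hypothetical.

```python
import numpy as np


def nmf(X, k, rng, n_iter=200, eps=1e-9):
    """Approximate X (n x m, nonnegative) as D @ A with D (n x k), A (k x m).

    Uses Lee-Seung multiplicative updates for the squared-error loss.
    This is an illustrative solver, not the scalable one the paper refers to.
    """
    n, m = X.shape
    D = rng.random((n, k)) + eps
    A = rng.random((k, m)) + eps
    for _ in range(n_iter):
        A *= (D.T @ X) / (D.T @ D @ A + eps)
        D *= (X @ A.T) / (D @ A @ A.T + eps)
    return D, A


def dissimilarity(D1, D2):
    """Amari-type distance between two dictionaries with the same number of columns.

    It is 0 when the columns agree up to reordering and positive rescaling,
    and grows toward 1 as the two sets of columns stop matching.
    """
    k = D1.shape[1]
    U = D1 / (np.linalg.norm(D1, axis=0) + 1e-12)
    V = D2 / (np.linalg.norm(D2, axis=0) + 1e-12)
    C = np.abs(U.T @ V)  # k x k cosine similarities (assumed similarity measure)
    return (2 * k - C.max(axis=0).sum() - C.max(axis=1).sum()) / (2 * k)


def instability(X, k, n_runs=5, seed=0):
    """Average pairwise dissimilarity of dictionaries from n_runs random starts."""
    rng = np.random.default_rng(seed)
    dicts = [nmf(X, k, rng)[0] for _ in range(n_runs)]
    scores = [
        dissimilarity(dicts[i], dicts[j])
        for i in range(n_runs)
        for j in range(i + 1, n_runs)
    ]
    return float(np.mean(scores))


if __name__ == "__main__":
    # Synthetic nonnegative data with 3 underlying patterns (made up for the demo).
    rng = np.random.default_rng(42)
    X = rng.random((50, 3)) @ rng.random((3, 80))
    for k in range(2, 6):
        print(k, round(instability(X, k), 4))
```

Under these assumptions, one would compute `instability(X, k)` over a range of candidate `k` and prefer a value where the score is low, on the reasoning that a factorization that reappears across random restarts is more likely to reflect structure in the data than an artifact of initialization.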

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC); National Science Foundation (NSF); US Air Force Office of Scientific Research (AFOSR); National Institutes of Health (NIH)
Grant/Contract Number:
AC02-05CH11231; CCF-0939370; R01 GM076655; R01 GM097231; 1U01HG007031-01
OSTI ID:
1379291
Journal Information:
Proceedings of the National Academy of Sciences of the United States of America, Vol. 113, Issue 16; ISSN 0027-8424
Publisher:
National Academy of Sciences, Washington, DC (United States)
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 77 works
Citation information provided by Web of Science

Cited By (22)

SUSTain: Scalable Unsupervised Scoring for Tensors and its Application to Phenotyping conference August 2018
  • Perros, Ioakeim; Papalexakis, Evangelos E.; Park, Haesun
  • KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, https://doi.org/10.1145/3219819.3219999
Distributed non-negative matrix factorization with determination of the number of latent features journal February 2020
Compressing gene expression data using multiple latent space dimensionalities learns complementary biological representations journal May 2020
AnnoFly: annotating Drosophila embryonic images based on an attention-enhanced RNN model journal January 2019
Latent factor modelling of scRNA-seq data uncovers novel pathways dysregulated in cell subsets of autoimmune disease patients journal November 2019
Seizure pathways change on circadian and slower timescales in individual patients with focal epilepsy journal May 2020
Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities journal September 2017
Sequential compression of gene expression across dimensionalities and methods reveals no single best method or dimensionality posted_content September 2019
ADAGE signature analysis: differential expression analysis with data-defined gene sets journal November 2017
ADAGE signature analysis: differential expression analysis with data-defined gene sets journal June 2017
Predicting gene regulatory interactions based on spatial gene expression data and deep learning journal September 2019
Definitions, methods, and applications in interpretable machine learning journal October 2019
Developmental topography of cortical thickness during infancy journal July 2019
Refining interaction search through signed iterative Random Forests preprint January 2018
Enter the Matrix: Factorization Uncovers Knowledge from Omics journal October 2018
Multiple Partial Regularized Nonnegative Matrix Factorization for Predicting Ontological Functions of lncRNAs journal January 2019
How much does your data exploration overfit? Controlling bias via information usage preprint January 2015
A Unified Joint Matrix Factorization Framework for Data Integration preprint January 2017
Diverse spatial expression patterns emerge from common transcription bursting kinetics preprint January 2017
SUSTain: Scalable Unsupervised Scoring for Tensors and its Application to Phenotyping preprint January 2018
Unique Sharp Local Minimum in $\ell_1$-minimization Complete Dictionary Learning preprint January 2019
TASTE: Temporal and Static Tensor Factorization for Phenotyping Electronic Health Records preprint January 2019