DOE PAGES

Title: Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks

Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. In conclusion, the performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.
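The stability-driven choice of the number of patterns described in the abstract can be illustrated with a minimal sketch. This is not the authors' staNMF code, and the abstract does not specify the criterion, so everything here is an assumption made for illustration: the function names (`nmf`, `dissimilarity`, `instability`), the plain multiplicative-update NMF solver, and the Amari-type dissimilarity computed from cosine similarities between dictionary columns. The idea shown is only the general one: fit NMF several times from different random starts for a given K, and score K by how much the learned dictionaries disagree.

```python
import numpy as np


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate nonnegative X (features x samples) as W @ H.

    Uses basic multiplicative updates; an illustrative solver only.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def dissimilarity(W1, W2):
    """Amari-type distance between two dictionaries with the same K.

    It is 0 when the columns agree up to permutation and positive scaling.
    """
    k = W1.shape[1]
    # Normalize columns so that C holds pairwise cosine similarities.
    A = W1 / (np.linalg.norm(W1, axis=0, keepdims=True) + 1e-12)
    B = W2 / (np.linalg.norm(W2, axis=0, keepdims=True) + 1e-12)
    C = A.T @ B
    return (2 * k - C.max(axis=0).sum() - C.max(axis=1).sum()) / (2 * k)


def instability(X, k, n_runs=5):
    """Mean pairwise dissimilarity of dictionaries from n_runs random starts."""
    dicts = [nmf(X, k, seed=s)[0] for s in range(n_runs)]
    scores = [
        dissimilarity(dicts[i], dicts[j])
        for i in range(n_runs)
        for j in range(i + 1, n_runs)
    ]
    return float(np.mean(scores))
```

Under these assumptions, one would evaluate `instability(X, k)` over a range of candidate `k` values and prefer a value where the score is low, since the learned patterns are then reproducible across random restarts.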
Authors:
 Wu, Siqi [1]; Joseph, Antony [2]; Hammonds, Ann S. [3]; Celniker, Susan E. [3]; Yu, Bin [4]; Frise, Erwin [3]
  1. Univ. of California, Berkeley, CA (United States). Dept. of Statistics; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Division of Environmental Genomics and Systems Biology
  2. Univ. of California, Berkeley, CA (United States). Dept. of Statistics; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Division of Environmental Genomics and Systems Biology; Walmart Labs, San Bruno, CA (United States)
  3. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Division of Environmental Genomics and Systems Biology
  4. Univ. of California, Berkeley, CA (United States). Dept. of Statistics; Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences
Publication Date:
Grant/Contract Number:
AC02-05CH11231; CCF-0939370; R01 GM076655; R01 GM097231; 1U01HG007031-01
Type:
Accepted Manuscript
Journal Name:
Proceedings of the National Academy of Sciences of the United States of America
Additional Journal Information:
Journal Volume: 113; Journal Issue: 16; Journal ID: ISSN 0027-8424
Publisher:
National Academy of Sciences, Washington, DC (United States)
Research Org:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org:
USDOE Office of Science (SC); National Science Foundation (NSF); US Air Force Office of Scientific Research (AFOSR); National Institutes of Health (NIH)
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; principal patterns; stability selection; sparse decomposition; spatial gene expression; spatially local networks
OSTI Identifier:
1379291

Wu, Siqi, Joseph, Antony, Hammonds, Ann S., Celniker, Susan E., Yu, Bin, and Frise, Erwin. Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks. United States: N. p., 2016. Web. doi:10.1073/pnas.1521171113.
Wu, Siqi, Joseph, Antony, Hammonds, Ann S., Celniker, Susan E., Yu, Bin, & Frise, Erwin. Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks. United States. doi:10.1073/pnas.1521171113.
Wu, Siqi, Joseph, Antony, Hammonds, Ann S., Celniker, Susan E., Yu, Bin, and Frise, Erwin. 2016. "Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks". United States. doi:10.1073/pnas.1521171113. https://www.osti.gov/servlets/purl/1379291.
@article{osti_1379291,
title = {Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks},
author = {Wu, Siqi and Joseph, Antony and Hammonds, Ann S. and Celniker, Susan E. and Yu, Bin and Frise, Erwin},
abstractNote = {Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. In conclusion, the performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.},
doi = {10.1073/pnas.1521171113},
journal = {Proceedings of the National Academy of Sciences of the United States of America},
number = 16,
volume = 113,
place = {United States},
year = {2016},
month = {4}
}