DOE PAGES: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Disentangling multidimensional spatio-temporal data into their common and aberrant responses

Abstract

With the advent of high-throughput measurement techniques, scientists and engineers are starting to grapple with massive data sets and encountering challenges with how to organize, process and extract information into meaningful structures. Multidimensional spatio-temporal biological data sets such as time series gene expression with various perturbations over different cell lines, or neural spike trains across many experimental trials, have the potential to acquire insight about the dynamic behavior of the system. For this potential to be realized, we need a suitable representation to understand the data. A general question is how to organize the observed data into meaningful structures and how to find an appropriate similarity measure. A natural way of viewing these complex high dimensional data sets is to examine and analyze the large-scale features and then to focus on the interesting details. Since the wide range of experiments and unknown complexity of the underlying system contribute to the heterogeneity of biological data, we develop a new method by proposing an extension of Robust Principal Component Analysis (RPCA), which models common variations across multiple experiments as the low-rank component and anomalies across these experiments as the sparse component. We show that the proposed method is able to find distinct subtypes and classify data sets in a robust way without any prior knowledge by separating these common responses and abnormal responses. Thus, the proposed method provides us a new representation of these data sets which has the potential to help users acquire new insight from data.
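
The low-rank plus sparse split named in the abstract can be illustrated with standard RPCA (principal component pursuit solved by an ADMM-style iteration). This is a minimal sketch of the baseline technique only, not the extension proposed in the paper. The function names, the synthetic data, and the default values of `lam` and `mu` are illustrative assumptions; the defaults follow choices commonly used in the RPCA literature.

```python
import numpy as np


def soft_threshold(x, tau):
    """Shrink every entry of x toward zero by tau (elementwise)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Shrink the singular values of x by tau, which lowers its rank."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca(m, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Split m into low + sparse by minimizing ||low||_* + lam * ||sparse||_1."""
    m = np.asarray(m, dtype=float)
    n1, n2 = m.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))        # assumed default weight
    if mu is None:
        mu = n1 * n2 / (4.0 * np.abs(m).sum())  # assumed default step size
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    norm_m = np.linalg.norm(m)
    for _ in range(max_iter):
        # Update the common (low-rank) part, then the aberrant (sparse) part.
        low = svd_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + dual / mu, lam / mu)
        resid = m - low - sparse
        dual += mu * resid
        if np.linalg.norm(resid) <= tol * norm_m:
            break
    return low, sparse


# Synthetic example: a rank-2 "common response" plus a few large anomalies.
rng = np.random.default_rng(0)
common = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
aberrant = np.zeros((40, 30))
idx = rng.choice(40 * 30, size=60, replace=False)
aberrant.flat[idx] = rng.choice([-10.0, 10.0], size=60)

low, sparse = rpca(common + aberrant)
rel_err = np.linalg.norm(low - common) / np.linalg.norm(common)
print("relative error of recovered common component:", rel_err)
```

In this toy setting the rows and columns stand in for measurements and experiments; how the paper arranges multiple experiments into a matrix is not specified in this record, so that mapping is left open here.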

Authors:
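Chang, Young Hwan [1]; Korkola, James [2]; Amin, Dhara N. [3]; Moasser, Mark M. [3]; Carmena, Jose M. [1]; Gray, Joe W. [2]; Tomlin, Claire J. [4]; Lisacek, Frederique [5]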
  1. Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences.
  2. Oregon Health and Science Univ., Portland, OR (United States). Dept. of Biomedical Engineering and the Center for Spatial Systems Biomedicine.
  3. Univ. of California, San Francisco, CA (United States). Dept. of Medicine.
  4. Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Life Sciences Division.
  5. Swiss Institute of Bioinformatics (Switzerland)
Publication Date:
2015-04-22
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE; National Cancer Institute (NCI)
OSTI Identifier:
1212472
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
PLoS ONE
Additional Journal Information:
Journal Volume: 10; Journal Issue: 4; Journal ID: ISSN 1932-6203
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; gene expression; gene regulatory networks; genetic networks; neurons; dynamic response; action potentials; signaling networks; breast cancer

Citation Formats

Chang, Young Hwan, Korkola, James, Amin, Dhara N., Moasser, Mark M., Carmena, Jose M., Gray, Joe W., Tomlin, Claire J., and Lisacek, Frederique. Disentangling multidimensional spatio-temporal data into their common and aberrant responses. United States: N. p., 2015. Web. doi:10.1371/journal.pone.0121607.
Chang, Young Hwan, Korkola, James, Amin, Dhara N., Moasser, Mark M., Carmena, Jose M., Gray, Joe W., Tomlin, Claire J., & Lisacek, Frederique. Disentangling multidimensional spatio-temporal data into their common and aberrant responses. United States. https://doi.org/10.1371/journal.pone.0121607
Chang, Young Hwan, Korkola, James, Amin, Dhara N., Moasser, Mark M., Carmena, Jose M., Gray, Joe W., Tomlin, Claire J., and Lisacek, Frederique. 2015. "Disentangling multidimensional spatio-temporal data into their common and aberrant responses". United States. https://doi.org/10.1371/journal.pone.0121607. https://www.osti.gov/servlets/purl/1212472.
@article{osti_1212472,
title = {Disentangling multidimensional spatio-temporal data into their common and aberrant responses},
author = {Chang, Young Hwan and Korkola, James and Amin, Dhara N. and Moasser, Mark M. and Carmena, Jose M. and Gray, Joe W. and Tomlin, Claire J. and Lisacek, Frederique},
abstractNote = {With the advent of high-throughput measurement techniques, scientists and engineers are starting to grapple with massive data sets and encountering challenges with how to organize, process and extract information into meaningful structures. Multidimensional spatio-temporal biological data sets such as time series gene expression with various perturbations over different cell lines, or neural spike trains across many experimental trials, have the potential to acquire insight about the dynamic behavior of the system. For this potential to be realized, we need a suitable representation to understand the data. A general question is how to organize the observed data into meaningful structures and how to find an appropriate similarity measure. A natural way of viewing these complex high dimensional data sets is to examine and analyze the large-scale features and then to focus on the interesting details. Since the wide range of experiments and unknown complexity of the underlying system contribute to the heterogeneity of biological data, we develop a new method by proposing an extension of Robust Principal Component Analysis (RPCA), which models common variations across multiple experiments as the low-rank component and anomalies across these experiments as the sparse component. We show that the proposed method is able to find distinct subtypes and classify data sets in a robust way without any prior knowledge by separating these common responses and abnormal responses. Thus, the proposed method provides us a new representation of these data sets which has the potential to help users acquire new insight from data.},
doi = {10.1371/journal.pone.0121607},
journal = {PLoS ONE},
number = 4,
volume = 10,
place = {United States},
year = {2015},
month = {4}
}

Works referenced in this record:

PHLPP: A Phosphatase that Directly Dephosphorylates Akt, Promotes Apoptosis, and Suppresses Tumor Growth
journal, April 2005


Inferring cluster-based networks from differently stimulated multiple time-course gene expression data
journal, March 2010


Support for a synaptic chain model of neuronal sequence generation
journal, October 2010

  • Long, Michael A.; Jin, Dezhe Z.; Fee, Michale S.
  • Nature, Vol. 468, Issue 7322
  • DOI: 10.1038/nature09514

Resiliency and Vulnerability in the HER2-HER3 Tumorigenic Driver
journal, January 2010


Integrated Module and Gene-Specific Regulatory Inference Implicates Upstream Signaling Networks
journal, October 2013


PHLiPPing the switch on Akt and protein kinase C signaling
journal, August 2008


Rare cancer-specific mutations in PIK3CA show gain of function
journal, March 2007

  • Gymnopoulos, M.; Elsliger, M. -A.; Vogt, P. K.
  • Proceedings of the National Academy of Sciences, Vol. 104, Issue 13
  • DOI: 10.1073/pnas.0701005104

Reducing High-Dimensional Data by Principal Component Analysis vs. Random Projection for Nearest Neighbor Classification
conference, December 2006

  • Deegalla, Sampath; Bostrom, Henrik
  • 2006 5th International Conference on Machine Learning and Applications (ICMLA'06)
  • DOI: 10.1109/icmla.2006.43

DEPTOR Is an mTOR Inhibitor Frequently Overexpressed in Multiple Myeloma Cells and Required for Their Survival
journal, May 2009


Subtype and pathway specific responses to anticancer compounds in breast cancer
journal, October 2011

  • Heiser, L. M.; Sadanandam, A.; Kuo, W. -L.
  • Proceedings of the National Academy of Sciences, Vol. 109, Issue 8
  • DOI: 10.1073/pnas.1018854108

Temporal Complexity and Heterogeneity of Single-Neuron Activity in Premotor and Motor Cortex
journal, June 2007

  • Churchland, Mark M.; Shenoy, Krishna V.
  • Journal of Neurophysiology, Vol. 97, Issue 6
  • DOI: 10.1152/jn.00095.2007

Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data
journal, May 2003

  • Segal, Eran; Shapira, Michael; Regev, Aviv
  • Nature Genetics, Vol. 34, Issue 2
  • DOI: 10.1038/ng1165

Competitive Hebbian learning through spike-timing-dependent synaptic plasticity
journal, September 2000

  • Song, Sen; Miller, Kenneth D.; Abbott, L. F.
  • Nature Neuroscience, Vol. 3, Issue 9
  • DOI: 10.1038/78829

Cluster analysis and display of genome-wide expression patterns
journal, December 1998

  • Eisen, M. B.; Spellman, P. T.; Brown, P. O.
  • Proceedings of the National Academy of Sciences, Vol. 95, Issue 25
  • DOI: 10.1073/pnas.95.25.14863

Analysis of Time-Series Gene Expression Data: Methods, Challenges, and Opportunities
journal, August 2007


The big challenges of big data
journal, June 2013


A Technical Assessment of the Utility of Reverse Phase Protein Arrays for the Study of the Functional Proteome in Non-microdissected Human Breast Cancers
journal, October 2010

  • Hennessy, Bryan T.; Lu, Yiling; Gonzalez-Angulo, Ana Maria
  • Clinical Proteomics, Vol. 6, Issue 4
  • DOI: 10.1007/s12014-010-9055-y

A neuronal learning rule for sub-millisecond temporal coding
journal, September 1996

  • Gerstner, Wulfram; Kempter, Richard; van Hemmen, J. Leo
  • Nature, Vol. 383, Issue 6595
  • DOI: 10.1038/383076a0

Neural population dynamics during reaching
journal, June 2012

  • Churchland, Mark M.; Cunningham, John P.; Kaufman, Matthew T.
  • Nature, Vol. 487, Issue 7405
  • DOI: 10.1038/nature11129

Robust principal component analysis?
journal, May 2011


Accelerated low-rank visual recovery by random projection
conference, June 2011


Bilateral random projections
conference, July 2012

  • Zhou, Tianyi; Tao, Dacheng
  • 2012 IEEE International Symposium on Information Theory Proceedings
  • DOI: 10.1109/ISIT.2012.6283064

Random projection in dimensionality reduction: applications to image and text data
conference, January 2001

  • Bingham, Ella; Mannila, Heikki
  • Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '01
  • DOI: 10.1145/502512.502546

Random Projections of Smooth Manifolds
journal, December 2007

  • Baraniuk, Richard G.; Wakin, Michael B.
  • Foundations of Computational Mathematics, Vol. 9, Issue 1
  • DOI: 10.1007/s10208-007-9011-z
