Disentangling multidimensional spatio-temporal data into their common and aberrant responses
Abstract
With the advent of high-throughput measurement techniques, scientists and engineers are starting to grapple with massive data sets and encountering challenges with how to organize, process and extract information into meaningful structures. Multidimensional spatio-temporal biological data sets such as time series gene expression with various perturbations over different cell lines, or neural spike trains across many experimental trials, have the potential to yield insight into the dynamic behavior of the system. For this potential to be realized, we need a suitable representation to understand the data. A general question is how to organize the observed data into meaningful structures and how to find an appropriate similarity measure. A natural way of viewing these complex high-dimensional data sets is to examine and analyze the large-scale features and then to focus on the interesting details. Since the wide range of experiments and unknown complexity of the underlying system contribute to the heterogeneity of biological data, we develop a new method by proposing an extension of Robust Principal Component Analysis (RPCA), which models common variations across multiple experiments as the low-rank component and anomalies across these experiments as the sparse component. We show that the proposed method is able to find distinct subtypes and classify data sets in a robust way without any prior knowledge by separating these common responses and abnormal responses. Thus, the proposed method provides a new representation of these data sets which has the potential to help users acquire new insight from data.
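The low-rank plus sparse split that the abstract describes can be illustrated with standard RPCA (principal component pursuit, as in the Candès, Li and Ma reference listed in this record). The sketch below is not the authors' extension: it is a minimal NumPy version of the baseline decomposition M = L + S, solved with an augmented Lagrangian iteration. The function names (`shrink`, `svd_shrink`, `rpca_pcp`), the default weights `lam` and `mu`, and the stopping rule are assumptions chosen for illustration.

```python
import numpy as np


def shrink(X, tau):
    """Elementwise soft-thresholding: pulls every entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_shrink(X, tau):
    """Singular value thresholding: soft-threshold the singular values of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def rpca_pcp(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Split M into a low-rank part L and a sparse part S with M ~ L + S.

    Minimizes ||L||_* + lam * ||S||_1 subject to L + S = M.
    The defaults for lam and mu are common heuristic choices (assumptions),
    not values taken from this record.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    abs_sum = np.abs(M).sum()
    if abs_sum == 0.0:
        # Nothing to decompose: both components are zero.
        return np.zeros_like(M), np.zeros_like(M)
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = m * n / (4.0 * abs_sum)

    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint L + S = M
    norm_M = np.linalg.norm(M)

    for _ in range(max_iter):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)  # update low-rank part
        S = shrink(M - L + Y / mu, lam / mu)      # update sparse part
        R = M - L - S                             # constraint residual
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S
```

In the setting of the abstract, one would stack the responses from multiple experiments as columns of `M`; the returned `L` would then play the role of the common response and `S` the aberrant, experiment-specific deviations. How the published method extends this baseline is not stated in this record.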
- Authors:
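- Chang, Young Hwan; Korkola, James; Amin, Dhara N.; Moasser, Mark M.; Carmena, Jose M.; Gray, Joe W.; Tomlin, Claire J.; Lisacek, Frederique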
- Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences.
- Oregon Health and Science Univ., Portland, OR (United States). Dept. of Biomedical Engineering and the Center for Spatial Systems Biomedicine.
- Univ. of California, San Francisco, CA (United States). Dept. of Medicine.
- Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Life Sciences Division.
- Swiss Institute of Bioinformatics (Switzerland)
- Publication Date:
- 2015-04-22
- Research Org.:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Org.:
- USDOE; National Cancer Institute (NCI)
- OSTI Identifier:
- 1212472
- Grant/Contract Number:
- AC02-05CH11231
- Resource Type:
- Accepted Manuscript
- Journal Name:
- PLoS ONE
- Additional Journal Information:
- Journal Volume: 10; Journal Issue: 4; Journal ID: ISSN 1932-6203
- Publisher:
- Public Library of Science
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; gene expression; gene regulatory networks; genetic networks; neurons; dynamic response; action potentials; signaling networks; breast cancer
Citation Formats
Chang, Young Hwan, Korkola, James, Amin, Dhara N., Moasser, Mark M., Carmena, Jose M., Gray, Joe W., Tomlin, Claire J., and Lisacek, Frederique. Disentangling multidimensional spatio-temporal data into their common and aberrant responses. United States: N. p., 2015. Web. doi:10.1371/journal.pone.0121607.
Chang, Young Hwan, Korkola, James, Amin, Dhara N., Moasser, Mark M., Carmena, Jose M., Gray, Joe W., Tomlin, Claire J., & Lisacek, Frederique (2015). Disentangling multidimensional spatio-temporal data into their common and aberrant responses. United States. https://doi.org/10.1371/journal.pone.0121607
Chang, Young Hwan, Korkola, James, Amin, Dhara N., Moasser, Mark M., Carmena, Jose M., Gray, Joe W., Tomlin, Claire J., and Lisacek, Frederique. 2015. "Disentangling multidimensional spatio-temporal data into their common and aberrant responses". United States. https://doi.org/10.1371/journal.pone.0121607. https://www.osti.gov/servlets/purl/1212472.
@article{osti_1212472,
title = {Disentangling multidimensional spatio-temporal data into their common and aberrant responses},
author = {Chang, Young Hwan and Korkola, James and Amin, Dhara N. and Moasser, Mark M. and Carmena, Jose M. and Gray, Joe W. and Tomlin, Claire J. and Lisacek, Frederique},
abstractNote = {With the advent of high-throughput measurement techniques, scientists and engineers are starting to grapple with massive data sets and encountering challenges with how to organize, process and extract information into meaningful structures. Multidimensional spatio-temporal biological data sets such as time series gene expression with various perturbations over different cell lines, or neural spike trains across many experimental trials, have the potential to yield insight into the dynamic behavior of the system. For this potential to be realized, we need a suitable representation to understand the data. A general question is how to organize the observed data into meaningful structures and how to find an appropriate similarity measure. A natural way of viewing these complex high-dimensional data sets is to examine and analyze the large-scale features and then to focus on the interesting details. Since the wide range of experiments and unknown complexity of the underlying system contribute to the heterogeneity of biological data, we develop a new method by proposing an extension of Robust Principal Component Analysis (RPCA), which models common variations across multiple experiments as the low-rank component and anomalies across these experiments as the sparse component. We show that the proposed method is able to find distinct subtypes and classify data sets in a robust way without any prior knowledge by separating these common responses and abnormal responses. Thus, the proposed method provides a new representation of these data sets which has the potential to help users acquire new insight from data.},
doi = {10.1371/journal.pone.0121607},
journal = {PLoS ONE},
number = 4,
volume = 10,
place = {United States},
year = {2015},
month = {4}
}
Works referenced in this record:
PHLPP: A Phosphatase that Directly Dephosphorylates Akt, Promotes Apoptosis, and Suppresses Tumor Growth
journal, April 2005
- Gao, Tianyan; Furnari, Frank; Newton, Alexandra C.
- Molecular Cell, Vol. 18, Issue 1
Inferring cluster-based networks from differently stimulated multiple time-course gene expression data
journal, March 2010
- Shiraishi, Yuichi; Kimura, Shuhei; Okada, Mariko
- Bioinformatics, Vol. 26, Issue 8
Support for a synaptic chain model of neuronal sequence generation
journal, October 2010
- Long, Michael A.; Jin, Dezhe Z.; Fee, Michale S.
- Nature, Vol. 468, Issue 7322
Resiliency and Vulnerability in the HER2-HER3 Tumorigenic Driver
journal, January 2010
- Amin, D. N.; Sergina, N.; Ahuja, D.
- Science Translational Medicine, Vol. 2, Issue 16
Integrated Module and Gene-Specific Regulatory Inference Implicates Upstream Signaling Networks
journal, October 2013
- Roy, Sushmita; Lagree, Stephen; Hou, Zhonggang
- PLoS Computational Biology, Vol. 9, Issue 10
PHLiPPing the switch on Akt and protein kinase C signaling
journal, August 2008
- Brognard, John; Newton, Alexandra C.
- Trends in Endocrinology & Metabolism, Vol. 19, Issue 6
Rare cancer-specific mutations in PIK3CA show gain of function
journal, March 2007
- Gymnopoulos, M.; Elsliger, M. -A.; Vogt, P. K.
- Proceedings of the National Academy of Sciences, Vol. 104, Issue 13
Reducing High-Dimensional Data by Principal Component Analysis vs. Random Projection for Nearest Neighbor Classification
conference, December 2006
- Deegalla, Sampath; Bostrom, Henrik
- 2006 5th International Conference on Machine Learning and Applications (ICMLA'06)
DEPTOR Is an mTOR Inhibitor Frequently Overexpressed in Multiple Myeloma Cells and Required for Their Survival
journal, May 2009
- Peterson, Timothy R.; Laplante, Mathieu; Thoreen, Carson C.
- Cell, Vol. 137, Issue 5
Subtype and pathway specific responses to anticancer compounds in breast cancer
journal, October 2011
- Heiser, L. M.; Sadanandam, A.; Kuo, W. -L.
- Proceedings of the National Academy of Sciences, Vol. 109, Issue 8
Temporal Complexity and Heterogeneity of Single-Neuron Activity in Premotor and Motor Cortex
journal, June 2007
- Churchland, Mark M.; Shenoy, Krishna V.
- Journal of Neurophysiology, Vol. 97, Issue 6
Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data
journal, May 2003
- Segal, Eran; Shapira, Michael; Regev, Aviv
- Nature Genetics, Vol. 34, Issue 2
Competitive Hebbian learning through spike-timing-dependent synaptic plasticity
journal, September 2000
- Song, Sen; Miller, Kenneth D.; Abbott, L. F.
- Nature Neuroscience, Vol. 3, Issue 9
Cluster analysis and display of genome-wide expression patterns
journal, December 1998
- Eisen, M. B.; Spellman, P. T.; Brown, P. O.
- Proceedings of the National Academy of Sciences, Vol. 95, Issue 25
Analysis of Time-Series Gene Expression Data: Methods, Challenges, and Opportunities
journal, August 2007
- Androulakis, I. P.; Yang, E.; Almon, R. R.
- Annual Review of Biomedical Engineering, Vol. 9, Issue 1
A Technical Assessment of the Utility of Reverse Phase Protein Arrays for the Study of the Functional Proteome in Non-microdissected Human Breast Cancers
journal, October 2010
- Hennessy, Bryan T.; Lu, Yiling; Gonzalez-Angulo, Ana Maria
- Clinical Proteomics, Vol. 6, Issue 4
A neuronal learning rule for sub-millisecond temporal coding
journal, September 1996
- Gerstner, Wulfram; Kempter, Richard; van Hemmen, J. Leo
- Nature, Vol. 383, Issue 6595
Neural population dynamics during reaching
journal, June 2012
- Churchland, Mark M.; Cunningham, John P.; Kaufman, Matthew T.
- Nature, Vol. 487, Issue 7405
Robust principal component analysis?
journal, May 2011
- Candès, Emmanuel J.; Li, Xiaodong; Ma, Yi
- Journal of the ACM, Vol. 58, Issue 3
Accelerated low-rank visual recovery by random projection
conference, June 2011
- Mu, Yadong; Dong, Jian; Yuan, Xiaotong
- CVPR 2011
Bilateral random projections
conference, July 2012
- Zhou, Tianyi; Tao, Dacheng
- 2012 IEEE International Symposium on Information Theory Proceedings
Random projection in dimensionality reduction: applications to image and text data
conference, January 2001
- Bingham, Ella; Mannila, Heikki
- Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '01
Random Projections of Smooth Manifolds
journal, December 2007
- Baraniuk, Richard G.; Wakin, Michael B.
- Foundations of Computational Mathematics, Vol. 9, Issue 1