Disentangling multidimensional spatio-temporal data into their common and aberrant responses
With the advent of high-throughput measurement techniques, scientists and engineers are starting to grapple with massive data sets and are encountering challenges in organizing them, processing them, and extracting meaningful structure from them. Multidimensional spatio-temporal biological data sets, such as time series gene expression with various perturbations over different cell lines or neural spike trains across many experimental trials, have the potential to provide insight into the dynamic behavior of the system. For this potential to be realized, we need a suitable representation of the data. A general question is how to organize the observed data into meaningful structures and how to find an appropriate similarity measure. A natural way of viewing these complex high-dimensional data sets is to examine and analyze the large-scale features first and then to focus on the interesting details. Since the wide range of experiments and the unknown complexity of the underlying system contribute to the heterogeneity of biological data, we develop a new method by proposing an extension of Robust Principal Component Analysis (RPCA), which models common variations across multiple experiments as the low-rank component and anomalies across these experiments as the sparse component. We show that the proposed method is able to find distinct …
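For readers unfamiliar with the low-rank plus sparse split mentioned in the abstract, the following is a minimal sketch of standard RPCA (principal component pursuit) solved with an inexact augmented Lagrangian scheme. It is not the extension proposed in the paper, whose details are not given in this record. The function names (`soft_threshold`, `svd_threshold`, `rpca`), the default weight `lam = 1/sqrt(max(n1, n2))`, and the step parameters `mu` and `rho` are illustrative assumptions drawn from common practice, and the input is assumed to be a nonzero 2-D matrix.

```python
import numpy as np


def soft_threshold(x, tau):
    """Shrink every entry of x toward zero by tau (proximal map of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Shrink the singular values of x by tau (proximal map of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca(m, lam=None, rho=1.5, tol=1e-7, max_iter=200):
    """Split m into low + sparse with standard RPCA (not the paper's extension).

    Approximately minimizes ||low||_* + lam * ||sparse||_1 subject to
    low + sparse == m. All defaults are assumptions, not values from the paper.
    """
    m = np.asarray(m, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(m.shape))   # common default weight (assumption)
    mu = 1.25 / np.linalg.norm(m, 2)        # initial penalty; assumes m is nonzero
    norm_m = np.linalg.norm(m)              # Frobenius norm for the stopping rule
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    for _ in range(max_iter):
        # Update the low-rank part, then the sparse part, then the multiplier.
        low = svd_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + dual / mu, lam / mu)
        residual = m - low - sparse
        dual = dual + mu * residual
        if np.linalg.norm(residual) <= tol * norm_m:
            break
        mu *= rho                           # tighten the constraint each pass
    return low, sparse
```

In a setting like the one the abstract describes, `m` could hold one experiment per column, so that `low` captures the response shared across experiments and `sparse` flags the entries that deviate from it. How the authors arrange multidimensional data for their extended method is not stated here.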
- Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences.
- Oregon Health and Science Univ., Portland, OR (United States). Dept. of Biomedical Engineering and the Center for Spatial Systems Biomedicine.
- Univ. of California, San Francisco, CA (United States). Dept. of Medicine.
- Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Life Sciences Division.
- Swiss Institute of Bioinformatics (Switzerland)
- Publication Date:
- OSTI Identifier:
- Grant/Contract Number:
- Accepted Manuscript
- Journal Name:
- PLoS ONE
- Additional Journal Information:
- Journal Volume: 10; Journal Issue: 4; Journal ID: ISSN 1932-6203
- Public Library of Science
- Research Org:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Org:
- USDOE; National Institutes of Health National Cancer Institute
- Country of Publication:
- United States
- 59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; gene expression; gene regulatory networks; genetic networks; neurons; dynamic response; action potentials; signaling networks; breast cancer