
Title: Sparse multitask regression for identifying common mechanism of response to therapeutic targets

Abstract

Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is examined independently with a phenotypic index and (ii) tested with one stress condition at a time, i.e. different perturbations are analyzed separately. As a result, the complex coordination among the genes responsible for a phenotypic profile is potentially lost. More importantly, univariate analysis can potentially hide new insights into common mechanism of response.

Results: In this article, we propose a sparse, multitask regression model together with co-clustering analysis to explore the intrinsic grouping in associating the gene expression with phenotypic signatures. The global structure of association is captured by learning an intrinsic template that is shared among experimental conditions, with local perturbations introduced to integrate effects of therapeutic agents. We demonstrate the performance of our approach on both synthetic and experimental data. Synthetic data reveal that the multitask regression has a superior reduction in the regression error when compared with traditional L1- and L2-regularized regression. On the other hand, experiments with cell cycle inhibitors over a panel of 14 breast cancer cell lines demonstrate the relevance of the computed molecular predictors with the cell cycle machinery, as well as the identification of hidden variables that are not captured by the baseline regression analysis. Accordingly, the system has identified CLCA2 as a hidden transcript and as a common mechanism of response for two therapeutic agents, CI-1040 and Iressa, which are currently in clinical use.
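The idea of a template shared across conditions plus sparse condition-specific perturbations can be illustrated with a minimal sketch. This is not the authors' formulation or code. It assumes a simple decomposition w_t = w0 + v_t per condition t, L1 penalties on both parts, and a plain proximal-gradient (ISTA) solver. The function names, penalty weights, and toy data below are all hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def objective(Xs, ys, w0, V, lam_shared, lam_local):
    """Squared-error loss plus L1 penalties on the template and the deviations."""
    loss = sum(np.sum((y - X @ (w0 + v)) ** 2) / (2 * len(y))
               for X, y, v in zip(Xs, ys, V))
    return loss + lam_shared * np.abs(w0).sum() + lam_local * np.abs(V).sum()

def fit_shared_plus_local(Xs, ys, lam_shared=0.01, lam_local=0.05, n_iter=3000):
    """Fit w_t = w0 + v_t for every task t by joint proximal-gradient steps."""
    T, d = len(Xs), Xs[0].shape[1]
    w0, V = np.zeros(d), np.zeros((T, d))
    # Conservative step size: 1 / (upper bound on the Lipschitz constant
    # of the gradient of the smooth loss in the joint variables (w0, V)).
    L = (T + 1) * max(np.linalg.norm(X, 2) ** 2 / len(y) for X, y in zip(Xs, ys))
    step = 1.0 / L
    for _ in range(n_iter):
        # Per-task gradient of the loss with respect to w_t = w0 + v_t.
        G = np.stack([X.T @ (X @ (w0 + v) - y) / len(y)
                      for X, y, v in zip(Xs, ys, V)])
        # The shared template sees the summed gradient; each deviation sees its own.
        w0 = soft_threshold(w0 - step * G.sum(axis=0), step * lam_shared)
        V = soft_threshold(V - step * G, step * lam_local)
    return w0, V

# Toy data (hypothetical): 3 conditions share a sparse template;
# condition 1 has one extra active feature.
rng = np.random.default_rng(0)
d, n, T = 20, 60, 3
template = np.zeros(d)
template[:3] = [2.0, -1.5, 1.0]
true_W = np.tile(template, (T, 1))
true_W[1, 5] = 1.0
Xs = [rng.standard_normal((n, d)) for _ in range(T)]
ys = [X @ w + 0.05 * rng.standard_normal(n) for X, w in zip(Xs, true_W)]

w0, V = fit_shared_plus_local(Xs, ys)
W_hat = w0 + V  # one coefficient vector per condition
print("max abs coefficient error:", np.max(np.abs(W_hat - true_W)))
```

Setting `lam_local` very large in this sketch forces all deviations to zero, so every condition is fit with the same coefficient vector. A small `lam_local` lets each condition drift from the template, which gives less sharing between conditions.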

Authors:
 Zhang, K. [1]; Gray, J. W. [1]; Parvin, B. [1]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Life Sciences Division
Publication Date:
2010-06-06
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
OSTI Identifier:
1625267
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
Bioinformatics
Additional Journal Information:
Journal Volume: 26; Journal Issue: 12; Journal ID: ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY; 97 MATHEMATICS AND COMPUTING; Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Computer Science; Mathematical & Computational Biology; Mathematics

Citation Formats

Zhang, K., Gray, J. W., and Parvin, B. Sparse multitask regression for identifying common mechanism of response to therapeutic targets. United States: N. p., 2010. Web. doi:10.1093/bioinformatics/btq181.
Zhang, K., Gray, J. W., & Parvin, B. Sparse multitask regression for identifying common mechanism of response to therapeutic targets. United States. https://doi.org/10.1093/bioinformatics/btq181
Zhang, K., Gray, J. W., and Parvin, B. 2010. "Sparse multitask regression for identifying common mechanism of response to therapeutic targets". United States. https://doi.org/10.1093/bioinformatics/btq181. https://www.osti.gov/servlets/purl/1625267.
@article{osti_1625267,
title = {Sparse multitask regression for identifying common mechanism of response to therapeutic targets},
author = {Zhang, K. and Gray, J. W. and Parvin, B.},
abstractNote = {Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is examined independently with a phenotypic index and (ii) tested with one stress condition at a time, i.e. different perturbations are analyzed separately. As a result, the complex coordination among the genes responsible for a phenotypic profile is potentially lost. More importantly, univariate analysis can potentially hide new insights into common mechanism of response. Results: In this article, we propose a sparse, multitask regression model together with co-clustering analysis to explore the intrinsic grouping in associating the gene expression with phenotypic signatures. The global structure of association is captured by learning an intrinsic template that is shared among experimental conditions, with local perturbations introduced to integrate effects of therapeutic agents. We demonstrate the performance of our approach on both synthetic and experimental data. Synthetic data reveal that the multitask regression has a superior reduction in the regression error when compared with traditional L1- and L2-regularized regression. On the other hand, experiments with cell cycle inhibitors over a panel of 14 breast cancer cell lines demonstrate the relevance of the computed molecular predictors with the cell cycle machinery, as well as the identification of hidden variables that are not captured by the baseline regression analysis. Accordingly, the system has identified CLCA2 as a hidden transcript and as a common mechanism of response for two therapeutic agents, CI-1040 and Iressa, which are currently in clinical use.},
doi = {10.1093/bioinformatics/btq181},
journal = {Bioinformatics},
number = 12,
volume = 26,
place = {United States},
year = {2010},
month = {jun}
}

Works referenced in this record:

Unsupervised feature selection via two-way ordering in gene expression analysis
journal, July 2003


An Interior-Point Method for Large-Scale ℓ1-Regularized Least Squares
journal, December 2007

  • Kim, Seung-Jean; Koh, K.; Lustig, M.
  • IEEE Journal of Selected Topics in Signal Processing, Vol. 1, Issue 4
  • DOI: 10.1109/JSTSP.2007.910971

Discovering statistically significant biclusters in gene expression data
journal, July 2002


Geometric approach to segmentation and protein localization in cell culture assays
journal, January 2007


CLCA2 tumour suppressor gene in 1p31 is epigenetically regulated in breast cancer
journal, February 2004


Abrogated Response to Cellular Stress Identifies DCIS Associated with Subsequent Tumor Events and Defines Basal-like Breast Tumors
journal, November 2007


Direct Clustering of a Data Matrix
journal, March 1972

  • Hartigan, J. A.
  • Journal of the American Statistical Association, Vol. 67, Issue 337
  • DOI: 10.2307/2284710

Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions
journal, April 2003


Compressed sensing
journal, April 2006


Causal Protein-Signaling Networks Derived from Multiparameter Single-Cell Data
journal, April 2005


A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes
journal, December 2006


Structure and Transcriptional Regulation of the Human Cystatin A Gene
journal, July 1998

  • Takahashi, Hidetoshi; Asano, Kazuhiro; Kinouchi, Motoshi
  • Journal of Biological Chemistry, Vol. 273, Issue 28
  • DOI: 10.1074/jbc.273.28.17375

PAN1/NALP2/PYPAF2, an Inducible Inflammatory Mediator That Regulates NF-κB and Caspase-1 Activation in Macrophages
journal, December 2004

  • Bruey, Jean Marie; Bruey-Sedano, Nathalie; Newman, Ruchi
  • Journal of Biological Chemistry, Vol. 279, Issue 50
  • DOI: 10.1074/jbc.m406741200

Atomic Decomposition by Basis Pursuit
journal, January 2001


Molecular Predictors of 3D Morphogenesis by Breast Cancer Cell Lines in 3D Culture
journal, February 2010


Reverse engineering gene networks: Integrating genetic perturbations with dynamical modeling
journal, May 2003

  • Tegner, J.; Yeung, M. K. S.; Hasty, J.
  • Proceedings of the National Academy of Sciences, Vol. 100, Issue 10
  • DOI: 10.1073/pnas.0933416100

Response projected clustering for direct association with physiological and clinical response data
journal, January 2008


hCLCA2 Is a p53-Inducible Inhibitor of Breast Cancer Cell Proliferation
journal, August 2009


Gene-based approach to human gene-phenotype correlations
journal, October 1997


Multidimensional Profiling of Cell Surface Proteins and Nuclear Markers
journal, January 2010

  • Han, Ju; Chang, Hang; Andarawewa, K.
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 7, Issue 1, p. 80-90
  • DOI: 10.1109/TCBB.2008.134

Multitask Learning
journal, January 1997


Co-clustering documents and words using bipartite spectral graph partitioning
conference, January 2001

  • Dhillon, Inderjit S.
  • Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '01
  • DOI: 10.1145/502512.502550

Learning a meta-level prior for feature relevance from multiple related tasks
conference, June 2007

  • Lee, Su-In; Chatalbashev, Vassil; Vickrey, David
  • Proceedings of the 24th international conference on Machine learning
  • DOI: 10.1145/1273496.1273558

Probabilistic Joint Feature Selection for Multi-task Learning
conference, April 2007

  • Xiong, Tao; Bi, Jinbo; Rao, Bharat
  • Proceedings of the 2007 SIAM International Conference on Data Mining
  • DOI: 10.1137/1.9781611972771.30

Works referencing / citing this record:

Using multitask classification methods to investigate the kinase-specific phosphorylation sites
journal, June 2012


A Survey on Multi-Task Learning
journal, January 2021


Deep multi-task learning for individuals origin–destination matrices estimation from census data
journal, November 2019

  • Katranji, Mehdi; Kraiem, Sami; Moalic, Laurent
  • Data Mining and Knowledge Discovery, Vol. 34, Issue 1
  • DOI: 10.1007/s10618-019-00662-y

Integrative analysis of multiple diverse omics datasets by sparse group multitask regression
journal, October 2014

  • Lin, Dongdong; Zhang, Jigang; Li, Jingyao
  • Frontiers in Cell and Developmental Biology, Vol. 2
  • DOI: 10.3389/fcell.2014.00062

An ensemble method approach to investigate kinase-specific phosphorylation sites
journal, May 2014

  • Datta, Sutapa; Mukhopadhyay, Subhasis
  • International Journal of Nanomedicine
  • DOI: 10.2147/ijn.s57526

An overview of multi-task learning
journal, September 2017