OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research

Abstract

Background: Current multi-petaflop supercomputers are powerful systems, but present challenges when faced with problems requiring large machine learning workflows. Complex algorithms running at system scale, often with different patterns that require disparate software packages and complex data flows cause difficulties in assembling and managing large experiments on these machines.
Results: This paper presents a workflow system that makes progress on scaling machine learning ensembles, specifically in this first release, ensembles of deep neural networks that address problems in cancer research across the atomistic, molecular and population scales. The initial release of the application framework that we call CANDLE/Supervisor addresses the problem of hyper-parameter exploration of deep neural networks.
Conclusions: Initial results demonstrating CANDLE on DOE systems at ORNL, ANL and NERSC (Titan, Theta and Cori, respectively) demonstrate both scaling and multi-platform execution.

Authors:
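Wozniak, Justin M. [1]; Jain, Rajeev [1]; Balaprakash, Prasanna [1]; Ozik, Jonathan [1]; Collier, Nicholson T. [1]; Bauer, John [1]; Xia, Fangfang [1]; Brettin, Thomas [1]; Stevens, Rick [1]; Mohd-Yusof, Jamaludin [2]; Cardona, Cristina Garcia [2]; Essen, Brian Van [3]; Baughman, Matthew [4]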
  1. Argonne National Lab. (ANL), Argonne, IL (United States)
  2. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  3. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
  4. Minerva, San Francisco, CA (United States)
Publication Date:
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC); National Institutes of Health (NIH)
OSTI Identifier:
1510031
Grant/Contract Number:  
AC02-06CH11357
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Volume: 19; Journal Issue: S18; Journal ID: ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES

Citation Formats

Wozniak, Justin M., Jain, Rajeev, Balaprakash, Prasanna, Ozik, Jonathan, Collier, Nicholson T., Bauer, John, Xia, Fangfang, Brettin, Thomas, Stevens, Rick, Mohd-Yusof, Jamaludin, Cardona, Cristina Garcia, Essen, Brian Van, and Baughman, Matthew. CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research. United States: N. p., 2018. Web. doi:10.1186/s12859-018-2508-4.
Wozniak, Justin M., Jain, Rajeev, Balaprakash, Prasanna, Ozik, Jonathan, Collier, Nicholson T., Bauer, John, Xia, Fangfang, Brettin, Thomas, Stevens, Rick, Mohd-Yusof, Jamaludin, Cardona, Cristina Garcia, Essen, Brian Van, & Baughman, Matthew. CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research. United States. doi:10.1186/s12859-018-2508-4.
Wozniak, Justin M., Jain, Rajeev, Balaprakash, Prasanna, Ozik, Jonathan, Collier, Nicholson T., Bauer, John, Xia, Fangfang, Brettin, Thomas, Stevens, Rick, Mohd-Yusof, Jamaludin, Cardona, Cristina Garcia, Essen, Brian Van, and Baughman, Matthew. Fri . "CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research". United States. doi:10.1186/s12859-018-2508-4. https://www.osti.gov/servlets/purl/1510031.
@article{osti_1510031,
title = {CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research},
author = {Wozniak, Justin M. and Jain, Rajeev and Balaprakash, Prasanna and Ozik, Jonathan and Collier, Nicholson T. and Bauer, John and Xia, Fangfang and Brettin, Thomas and Stevens, Rick and Mohd-Yusof, Jamaludin and Cardona, Cristina Garcia and Essen, Brian Van and Baughman, Matthew},
abstractNote = {Background: Current multi-petaflop supercomputers are powerful systems, but present challenges when faced with problems requiring large machine learning workflows. Complex algorithms running at system scale, often with different patterns that require disparate software packages and complex data flows cause difficulties in assembling and managing large experiments on these machines. Results: This paper presents a workflow system that makes progress on scaling machine learning ensembles, specifically in this first release, ensembles of deep neural networks that address problems in cancer research across the atomistic, molecular and population scales. The initial release of the application framework that we call CANDLE/Supervisor addresses the problem of hyper-parameter exploration of deep neural networks. Conclusions: Initial results demonstrating CANDLE on DOE systems at ORNL, ANL and NERSC (Titan, Theta and Cori, respectively) demonstrate both scaling and multi-platform execution.},
doi = {10.1186/s12859-018-2508-4},
journal = {BMC Bioinformatics},
issn = {1471-2105},
number = {S18},
volume = {19},
place = {United States},
year = {2018},
month = {12}
}


Figures / Tables:

Figure 1: CANDLE/Supervisor overall architecture.



    Figures/Tables have been extracted from DOE-funded journal article accepted manuscripts.