CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research
Abstract
Background: Current multi-petaflop supercomputers are powerful systems, but present challenges when faced with problems requiring large machine learning workflows. Complex algorithms running at system scale, often with different patterns that require disparate software packages and complex data flows cause difficulties in assembling and managing large experiments on these machines.
Results: This paper presents a workflow system that makes progress on scaling machine learning ensembles, specifically in this first release, ensembles of deep neural networks that address problems in cancer research across the atomistic, molecular and population scales. The initial release of the application framework that we call CANDLE/Supervisor addresses the problem of hyper-parameter exploration of deep neural networks.
Conclusions: Initial results demonstrating CANDLE on DOE systems at ORNL, ANL and NERSC (Titan, Theta and Cori, respectively) demonstrate both scaling and multi-platform execution.
- Authors:
- Argonne National Lab. (ANL), Argonne, IL (United States)
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Minerva, San Francisco, CA (United States)
- Publication Date:
- Research Org.:
- Argonne National Lab. (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC); National Institutes of Health (NIH)
- OSTI Identifier:
- 1510031
- Grant/Contract Number:
- AC02-06CH11357
- Resource Type:
- Accepted Manuscript
- Journal Name:
- BMC Bioinformatics
- Additional Journal Information:
- Journal Volume: 19; Journal Issue: S18; Journal ID: ISSN 1471-2105
- Publisher:
- BioMed Central
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 60 APPLIED LIFE SCIENCES
Citation Formats
Wozniak, Justin M., Jain, Rajeev, Balaprakash, Prasanna, Ozik, Jonathan, Collier, Nicholson T., Bauer, John, Xia, Fangfang, Brettin, Thomas, Stevens, Rick, Mohd-Yusof, Jamaludin, Cardona, Cristina Garcia, Essen, Brian Van, and Baughman, Matthew. CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research. United States: N. p., 2018. Web. doi:10.1186/s12859-018-2508-4.
Wozniak, Justin M., Jain, Rajeev, Balaprakash, Prasanna, Ozik, Jonathan, Collier, Nicholson T., Bauer, John, Xia, Fangfang, Brettin, Thomas, Stevens, Rick, Mohd-Yusof, Jamaludin, Cardona, Cristina Garcia, Essen, Brian Van, & Baughman, Matthew. CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research. United States. doi:10.1186/s12859-018-2508-4.
Wozniak, Justin M., Jain, Rajeev, Balaprakash, Prasanna, Ozik, Jonathan, Collier, Nicholson T., Bauer, John, Xia, Fangfang, Brettin, Thomas, Stevens, Rick, Mohd-Yusof, Jamaludin, Cardona, Cristina Garcia, Essen, Brian Van, and Baughman, Matthew. Fri . "CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research". United States. doi:10.1186/s12859-018-2508-4. https://www.osti.gov/servlets/purl/1510031.
@article{osti_1510031,
title = {CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research},
author = {Wozniak, Justin M. and Jain, Rajeev and Balaprakash, Prasanna and Ozik, Jonathan and Collier, Nicholson T. and Bauer, John and Xia, Fangfang and Brettin, Thomas and Stevens, Rick and Mohd-Yusof, Jamaludin and Cardona, Cristina Garcia and Essen, Brian Van and Baughman, Matthew},
abstractNote = {Background: Current multi-petaflop supercomputers are powerful systems, but present challenges when faced with problems requiring large machine learning workflows. Complex algorithms running at system scale, often with different patterns that require disparate software packages and complex data flows cause difficulties in assembling and managing large experiments on these machines. Results: This paper presents a workflow system that makes progress on scaling machine learning ensembles, specifically in this first release, ensembles of deep neural networks that address problems in cancer research across the atomistic, molecular and population scales. The initial release of the application framework that we call CANDLE/Supervisor addresses the problem of hyper-parameter exploration of deep neural networks. Conclusions: Initial results demonstrating CANDLE on DOE systems at ORNL, ANL and NERSC (Titan, Theta and Cori, respectively) demonstrate both scaling and multi-platform execution.},
doi = {10.1186/s12859-018-2508-4},
journal = {BMC Bioinformatics},
number = {S18},
volume = 19,
place = {United States},
year = {2018},
month = {12}
}
Works referenced in this record:
A community effort to assess and improve drug sensitivity prediction algorithms
journal, June 2014
- Costello, James C.; Heiser, Laura M.; Georgii, Elisabeth
- Nature Biotechnology, Vol. 32, Issue 12
Hyperopt: a Python library for model selection and hyperparameter optimization
journal, January 2015
- Bergstra, James; Komer, Brent; Eliasmith, Chris
- Computational Science & Discovery, Vol. 8, Issue 1
Deep learning
journal, May 2015
- LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey
- Nature, Vol. 521, Issue 7553
Evolving Neural Networks through Augmenting Topologies
journal, June 2002
- Stanley, Kenneth O.; Miikkulainen, Risto
- Evolutionary Computation, Vol. 10, Issue 2