DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: AI Meets Exascale Computing: Advancing Cancer Research With Large-Scale High Performance Computing

Abstract

The application of data science in cancer research has been boosted by major advances in three primary areas: (1) Data: diversity, amount, and availability of biomedical data; (2) Advances in Artificial Intelligence (AI) and Machine Learning (ML) algorithms that enable learning from complex, large-scale data; and (3) Advances in computer architectures allowing unprecedented acceleration of simulation and machine learning algorithms. These advances help build in silico ML models that can provide transformative insights from data including: molecular dynamics simulations, next-generation sequencing, omics, imaging, and unstructured clinical text documents. Unique challenges persist, however, in building ML models related to cancer, including: (1) access, sharing, labeling, and integration of multimodal and multi-institutional data across different cancer types; (2) developing AI models for cancer research capable of scaling on next generation high performance computers; and (3) assessing robustness and reliability in the AI models. In this paper, we review the National Cancer Institute (NCI)-Department of Energy (DOE) collaboration, Joint Design of Advanced Computing Solutions for Cancer (JDACS4C), a multi-institution collaborative effort focused on advancing computing and data technologies to accelerate cancer research on three levels: molecular, cellular, and population. This collaboration integrates various types of generated data, pre-exascale compute resources, and advances in ML models to increase understanding of basic cancer biology, identify promising new treatment options, predict outcomes, and eventually prescribe specialized treatments for patients with cancer.

Authors:
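Bhattacharya, Tanmoy [1]; Brettin, Thomas [2]; Doroshow, James H. [3]; Evrard, Yvonne A. [4]; Greenspan, Emily J. [3]; Gryshuk, Amy L. [5]; Hoang, Thuc T. [6]; Lauzon, Carolyn B. Vea [7]; Nissley, Dwight [4]; Penberthy, Lynne [3]; Stahlberg, Eric [4]; Stevens, Rick [8]; Streitz, Fred [5]; Tourassi, Georgia [9]; Xia, Fangfang [2]; Zaki, George [4]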
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  2. Argonne National Lab. (ANL), Lemont, IL (United States)
  3. National Cancer Inst., Bethesda, MD (United States)
  4. Frederick National Lab. of Cancer Research, Frederick, MD (United States)
  5. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
  6. National Nuclear Security Administration (NNSA), Washington, DC (United States)
  7. Dept. of Energy (DOE), Washington DC (United States)
  8. Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, IL (United States)
  9. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Publication Date:
2019-10-02
Research Org.:
Argonne National Laboratory (ANL), Argonne, IL (United States); Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA); National Institutes of Health (NIH) - National Cancer Institute; USDOE Exascale Computing Project; USDOE Office of Science (SC)
OSTI Identifier:
1637271
Alternate Identifier(s):
OSTI ID: 1657127; OSTI ID: 1821812
Report Number(s):
LA-UR-19-24131; LLNL-JRNL-773355
Journal ID: ISSN 2234-943X; 159254
Grant/Contract Number:  
AC02-06CH11357; AC52-07NA27344; 89233218CNA000001
Resource Type:
Accepted Manuscript
Journal Name:
Frontiers in Oncology
Additional Journal Information:
Journal Volume: 9; Journal ID: ISSN 2234-943X
Publisher:
Frontiers Research Foundation
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; 60 APPLIED LIFE SCIENCES; artificial intelligence; cancer research; deep learning; high performance computing; multi-scale modeling; natural language processing; precision medicine; uncertainty quantification

Citation Formats

Bhattacharya, Tanmoy, Brettin, Thomas, Doroshow, James H., Evrard, Yvonne A., Greenspan, Emily J., Gryshuk, Amy L., Hoang, Thuc T., Lauzon, Carolyn B. Vea, Nissley, Dwight, Penberthy, Lynne, Stahlberg, Eric, Stevens, Rick, Streitz, Fred, Tourassi, Georgia, Xia, Fangfang, and Zaki, George. AI Meets Exascale Computing: Advancing Cancer Research With Large-Scale High Performance Computing. United States: N. p., 2019. Web. doi:10.3389/fonc.2019.00984.
Bhattacharya, Tanmoy, Brettin, Thomas, Doroshow, James H., Evrard, Yvonne A., Greenspan, Emily J., Gryshuk, Amy L., Hoang, Thuc T., Lauzon, Carolyn B. Vea, Nissley, Dwight, Penberthy, Lynne, Stahlberg, Eric, Stevens, Rick, Streitz, Fred, Tourassi, Georgia, Xia, Fangfang, & Zaki, George. AI Meets Exascale Computing: Advancing Cancer Research With Large-Scale High Performance Computing. United States. https://doi.org/10.3389/fonc.2019.00984
Bhattacharya, Tanmoy, Brettin, Thomas, Doroshow, James H., Evrard, Yvonne A., Greenspan, Emily J., Gryshuk, Amy L., Hoang, Thuc T., Lauzon, Carolyn B. Vea, Nissley, Dwight, Penberthy, Lynne, Stahlberg, Eric, Stevens, Rick, Streitz, Fred, Tourassi, Georgia, Xia, Fangfang, and Zaki, George. 2019. "AI Meets Exascale Computing: Advancing Cancer Research With Large-Scale High Performance Computing". United States. https://doi.org/10.3389/fonc.2019.00984. https://www.osti.gov/servlets/purl/1637271.
@article{osti_1637271,
title = {AI Meets Exascale Computing: Advancing Cancer Research With Large-Scale High Performance Computing},
author = {Bhattacharya, Tanmoy and Brettin, Thomas and Doroshow, James H. and Evrard, Yvonne A. and Greenspan, Emily J. and Gryshuk, Amy L. and Hoang, Thuc T. and Lauzon, Carolyn B. Vea and Nissley, Dwight and Penberthy, Lynne and Stahlberg, Eric and Stevens, Rick and Streitz, Fred and Tourassi, Georgia and Xia, Fangfang and Zaki, George},
abstractNote = {The application of data science in cancer research has been boosted by major advances in three primary areas: (1) Data: diversity, amount, and availability of biomedical data; (2) Advances in Artificial Intelligence (AI) and Machine Learning (ML) algorithms that enable learning from complex, large-scale data; and (3) Advances in computer architectures allowing unprecedented acceleration of simulation and machine learning algorithms. These advances help build in silico ML models that can provide transformative insights from data including: molecular dynamics simulations, next-generation sequencing, omics, imaging, and unstructured clinical text documents. Unique challenges persist, however, in building ML models related to cancer, including: (1) access, sharing, labeling, and integration of multimodal and multi-institutional data across different cancer types; (2) developing AI models for cancer research capable of scaling on next generation high performance computers; and (3) assessing robustness and reliability in the AI models. In this paper, we review the National Cancer Institute (NCI) -Department of Energy (DOE) collaboration, Joint Design of Advanced Computing Solutions for Cancer (JDACS4C), a multi-institution collaborative effort focused on advancing computing and data technologies to accelerate cancer research on three levels: molecular, cellular, and population. This collaboration integrates various types of generated data, pre-exascale compute resources, and advances in ML models to increase understanding of basic cancer biology, identify promising new treatment options, predict outcomes, and eventually prescribe specialized treatments for patients with cancer.},
doi = {10.3389/fonc.2019.00984},
journal = {Frontiers in Oncology},
volume = 9,
place = {United States},
year = {2019},
month = {10}
}

Citation Metrics:
Cited by: 9 works
Citation information provided by Web of Science

Figures / Tables:

FIGURE 1: Pilot 1 research aims, general workflow, and supporting data.

Works referenced in this record:

Scalable deep text comprehension for Cancer surveillance on high-performance computing
journal, December 2018


The need for uncertainty quantification in machine-assisted medical decision making
journal, January 2019

  • Begoli, Edmon; Bhattacharya, Tanmoy; Kusnezov, Dimitri
  • Nature Machine Intelligence, Vol. 1, Issue 1
  • DOI: 10.1038/s42256-018-0004-1

A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles
journal, November 2017


A massively parallel infrastructure for adaptive multiscale simulations: modeling RAS initiation pathway for cancer
conference, November 2019

  • Di Natale, Francesco; Bhatia, Harsh; Carpenter, Timothy S.
  • SC '19: The International Conference for High Performance Computing, Networking, Storage, and Analysis, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1145/3295500.3356197

The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity
journal, March 2012

  • Barretina, Jordi; Caponigro, Giordano; Stransky, Nicolas
  • Nature, Vol. 483, Issue 7391
  • DOI: 10.1038/nature11003

An Interactive Resource to Identify Cancer Genetic and Lineage Dependencies Targeted by Small Molecules
journal, August 2013


Retrofitting Word Embeddings with the UMLS Metathesaurus for Clinical Information Extraction
conference, December 2018

  • Alawad, Mohammed; Hasan, S. M. Shamimul; Blair Christian, J.
  • 2018 IEEE International Conference on Big Data (Big Data)
  • DOI: 10.1109/BigData.2018.8621999

Methionine 170 is an Environmentally Sensitive Membrane Anchor in the Disordered HVR of K-Ras4B
journal, October 2018


Filter pruning of Convolutional Neural Networks for text classification: A case study of cancer pathology report comprehension
conference, March 2018

  • Yoon, Hong-Jun; Robinson, Sarah; Christian, J. Blair
  • 2018 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI)
  • DOI: 10.1109/BHI.2018.8333439

Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells
journal, November 2012

  • Yang, Wanjuan; Soares, Jorge; Greninger, Patricia
  • Nucleic Acids Research, Vol. 41, Issue D1
  • DOI: 10.1093/nar/gks1111

Computational Lipidomics of the Neuronal Plasma Membrane
journal, November 2017

  • Ingólfsson, Helgi I.; Carpenter, Timothy S.; Bhatia, Harsh
  • Biophysical Journal, Vol. 113, Issue 10
  • DOI: 10.1016/j.bpj.2017.10.017

Gene expression inference with deep learning
journal, February 2016


Capturing Phase Behavior of Ternary Lipid Mixtures with a Refined Martini Coarse-Grained Force Field
journal, September 2018

  • Carpenter, Timothy S.; López, Cesar A.; Neale, Chris
  • Journal of Chemical Theory and Computation, Vol. 14, Issue 11
  • DOI: 10.1021/acs.jctc.8b00496

A comprehensive transcriptional portrait of human cancer cell lines
journal, December 2014

  • Klijn, Christiaan; Durinck, Steffen; Stawiski, Eric W.
  • Nature Biotechnology, Vol. 33, Issue 3
  • DOI: 10.1038/nbt.3080

Predicting tumor cell line response to drug pairs with deep learning
journal, December 2018


CANDLE/Supervisor: a workflow framework for machine learning applied to cancer research
journal, December 2018

  • Wozniak, Justin M.; Jain, Rajeev; Balaprakash, Prasanna
  • BMC Bioinformatics, Vol. 19, Issue S18
  • DOI: 10.1186/s12859-018-2508-4

CAT: computer aided triage improving upon the Bayes risk through ε-refusal triage rules
journal, December 2018

  • Hengartner, Nicolas; Cuellar, Leticia; Wu, Xiao-Cheng
  • BMC Bioinformatics, Vol. 19, Issue S18
  • DOI: 10.1186/s12859-018-2503-9

Hierarchical attention networks for information extraction from cancer pathology reports
journal, November 2017

  • Gao, Shang; Young, Michael T.; Qiu, John X.
  • Journal of the American Medical Informatics Association, Vol. 25, Issue 3
  • DOI: 10.1093/jamia/ocx131

Introducing Heuristic Information into Ant Colony Optimization Algorithm for Identifying Epistasis
journal, January 2019

  • Sun, Yingxia; Wang, Xuan; Shang, Junliang
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • DOI: 10.1109/TCBB.2018.2879673

The COXEN Principle: Translating Signatures of In vitro Chemosensitivity into Tools for Clinical Outcome Prediction and Drug Discovery in Cancer
journal, February 2010


Deep learning-based transcriptome data classification for drug-target interaction prediction
journal, September 2018


RAS Proteins and Their Regulators in Human Disease
journal, June 2017


Molecular recognition of RAS/RAF complex at the membrane: Role of RAF cysteine-rich domain
journal, May 2018


Coarse-to-fine multi-task training of convolutional neural networks for automated information extraction from cancer pathology reports
conference, March 2018

  • Alawad, Mohammed; Yoon, Hong-Jun; Tourassi, Georgia D.
  • 2018 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI)
  • DOI: 10.1109/BHI.2018.8333408

Deep Learning for Automated Extraction of Primary Sites From Cancer Pathology Reports
journal, January 2018

  • Qiu, John X.; Yoon, Hong-Jun; Fearn, Paul A.
  • IEEE Journal of Biomedical and Health Informatics, Vol. 22, Issue 1
  • DOI: 10.1109/jbhi.2017.2700722

The National Cancer Institute ALMANAC: A Comprehensive Screening Resource for the Detection of Anticancer Drug Pairs with Enhanced Therapeutic Activity
journal, April 2017


Combating Label Noise in Deep Learning Using Abstention
preprint, January 2019

