U.S. Department of Energy
Office of Scientific and Technical Information

DLHub: Simplifying publication, discovery, and use of machine learning models in science

Journal Article · Journal of Parallel and Distributed Computing
  1. Univ. of Chicago, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States)
  2. Argonne National Lab. (ANL), Argonne, IL (United States)
  3. Univ. of Chicago, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, IL (United States). Globus
  4. Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, IL (United States). Globus
Machine Learning (ML) has become a critical tool enabling new methods of analysis and driving deeper understanding of phenomena across scientific disciplines. There is a growing need for "learning systems" to support various phases in the ML lifecycle. While other systems have focused on supporting model development, training, and inference, few address the challenges particular to science, such as the need to publish and share models and to serve them on a range of available computing resources. In this paper, we present the Data and Learning Hub for science (DLHub), a learning system designed to support these use cases. Specifically, DLHub enables publication of models with descriptive metadata, persistent identifiers, and flexible access control. It packages arbitrary models into portable servable containers and enables low-latency, distributed serving of these models on heterogeneous compute resources. We show that DLHub supports low-latency model inference comparable to that of other model serving systems, including TensorFlow Serving, SageMaker, and Clipper, and that enabling batching and memoization improves performance by up to 95%. We also show that DLHub can scale to concurrently serve models on 500 containers. Finally, we describe five case studies that highlight the use of DLHub for scientific applications.
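The abstract credits batching and memoization for much of the reported speedup. As a rough illustration of those two general techniques, and not of DLHub's actual implementation or API, the Python sketch below wraps a model in a cache and sends only uncached inputs to the model in a single batched call. The names `MemoizedBatchServer`, `predict_batch`, and `square_model` are hypothetical and chosen for this example only.

```python
from typing import Callable, Dict, Hashable, List, Sequence


class MemoizedBatchServer:
    """Hypothetical wrapper: caches results and batches uncached inputs."""

    def __init__(self, predict_batch: Callable[[List[Hashable]], List[object]]):
        self._predict_batch = predict_batch
        self._cache: Dict[Hashable, object] = {}
        self.model_calls = 0  # how many times the underlying model was invoked

    def predict(self, inputs: Sequence[Hashable]) -> List[object]:
        # Memoization: keep only inputs with no cached result,
        # de-duplicated while preserving first-seen order.
        missing = list(dict.fromkeys(x for x in inputs if x not in self._cache))
        if missing:
            # Batching: one model call covers every uncached input.
            results = self._predict_batch(missing)
            self.model_calls += 1
            self._cache.update(zip(missing, results))
        return [self._cache[x] for x in inputs]


def square_model(batch: List[int]) -> List[int]:
    """Stand-in for an expensive model; assumed for illustration only."""
    return [x * x for x in batch]


server = MemoizedBatchServer(square_model)
first = server.predict([1, 2, 2, 3])   # one batched model call
second = server.predict([3, 1])        # answered entirely from the cache
```

In this sketch the second request never reaches the model, which is the kind of saving memoization provides when scientific workloads repeat inputs. It assumes inputs are hashable and that the model is deterministic.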
Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
Defense Advanced Research Projects Agency (DARPA); National Science Foundation (NSF); USDOE; USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE Office of Fossil Energy (FE)
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1837199
Alternate ID(s):
OSTI ID: 1811041
Journal Information:
Journal Name: Journal of Parallel and Distributed Computing; Vol. 147; ISSN 0743-7315
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
