OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Machine learning in a data-limited regime: Augmenting experiments with synthetic data uncovers order in crumpled sheets

Abstract

Machine learning has gained widespread attention as a powerful tool to identify structure in complex, high-dimensional data. However, these techniques are ostensibly inapplicable for experimental systems where data are scarce or expensive to obtain. Here, we introduce a strategy to resolve this impasse by augmenting the experimental dataset with synthetically generated data of a much simpler sister system. Specifically, we study spontaneously emerging local order in crease networks of crumpled thin sheets, a paradigmatic example of spatial complexity, and show that machine learning techniques can be effective even in a data-limited regime. This is achieved by augmenting the scarce experimental dataset with inexhaustible amounts of simulated data of rigid flat-folded sheets, which are simple to simulate and share common statistical properties. This considerably improves the predictive power in a test problem of pattern completion and demonstrates the usefulness of machine learning in bench-top experiments where data are good but scarce.

Authors:
Hoffmann, Jordan [1]; Bar-Sinai, Yohai [1]; Lee, Lisa M. [1]; Andrejevic, Jovana [1]; Mishra, Shruti [1]; Rubinstein, Shmuel M. [1]; Rycroft, Chris H. [2]
  1. Harvard Univ., Cambridge, MA (United States). John A. Paulson School of Engineering
  2. Harvard Univ., Cambridge, MA (United States). John A. Paulson School of Engineering; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Div.
Publication Date:
April 2019
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21); National Science Foundation (NSF)
OSTI Identifier:
1526588
Grant/Contract Number:  
AC02-05CH11231; DMR-1420570
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Science Advances
Additional Journal Information:
Journal Volume: 5; Journal Issue: 4; Journal ID: ISSN 2375-2548
Publisher:
AAAS
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Hoffmann, Jordan, Bar-Sinai, Yohai, Lee, Lisa M., Andrejevic, Jovana, Mishra, Shruti, Rubinstein, Shmuel M., and Rycroft, Chris H. Machine learning in a data-limited regime: Augmenting experiments with synthetic data uncovers order in crumpled sheets. United States: N. p., 2019. Web. doi:10.1126/sciadv.aau6792.
Hoffmann, Jordan, Bar-Sinai, Yohai, Lee, Lisa M., Andrejevic, Jovana, Mishra, Shruti, Rubinstein, Shmuel M., & Rycroft, Chris H. (2019). Machine learning in a data-limited regime: Augmenting experiments with synthetic data uncovers order in crumpled sheets. United States. doi:10.1126/sciadv.aau6792.
Hoffmann, Jordan, Bar-Sinai, Yohai, Lee, Lisa M., Andrejevic, Jovana, Mishra, Shruti, Rubinstein, Shmuel M., and Rycroft, Chris H. 2019. "Machine learning in a data-limited regime: Augmenting experiments with synthetic data uncovers order in crumpled sheets". United States. doi:10.1126/sciadv.aau6792. https://www.osti.gov/servlets/purl/1526588.
@article{osti_1526588,
title = {Machine learning in a data-limited regime: Augmenting experiments with synthetic data uncovers order in crumpled sheets},
author = {Hoffmann, Jordan and Bar-Sinai, Yohai and Lee, Lisa M. and Andrejevic, Jovana and Mishra, Shruti and Rubinstein, Shmuel M. and Rycroft, Chris H.},
doi = {10.1126/sciadv.aau6792},
journal = {Science Advances},
issn = {2375-2548},
number = 4,
volume = 5,
place = {United States},
year = {2019},
month = {4}
}

