Choosing the best partition of the output from a large-scale simulation
Abstract
Data partitioning becomes necessary when a large-scale simulation produces more data than can feasibly be stored. The goal is to partition the data, typically so that every element belongs to exactly one partition, and to store summary information about each partition: either a representative value plus an estimate of the error, or a distribution. Once the partitions are determined and the summary information is stored, the raw data is discarded. This process can be performed in situ, meaning while the simulation is running. When creating the partitions, researchers must make many decisions: for instance, how to determine when an adequate number of partitions has been created, how to divide the data into partitions, and how many variables to consider simultaneously. In addition, decisions must be made about how to summarize the information within each partition. Because the number of possible ways to partition and summarize the data is combinatorially large, a method for comparing the different possibilities will help guide researchers in choosing a good partitioning and summarization scheme for their application.
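The idea in the abstract can be illustrated with a minimal sketch. It is not the method of the report: it assumes a one-dimensional output, contiguous blocks of near-equal size, and a summary made of the block mean (the representative value) plus the root-mean-square error about that mean (the error estimate). The function names `summarize_partitions` and `total_squared_error` are hypothetical, as is the use of total squared error to compare candidate partition counts.

```python
import math
from statistics import fmean


def summarize_partitions(values, num_partitions):
    """Split `values` into contiguous, non-overlapping blocks and summarize each.

    Assumption for illustration: blocks are of near-equal size and the summary
    is (mean, RMSE about the mean). Every element falls in exactly one block.
    Returns a list of (start, stop, mean, rmse) tuples.
    """
    n = len(values)
    if not 1 <= num_partitions <= n:
        raise ValueError("num_partitions must be between 1 and len(values)")
    summaries = []
    for k in range(num_partitions):
        start = k * n // num_partitions
        stop = (k + 1) * n // num_partitions
        block = values[start:stop]
        mean = fmean(block)
        rmse = math.sqrt(fmean((v - mean) ** 2 for v in block))
        summaries.append((start, stop, mean, rmse))
    return summaries


def total_squared_error(summaries):
    """Squared error incurred if each block is replaced by its stored mean."""
    return sum((stop - start) * rmse ** 2 for start, stop, _, rmse in summaries)


if __name__ == "__main__":
    # Hypothetical stand-in for simulation output.
    signal = [math.sin(0.1 * i) for i in range(200)]
    for count in (2, 8, 32):
        summaries = summarize_partitions(signal, count)
        # Storage cost grows with `count`; the error is what the summaries lose.
        print(count, total_squared_error(summaries))
```

Running the loop for several partition counts gives one simple way to compare schemes: each candidate trades stored values against the error left after the raw data is discarded. Other splitting rules or distribution-based summaries would need a different comparison measure.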
 Authors:
 Challacombe, Chelsea Jordan; Casleton, Emily Michele
 Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
 Publication Date:
 Research Org.:
 Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
 Sponsoring Org.:
 USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
 OSTI Identifier:
 1396090
 Report Number(s):
 LA-UR-17-28730
 DOE Contract Number:
 AC52-06NA25396
 Resource Type:
 Technical Report
 Country of Publication:
 United States
 Language:
 English
 Subject:
 97 MATHEMATICS AND COMPUTING
Citation Formats
Challacombe, Chelsea Jordan, and Casleton, Emily Michele. Choosing the best partition of the output from a large-scale simulation. United States: N. p., 2017.
