Choosing the best partition of the output from a large-scale simulation
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Data partitioning becomes necessary when a large-scale simulation produces more data than can feasibly be stored. The goal is to partition the data, typically so that every element belongs to exactly one partition, and to store summary information about each partition: either a representative value plus an estimate of the error, or a distribution. Once the partitions are determined and the summary information is stored, the raw data is discarded. This process can be performed in situ, that is, while the simulation is running. When creating the partitions, researchers must make many decisions: for instance, how to determine when an adequate number of partitions has been created, how to divide the data into partitions, and how many variables to consider simultaneously. Decisions must also be made about how to summarize the information within each partition. Because the number of possible ways to partition and summarize the data is combinatorial, a method for comparing the different possibilities will help guide researchers in choosing a good partitioning and summarization scheme for their application.
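The idea of partitioning, summarizing, and then comparing candidate schemes can be illustrated with a minimal Python sketch. It is not the report's method: the contiguous equal-size blocks, the mean as representative value, the maximum absolute deviation as error estimate, and the names `summarize_partitions` and `worst_error` are all assumptions chosen for illustration, as is the sample data.

```python
from statistics import fmean


def summarize_partitions(data, num_partitions):
    """Split `data` into contiguous, non-overlapping blocks and keep only a
    summary per block: (start, stop, representative value, error estimate).

    Assumption: the representative is the block mean and the error estimate
    is the largest absolute deviation from that mean.
    """
    n = len(data)
    summaries = []
    for k in range(num_partitions):
        # Integer arithmetic guarantees every index lands in exactly one block.
        start = k * n // num_partitions
        stop = (k + 1) * n // num_partitions
        block = data[start:stop]
        if not block:
            continue  # more partitions requested than elements available
        rep = fmean(block)
        err = max(abs(x - rep) for x in block)
        summaries.append((start, stop, rep, err))
    return summaries


def worst_error(summaries):
    """One possible comparison metric: the largest per-partition error."""
    return max(err for _, _, _, err in summaries)


# Hypothetical one-variable output; the raw values would be discarded
# once the summaries are stored.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 9.0, 9.1]

# Compare candidate schemes by stored summaries versus worst-case error.
for num_partitions in (1, 2, 4, 8):
    summaries = summarize_partitions(data, num_partitions)
    print(num_partitions, len(summaries), worst_error(summaries))
```

More partitions generally reduce the error but increase the amount of summary information stored, which is the trade-off a comparison method has to weigh. A real application would likely replace the fixed block boundaries and the single error metric with choices suited to its data.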
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC). Advanced Scientific Computing Research (ASCR) (SC-21)
- DOE Contract Number:
- AC52-06NA25396
- OSTI ID:
- 1396090
- Report Number(s):
- LA-UR-17-28730
- Country of Publication:
- United States
- Language:
- English